Categories
Course System Safety

The Safety Artisan is on Thinkific

I’m pleased to tell you that The Safety Artisan is on Thinkific!

Thinkific is a powerful and beautifully-presented online Learning Management System.  This will complement the existing Safety Artisan website.  

My first course will be ‘System Safety Assessment’ with ten hours of instructional videos. The new course is here.

(Please note that this is the same course as my ‘Complete System Safety Analysis Bundle’ of 12 videos available here.  So, if you’ve already bought that – thanks very much – please don’t buy it again, as you already have all the material.)

What will the System Safety Assessment Course do for you?

Transcript of the Video

Welcome to the System Safety Assessment course

In this course, you will gain knowledge, skills, and confidence. You will gain knowledge of what is involved in system safety assessment: the individual tasks and techniques you need to carry out.

But more importantly, you will learn how to put them together into a successful program, and how to tailor all these different tasks, keeping some but leaving out others, so that you get an efficient and effective safety program, no matter what application or what system you are working with.

So that’s the knowledge and the skills

You’ll also get the confidence to be able to get you started.  Now, there is no substitute for live face-to-face training and coaching.  But this format is much more accessible to you and much more reasonably priced.  So wherever you are in the world, whatever time and day you want to do your learning, you can access this course and you can gain confidence to get you started.

So if you’re worried about a job interview and what you’re going to say, or you’re worried about how to do a job and there’s nobody around to help you, then this course will give you the confidence to get started and to be aware of the pitfalls before you begin.

So what makes me confident that I can help you?

Well, first of all, I’ve got 25 years of experience applying system safety.

And I’ve done that in the UK, in the United States, in Australia, and in the European Union. I’ve worked in a wide variety of legal jurisdictions. Also, I’ve worked on a wide variety of systems. I’ve worked on planes, trains, ships and submarines, software, and I.T. systems: all kinds of stuff.

I’ve worked on some gigantic multibillion-dollar projects and some much smaller ones.  So I know how to pragmatically apply this stuff, at a reasonable scale without spending stupid amounts of money.

And in fact, as part of my job as a consultant, I spent half the time telling clients to do less and spend less and still get an effective result.  So that’s where I’m coming from.

I’ve also got experience teaching system safety in the classroom.  I’ve taught hundreds of students, from various different projects.  And now I have hundreds of online students, and I’m very pleased to be able to help all of those as well.

So that’s why I think that I can help you

And I hope that you will enjoy this course and get a lot out of it.  Thanks very much for considering The Safety Artisan.

What do you think of the new page?

Categories
Blog Safety Management

Risk Management 101

Welcome to Risk Management 101, where we’re going to go through these basic concepts of risk management. We’re going to break it down into the constituent parts and then we’re going to build it up again and show you how it’s done. I’ve been involved in risk management, in project risk management, safety risk management, etc., for a long, long time.  I hope that I can put my experience to good use, helping you in whatever you want to do with this information.

Maybe you’re getting an interview. Maybe you want to learn some basics and decide whether you want to know more about risk management or not.  Whatever it might be, I think you’ll find this short session really useful. I hope you enjoy it and thanks for watching.

Risk Management 101, Topics

  • Hazard Identification;
  • Hazard Analysis;
  • Risk Estimation;
  • Risk [and ALARP] Evaluation;
  • Risk Reduction; and
  • Risk Acceptance.

Risk Management 101, Transcript

Introduction

Hi everyone and welcome to Risk Management 101. We’re going to go through these basic concepts of risk management. We’re going to break it down into the constituent parts. Then we’re going to build it up again and show you how it’s done.

My name is Simon Di Nucci and I have a lot of experience working in risk management, project risk management, safety risk management, etc.  I’m hoping that I can put my experience to good use, helping you in whatever you want to do with this information. Whether you’re going for an interview or you want to learn some basics. You can watch this video and decide if you want to know more about risk management or you don’t need to.  Whatever it might be, you’ll find this short session useful. I hope you enjoy it and thanks for watching.

Topics For This Session

Risk Management 101. So what does it all mean? We’re going to break risk management down into six constituent parts. I’m using a particular standard that breaks it down this way. Other standards will do this in different ways. We’ll talk about that later. Here we’ve got risk management broken down into: hazard identification, hazard analysis, risk estimation, risk evaluation (and ALARP), risk reduction, and risk acceptance.

Risk Management

Let’s get right on to that. Risk management – what is it? It’s defined as “the systematic application of management policies, procedures and practices to the tasks of hazard identification, hazard analysis, risk estimation, risk and ALARP evaluation, risk reduction, and risk acceptance”.

There are a couple of things to note here. We’re talking about management policies, procedures and practices. The ‘how’ we do it. Whether it’s a high-level policy or low-level common practice. E.g. how things are done in our organisation vs how the day-to-day tasks are done. And it’s also worth saying that when we talk about ‘hazards’, that’s a safety ‘ism’. If we were doing security risk management, we could be talking about ‘threats’. We could also be talking about ‘causes’ in day-to-day language. So, we could be talking about something causing a risk or leading to a risk. More on that later, but that’s an overview of what risk management is.

Part 1

Let’s look at it in a different way. For those of you who like a visual representation, here is a diagram of the hierarchical breakdown. The tasks need to happen in order, more-or-less, left to right. And as you can see, there’s a link between risk evaluation and risk reduction. We’ll come on to that. So, it’s not purely a serial ‘this is what you have to do’. Sometimes the tasks are linked together more intimately.

Hazard Identification

First of all, hazard identification. So, this is the process where we identify and list hazards and accidents associated with the system. You may notice that some words here are in bold. Where a word is in bold, we are going to give the definition of what it is later.

These hazards could lead to an accident, but we only consider those associated with the system. That’s the scope. If we were talking about a system that was an aeroplane, or a ship, or a computer, we would have a very different scope in each case. The ways that accidents might happen would also be different.

On a more practical level, how do we do hazard identification? I’m not going to go into any depth here, but there are certain classic methods. We can consult with our workers and inspect the workplace where they’re operating. In some countries, that’s a legal requirement (including in Australia, where I live). Another option is to look at historical data. Indeed, in some countries and in some industries, that’s a requirement, which means we have to do it. And we can use special analysis techniques. Now, I’m not going to talk about any of those analysis techniques today. You can watch some other sessions on The Safety Artisan to see that.

Hazard Analysis

Having done hazard identification, we’ve asked ourselves ‘What could go wrong?’. We can put some more detail on and ask, ‘How could it go wrong? And how often?’. That kind of stuff. So, we want to go into more detail about the hazards and accidents associated with this particular system. And that will help us to define some accident sequences. We can start with something that creates a hazard and then the hazard may lead to an accident. And that’s what we’re talking about. We will show that using graphics later, which will be helpful.

But again, more on terminology. In different industries, we call it different things. We tend to say ‘accident’ in the UK and Australia. In the U.S., they might call it a ‘mishap’, which is trying to get away from the idea that something was accidental. Nobody meant it to happen. Mishap is a more generic term that avoids that implication. We also talk about ‘losses’, or we talk about ‘breaches’ in the security world, where somebody has been able to get in somewhere that they should not. And we can talk about accident sequences. Or, in more common language, we call it a sequence of events. That’s all it is.

Risk Estimation

Now we’re talking about risk estimation. We’ve thought about our hazards and accidents and how they might progress from one to another. Let’s think about, ‘How big is the risk of this actually happening?’. Again, we’ll unpack this further later at the next level. But for now, we’re going to talk about the systematic use of available information. Systematic, so ordered. We’re following a process. This isn’t somebody on their own taking a subjective view: ‘Look, I think it’s not that’. It’s a process that is repeatable. We want to do something systematic. It’s thorough, it’s repeatable, and so it’s defendable. We can justify the conclusions that we’ve come to because we’ve done it with some rigour. We’ve done it in a systematic way. That’s important, particularly if we’re talking about harm coming to people or big losses.

Risk and ALARP Evaluation

Now, risk evaluation is just taking the risk that we have just estimated, comparing it to something and asking, “How serious is this risk?”. Is it something that is very low? If it’s insignificant then we’re not bothered about it. We can live with it. We can accept it. Or is it bigger than that? Do we need to do something more about it? Again, we want to be systematic. We want to determine whether risk reduction is necessary. Is this acceptable as it is or is it too high and we need to reduce it? That’s the core of risk evaluation.

This UK-based standard uses terminology that is found in different forms around the world. In the UK, they talk about ‘tolerability’. We’re talking about the absolute level of risk. There probably is an upper limit that’s allowed in the law or in our industry. And there’s a lower limit that we’re aiming for. In an ideal world, we’d like all our risks to be low-level risks. That would be terrific.

So, that’s ‘tolerability’. And you might hear it called different things. Within the UK system, there are three classes of ‘tolerability’ of risk. We could say it’s ‘broadly acceptable’: it’s very low. It’s down in the target region where we’d like to get all our risks. Or it’s ‘tolerable’: we can expose people to this risk or we can live with this risk, but only if we’ve met certain other criteria. And then there’s the risk that is so big, so far up there, that we can’t have it under any circumstances. It’s ‘unacceptable’. You can imagine a traffic light system where we have categorised our risk.

And then there’s the test of whether our risk can be accepted. In the UK, it’s called ALARP. We reduce the risk As Low As Reasonably Practicable. In other places, you’ll see SFARP. We’ve eliminated or minimised the risk So Far As Is Reasonably Practicable. In the nuclear industry, they talk about ALARA: As Low As Reasonably Achievable. Different laws use different tests. Whichever one you use, there’s a test that we have got to use to say, “Can we accept the risk?” “Have we done enough risk reduction?”. And whatever you’ve put in those square brackets, that’s the test that you’re using. That will vary from jurisdiction to jurisdiction. The basic concept of risk evaluation is taking the estimated level of risk and comparing it to some standard or some regulation. Whichever one it might be, that’s what we do. That’s risk evaluation.
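
The evaluation logic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `evaluate_risk`, the idea of expressing risk as a single number, and the two limit values are all hypothetical placeholders, not figures taken from the standard or from any law.

```python
def evaluate_risk(risk, upper_limit, lower_target, reduced_alarp):
    """Classify an estimated risk and say whether it can be accepted.

    All parameters are illustrative assumptions: 'risk' is a single number
    (for example a likelihood of harm per year), and the two limits would
    come from your own law, regulator or industry, not from this sketch.
    """
    if risk > upper_limit:
        # So far up there that we can't have it under any circumstances.
        return ("unacceptable", False)
    if risk <= lower_target:
        # Down in the target region where we'd like all our risks to be.
        return ("broadly acceptable", True)
    # In between: we can live with it only if the ALARP (or SFARP, ALARA)
    # test has been met.
    return ("tolerable", reduced_alarp)


# Hypothetical limits, chosen only to make the example run.
UPPER_LIMIT = 1e-3
LOWER_TARGET = 1e-6

print(evaluate_risk(5e-2, UPPER_LIMIT, LOWER_TARGET, reduced_alarp=True))
print(evaluate_risk(1e-4, UPPER_LIMIT, LOWER_TARGET, reduced_alarp=False))
print(evaluate_risk(1e-7, UPPER_LIMIT, LOWER_TARGET, reduced_alarp=False))
```

The middle case is the interesting one: the risk sits in the ‘tolerable’ band, but it cannot be accepted yet because the reduction test has not been met.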

Risk Reduction

We’ve asked, “Do we need to reduce risk further?”. And if we do, we need to do some risk reduction. Again, we’re being systematic. This is not some subjective thing where we go “I have done some stuff, it’ll be alright. That’s enough.”. We’re being a bit more rigorous than that. We’ve got a systematic process for reducing risk. And in many parts of the world, we’re directed to do things in a certain way.

This is an illustration from an Australian regulation. In this regulation, we’re aiming to eliminate risk. We want to start with the most effective risk reduction measures. Elimination is “We’ve reduced the risk to zero”. That would be lovely if we could do that but we can’t always do that.

What’s the next level? We could get rid of this risk by substituting something less risky. Imagine we’ve got a combustion engine powering something. The combustion engine needs flammable fuel and it produces toxic fumes. It could release carbon monoxide and CO2 and other things that we don’t want. We ask, “Can we get rid of that?”. Could we have an electric motor instead and have a battery instead? That might be a lot safer than the combustion engine. That is a substitution. There are still risks with electricity. But by doing this we’ve replaced something risky with something less risky.

Or we could isolate the hazard. Let’s use the combustion engine as an example again. We can say, “I’ll put the engine, the fuel and the exhaust somewhere a long way from people. Then it’ll be a long way from where it can do harm or cause a loss.” And that’s another way of dealing with it.

Or we could say, “I’m going to reduce the risks through engineering controls”. We could put in something engineered. For example, we can put in a smoke detector. A very simple, therefore highly reliable, device. It’s certainly more reliable than a human. You can install one that can detect some noxious gases. It’s also good if it’s a carbon monoxide detector. Humans cannot detect carbon monoxide at all. (Except if you’ve got carbon monoxide poisoning, you’ll know about it. Carbon monoxide poisoning gives you terrible headaches and other symptoms.) But of course, that’s not a good way to detect that you’re breathing in poisonous gas. We do not want to do it that way.

So, we can have an engineering control to protect people. Or we can have an interlock. We can isolate things in a building or behind a wall or whatever. And if somebody opens the door, then that forces the thing to cut out so it’s no longer dangerous. There are different engineering controls that we can introduce. They do not rely on people. They work regardless of what any person does.

Next on the list, we could reduce exposure to the hazard by using administrative controls. That’s giving somebody some rules or a procedure to follow. “Do this. Don’t do that.” Now, that’s all good. We can give people warning signs and warn people not to approach something. But, of course, sometimes people break the rules for good reasons. Maybe they don’t understand. Maybe they don’t know the danger. Maybe they’ve got to do something or maybe the procedure that we’ve given them doesn’t work very well. It’s too difficult to get the job done, so people cut corners. So, procedural protection can be weak. And a bit hit and miss sometimes.

And then finally, we can give people personal protective equipment. We can give them some eye protection. I’m wearing glasses because I’m short-sighted. But you can get some goggles to protect your eyes from damage. Damage like splashes, flying fragments, sparks, etc. We can have a hard hat so that if we’re on a building site and something drops on us from above, it protects the old brain box. It won’t stop the accident from happening, but it will help reduce the severity of the accident. That’s the least effective measure. We’re doing nothing to prevent the accident from happening. We’re reducing the severity in certain circumstances. For example, if you drop a ton of bricks on me, it doesn’t matter whether I’m wearing a hard hat or not. I’m still going to get crushed. But with one brick, I should be able to survive that if I’m wearing a hard hat.
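
The ordering of risk reduction measures described above, from elimination down to personal protective equipment, can be captured as a simple ranked list. The following Python is a minimal sketch: the `Control` enumeration and the `rank_controls` helper are names made up for this illustration, and the numbers only record the order of preference, not any measured effectiveness.

```python
from enum import IntEnum


class Control(IntEnum):
    """Types of risk reduction, most effective first (lower number = preferred)."""
    ELIMINATION = 1
    SUBSTITUTION = 2
    ISOLATION = 3
    ENGINEERING = 4
    ADMINISTRATIVE = 5
    PPE = 6


def rank_controls(candidates):
    """Sort (Control, description) pairs so the most effective type comes first."""
    return sorted(candidates, key=lambda pair: pair[0])


# Candidate measures for the combustion engine example (descriptions are illustrative).
candidates = [
    (Control.PPE, "hard hat and goggles"),
    (Control.ADMINISTRATIVE, "warning signs and a procedure"),
    (Control.ENGINEERING, "carbon monoxide detector and door interlock"),
    (Control.SUBSTITUTION, "electric motor and battery instead of the engine"),
]

for control, description in rank_controls(candidates):
    print(control.name, "-", description)
```

Sorting the candidates this way simply puts the options to consider first at the top of the list; whether a given measure is reasonably practicable is still a judgement for people to make.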

Risk Acceptance

Let’s move on to risk acceptance. At some stage, we have reduced the risk to a point where we can accept it. We can live with it and we’ve decided that we need to do whatever it is that is exposing us to the risk. We need to use the system. We want to get in our car to enable us to go from A to B quickly and independently. So, we’re going to accept the risk of driving in our car. We’ve decided we’re going to do that. We make risk acceptance decisions every day, often without thinking about it. We get in a car every day on average and we don’t worry about the risk, but it’s always there. We’ve just decided to accept it.

But in this example we’ve got, it’s not an individual deciding to do something on the spur of the moment. Nor is it based on personal experience. We’ve got a systematic process where a bunch of people come together. The relevant stakeholders agree that a risk has been assessed or has been estimated and has been evaluated. They agree that the risk reduction is good enough and that we will accept that risk. There’s a bit more to it than you and I saying, “That’ll be alright.”

Part 2

Let’s summarise where we’ve got to. We’ve talked about these six components of risk management. That’s terrific. And as you can see, they all go together. Risk evaluation and risk reduction are more tightly coupled. That’s because when we do some risk reduction, we then re-evaluate the risk. We ask ‘Can we accept it?’. If the answer is ‘No.’ we need to do some more work. Then we do some more risk reduction. So those tend to be a bit more coupled together at the end. That’s the level we’ve got to. We’re now going to go to the next level.
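
That coupling between risk reduction and risk evaluation is a loop: reduce, re-evaluate, and repeat until the answer to ‘Can we accept it?’ is yes. Here is a rough Python sketch of the loop. The function name, the single-number model of risk, and the assumption that each measure multiplies the risk by a fixed factor are all simplifications invented for this example.

```python
def reduce_until_acceptable(risk, acceptable_level, measures):
    """Apply risk reduction measures one at a time, re-evaluating after each.

    'measures' is a list of (name, factor) pairs, where factor is the assumed
    fraction of the risk that remains after the measure is applied.
    Returns the final risk, the measures applied, and whether it is acceptable.
    """
    applied = []
    for name, factor in measures:
        if risk <= acceptable_level:   # evaluate: can we accept it?
            break
        risk *= factor                 # reduce: do some more work
        applied.append(name)
    return risk, applied, risk <= acceptable_level


# Made-up numbers in arbitrary units, purely for illustration.
measures = [
    ("isolate the engine", 0.5),
    ("fit an interlock", 0.5),
    ("warning signs", 0.5),
]
print(reduce_until_acceptable(8.0, 2.0, measures))
```

With these made-up numbers the loop stops after two measures, because the risk has reached the acceptable level and the third measure is not needed.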

So, we’re going to explain these things. We’ve talked about hazard identification and hazard analysis, but what is a hazard? And what is an accident? And what is an accident sequence? We’re going to unpack that a bit more. We’re going to take it to the next level. And throughout this, we’re talking about risk over and over again. Well, what is ‘risk’? We’re going to unpack that to the next level as well. It all comes down to this anyway. This is a safety standard. We’re talking about harm to people. How likely is that harm and how severe might it be? But it might be something else. It might be a loss or a security breach. It might be a financial loss. It might be a negative result for our project. We might find ourselves running late. Or we’re running over budget. Or we’re failing to meet quality requirements. Or we’re failing to deliver the full functionality that we said we would. Whatever it might be.

Hazard

So, let’s unpack this at the next level. A hazard is a term that we use, particularly in safety. As I say, we call it other things in different realms. But in the safety world, it’s a physical situation or it’s a state of a system. And as it says, it often follows from some initiating event which we may call a ‘cause’. And the hazard may lead to an accident. And the key thing to remember is once a hazard exists, an accident is possible, but it’s not certain. You can imagine the sort of cartoon banana skin on the pavement gag. Well, the banana skin is the hazard. In the cartoon, the cartoon character always steps on the banana skin. They always fall over for comic effect. But in the real world, nobody may tread on the banana skin and slip over. There could be nobody there to slip over on the banana skin. Or even if somebody does, they could catch themselves. Or they fall, but it’s on a soft surface and they don’t hurt themselves so there’s no harm.

So, the accident isn’t certain. And in fact, we can have what we call ‘non-accident’ outcomes. We can have harmless consequences. A hazard is an important midway step. I heard it called an accident waiting to happen, which is a helpful definition. An accident waiting to happen, but it doesn’t mean that the accident is inevitable.

Accident

But the accident can happen. Again, the ‘accident’, ‘mishap’, or ‘unintended event’. Something we did not want or a sequence of events that causes harm. And in this case, we’re talking about harm to people. And as I say, it might be a security breach. It might be a financial loss. It might be reputational damage. Something might happen that is very embarrassing for an organisation or an individual. Or again, we could have a hiccup with our project.

Harm

But in this case, we’re talking about harm. And in this kind of standard, we’re using what you might call a body count approach to the harm. We’re talking about actual death, physical injury, or damage to the health of people. This standard also considers the damage to property and the environment. Now, very often we are legally required to protect people and the environment from harm. Property less so. But there will be financial implications of losses of property or damage to the systems. We don’t want that. But it’s not always criminally illegal to do that. Whereas usually, hurting people and damaging the environment is. So, this is ‘harm’. We do not want this thing to happen. We do not want this impact. Safety is a much tougher business in this instance. If we have a problem with our project, it’s embarrassing but we could recover it. It’s more difficult to do that when we hurt somebody.

Risk

And always in these terms, we’re talking about ‘risk’. What is ‘risk’? Risk is a combination of two things. It’s a combination of the likelihood of harm or loss and the severity of that harm or loss. It’s those two things together. And we’ve got a very simple illustration here, a little table. These tables are often known as risk matrices, but don’t worry about that too much. Whatever you want to call it. We’ve got a little two by two table here and we’ve got likelihood in the white text and severity in the black. We can imagine a risk where we have a low likelihood of a ‘low harm’ or a ‘low impact’ accident or outcome. We say, ‘That’s unlikely to happen and even if it does, not much is going to happen.’ It’s going to be a very small impact. So, we’d say that that’s a low risk.

Then at the other end of the spectrum, we can imagine something that has a high likelihood of happening. And that outcome also has a high impact. These are things that we definitely do not want to happen. And we say, ‘That’s a high risk and that’s something that we are very, very concerned about.’

And then in the middle, we could have a combination of an outcome that is quite likely, but it’s of low severity. Or it’s of high severity, but it’s unlikely to happen. And we say, ‘That’s a medium risk’.

Now, this is a very simplified matrix for teaching purposes only. In the real world, you will see matrices that are four by four, or five by five, or even six by six, or combinations thereof. And in security, where they talk about threat and vulnerability and the outcomes, you might see multiple matrices used. They use multiple matrices to progressively build up a picture of the risk. They use matrices as building blocks. So, it may not be only one matrix used in a more complex thing you’ve got to model. But here we’ve got a nice, simple example. This illustrates what risk is. It’s a combination of severity and likelihood of harm or loss. And that’s what risk is, fundamentally. And if we have a firm grasp of these fundamentals, it’ll help us to reason and deal with almost anything. With enough application.
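
For those who like to see it written down, the two by two teaching matrix can be expressed as a small lookup in Python. This is a sketch of the simplified matrix described above only; the names `Level` and `risk_rating` are invented for the example, and a real matrix would have more rows and columns defined by your own standard.

```python
from enum import Enum


class Level(Enum):
    LOW = "low"
    HIGH = "high"


def risk_rating(likelihood, severity):
    """Combine likelihood and severity using the two by two teaching matrix."""
    if likelihood is Level.LOW and severity is Level.LOW:
        return "low"
    if likelihood is Level.HIGH and severity is Level.HIGH:
        return "high"
    # Likely but low severity, or severe but unlikely.
    return "medium"


# Print every cell of the matrix.
for likelihood in Level:
    for severity in Level:
        print(likelihood.value, severity.value, "->", risk_rating(likelihood, severity))
```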

Accident Sequence

Now, let’s move on and talk about accident sequences. We’re talking about a progression in this case. We’re imagining a left-to-right path. A progression of events that results in an accident. This diagram, which looks like a bow tie, is meant to represent the idea that we can have one hazard. There might be many causes that lead to this hazard. There might be many different things that could create the hazard or initiate the hazard. And the hazard may have many different consequences.

As I’ve said before, nothing at all may happen. That might be the consequence of the hazard. Most of the time that’s what’s going to happen. But there may be a variety of consequences. Somebody might get a minor injury or there might be a more serious accident where one or more people are killed. A good example of this is fire. So, the hazard is the fire. The causes might be various. We could be dealing with flammable chemicals, or a lightning strike, or an electrical arc flash. Or we could be dealing with very high temperatures where things spontaneously burst into flames. Or we could have a chemical in the presence of pure oxygen. Some things will spontaneously burst into flames in the presence of pure oxygen. So there are a variety of causes that lead to the fire.

And the fire might be very small and burn itself out. It causes very little damage and nobody gets hurt. Or it might lead to a much bigger fire that, in theory, could kill lots of people. So, there’s a huge range of consequences potentially from one hazard. But the accident sequence is how we would describe and capture this progression. From initiating events to the hazard to the possible consequences. And by modelling the accident sequence, of course, we can think about how we could interrupt it.
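
The bow tie idea, with many causes on the left, one hazard in the middle and many consequences on the right, maps naturally onto a small data structure. The Python below is a minimal sketch using the fire example; the `BowTie` class and its `accident_sequences` method are hypothetical names for this illustration, not part of any standard or tool.

```python
from dataclasses import dataclass, field


@dataclass
class BowTie:
    """One hazard with its possible causes (left) and consequences (right)."""
    hazard: str
    causes: list = field(default_factory=list)
    consequences: list = field(default_factory=list)

    def accident_sequences(self):
        """List every cause -> hazard -> consequence path through the bow tie."""
        return [
            (cause, self.hazard, consequence)
            for cause in self.causes
            for consequence in self.consequences
        ]


fire = BowTie(
    hazard="fire",
    causes=["flammable chemicals", "lightning strike", "electrical arc flash"],
    consequences=["burns itself out, no harm", "minor injury", "large fire, multiple fatalities"],
)

for cause, hazard, consequence in fire.accident_sequences():
    print(cause, "->", hazard, "->", consequence)
```

Listing the paths like this makes it easier to ask, for each one, where the sequence could be interrupted.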

Part 3

We’ve broken risk management down into those six constituent parts. We’ve gone to the next level, in that we’ve sort of gone down to the concepts that underpin these things. These hazards, the accidents, and the accident sequence. We’ve talked about risk itself and what we don’t want to happen. The harm, the loss, the financial loss, the embarrassment, the failed or late or over-budget project, a security breach, the undesired event, etc. We had an objective which was to do something safely or to complete a project and the risk is that that won’t happen. That there’ll be an impact on what we were trying to do that is negative. That is undesirable.

There are just a few more concepts that we need to look at to complete the pattern, as you can see. We’ve been talking about the system. And we’ve been talking about doing things systematically. And then a system works in an operating environment. So, let’s unpack that.

System

First of all, we have a system. The system is going to be a combination of things. I wouldn’t call a pen or a pencil a system. It’s only got a couple of components. You could pull it apart. But it’s too simple to be worth calling it a system. We wouldn’t call it a pen system, would we? So, a system is something more complex. It’s a combination of things and we need to define the boundary. I’ll come back to that.

But within this boundary, we’ve got some different elements in the system that work together. Or they’re used together within a defined operating environment. So, we’re going to expose this system to a range of conditions which it is designed to usually work in. The intention is the system is going to do whatever it does to perform a given task. It can do one defined task or achieve a specific purpose. I talked before about getting in our car. A car is complex enough to be called a system. We get in our car and we drive it on the roads. Or if we’ve got a four-wheel drive, we can drive off-road. Or we can use it in a more demanding operating environment to achieve a specific purpose. We want to transport ourselves, and sometimes some stuff, from A to B. That’s what we’re trying to do with the system.

And within that system, we may have personnel/people, we may have procedures. A bunch of rules about how you drive a car legally in different countries. We’ve got materials and physical things – what the car is made of. We could have tools to repair it or change wheels. We’ve got some other equipment, like a satnav. We’ve got facilities. We need to take a car somewhere to fill up with fuel or to recharge it. We’ve got services like garages, repairs, servicing, etc. And there could be some software in there as well. Of course, these days there’s software everywhere in the car, as in most complex devices.

So, our system is a combination of lots of different things. These things are working together to achieve some kind of goal or some kind of result. There’s somewhere we want to get to. And it’s designed to work in a particular operating environment. Cars work on roads really well. Off-road cars can work on tracks. Put them in deep water, they tend not to work so well. So, let’s talk about that operating environment.

Operating Environment

What we’ve got here is the total set of all external, natural, and induced conditions. (That’s external to the system, so outside the boundary.) These conditions might be natural, or they might be generated by something else, and the system is exposed to them at any given moment. And we need to get a good understanding of the system, the operating environment, and what we want it to do.

If we have a good understanding of those three things, then we will be well on the way to being able to understand the risks associated with that system. That’s one of the key things with risk management. If you’ve got those three things, that’s crucial. You will not be able to do effective risk management if you don’t have a grasp of those things. And if you do have a thorough grasp of those things, it’s going to help you do effective risk management.

Conclusion

So, we’ve talked about risk management. We’ve broken it down into some big sections. Those six sections are hazard identification, hazard analysis, risk estimation, risk evaluation, risk reduction, and risk acceptance. We’ve seen how those things depend on only a few concepts. We’ve got the concepts of ‘hazards’, ‘risks’, and ‘accidents’. As well as the undesirable consequences that the risk might result in. And the risk is measured based on the likelihood and severity of that harm or that loss occurring.

And when we’re dealing with a more complex system, we need to understand that system and the environment in which it operates. And of course, we’ve put it in that environment for a purpose. And that unpacking has allowed us to break down quite a big concept, risk management. A lot of people, like myself, spend years and years learning how to do this. It takes time to gain experience because it’s a complex thing. But if we break it down, we can understand what we’re doing. We can work our way down to the fundamentals. And then if we’ve got a good grasp of the fundamentals, that supports getting the more complex stuff right. So, that’s what risk management is all about. That’s your risk management 101 and I hope that you find that helpful.

Copyright Statement

I just need to say briefly that I can use those quotations from the standard under a Creative Commons licence, the CC4.0. That allows me to do that within limits that I am careful to observe. But this video presentation is copyright The Safety Artisan.

For More…

And you can see more like these at the Safety Artisan website. That’s www.safetyartisan.com. And as you can see, it’s a secure site so you can visit without fear of a security breach. So, do head over there. Subscribe to the monthly newsletter to get discounts on paid videos and regular updates of what’s coming up, both paid and free.

So, it just remains for me to say thanks very much for watching and I look forward to catching up with you again very soon.

End of Risk Management 101

This session can also be found at Udemy.com along with more advanced courses like this one. For more introductory sessions on this site start here.

How to Understand Safety Standards

Learn How to Understand Safety Standards with this FREE session from The Safety Artisan.

In this module, Understanding Your Standard, we’re going to ask the question: Am I Doing the Right Thing, and am I Doing it Right? Standards are commonly used for many reasons. We need to understand our chosen system safety engineering standard in order to know: the concepts upon which it is based; what it was designed to do, why, and for whom; which kinds of risk it addresses; what kinds of evidence it produces; and its advantages and disadvantages.

Understand Safety Standards: You’ll Learn to

  • List the hazard analysis tasks that make up a program; and
  • Describe the key attributes of Mil-Std-882E. 
Understanding Your Standard

Topics:  Understand Safety Standards

Aim: Am I Doing the Right Thing, and am I Doing it Right?

  • Standards: What and Why?
  • System Safety Engineering pedigree;
  • Advantages – systematic, comprehensive, etc.; and
  • Disadvantages – cost/schedule, complexity & quantity not quality.

Transcript: Understand Safety Standards

In Module Three, we’re going to understand our Standard. The standard is the thing that we’re going to use to achieve things – the tool. And that’s important because tools designed to do certain things usually perform well. But they don’t always perform well on other things. So we’re going to ask ‘Are we doing the right thing?’ And ‘Are we doing it right?’

What and Why?

So, what are we going to do, and why are we doing it? First of all, the use of standards in safety is very common for lots of reasons. It helps us to have confidence that what we’re doing is good enough. We’ve met a standard of performance in the absolute sense. It helps us to say, ‘We’ve achieved standardization or commonality in what we’re doing’. And we can also use it to help us achieve a compromise. That can be a compromise across different stakeholders or across different organizations. And standardization gives us some of the other benefits as well. If we’re all doing the same thing rather than we’re all doing different things, it makes it easier to train staff. This is one example of how a standard helps.

However, we need to understand this tool that we’re going to use. What it does, what it’s designed to do, and what it is not designed to do. That’s important for any standard or any tool. In safety, it’s particularly important because safety is in many respects intangible. This is because we’re always looking to prevent a future problem from occurring. In the present, it’s a little bit abstract. It’s a bit intangible. So, we need to make sure that in concept what we’re doing makes sense and is coherent. That it works together. If we look at those five bullet points there, we need to understand the concept of each standard. We need to understand the basis of each one.

And they’re not all based on the same concept. Thus some of them are contradictory or incompatible. We need to understand the design of the standard. What the standard does, what the aim of the standard is, why it came into existence. And who brought it into existence. To do what for who – who’s the ultimate customer here?

And for risk analysis standards, we need to understand what kind of risks it addresses. Because the way you treat a financial risk might be very different from a safety risk. In the world of finance, you might have a portfolio of products, like loans. These products might have some risks associated with them. One or two loans might go bad and you might lose money on those. But as long as the whole portfolio is making money that might be acceptable to you. You might say, ‘I’m not worried that 10% of my loans have gone south. I’m still making plenty of profit out of the other 90%’. It doesn’t work that way with safety. You can’t say ‘It’s OK that I’ve killed a few people over here because all this lot over here are still alive!’. It doesn’t work like that!
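The contrast between the two kinds of risk can be sketched in a few lines of Python. This is only an illustration: the function names and all the figures are made up, and real acceptance criteria are far richer than a single check.

```python
# Hypothetical figures and names, for illustration only.

def portfolio_acceptable(loan_results):
    """Financial view: individual losses are tolerated if the total is a profit."""
    return sum(loan_results) > 0


def safety_acceptable(hazards_controlled):
    """Safety view: every single risk must be acceptable in its own right."""
    return all(hazards_controlled)


# Nine loans each make 10 units of profit; one loses 20 units.
loans = [10] * 9 + [-20]
# Nine hazards are adequately controlled; one is not.
hazards = [True] * 9 + [False]

print(portfolio_acceptable(loans))   # True: the portfolio as a whole makes money
print(safety_acceptable(hazards))    # False: one uncontrolled hazard is enough
```

The point of the sketch is the difference between summing over a portfolio and requiring every item to pass, which is the distinction the loan example is making.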

Also, what kind of evidence does the standard produce? Because in safety, we are very often working in a legal framework that requires us to do certain things. It requires us to achieve a certain level of safety and prove that we have done so. So, we need certain kinds of evidence. In different jurisdictions and different industries, some evidence is acceptable and some is not. You need to know which applies in your area.

And then finally, let’s think about the pros and cons of the standard. What does it do well? And what does it not do so well?

System Safety Pedigree

We’re going to look at a standard called Military Standard 882E. Many decades ago, this standard was created by the US government and military to help them bring into service complex, cutting-edge military equipment. Equipment that was always on the cutting edge. That pushed the limits of what you could achieve in performance.

That’s a lot of complexity. Lots of critical weapon systems, and so forth. And they needed something that could cope with all that complexity. It’s a system safety engineering standard. It’s used by engineers, but also by many other specialists. As I said, it’s got a background from military systems. These days you find these principles used pretty much everywhere. So, all the approaches to System Safety that 882 introduced are in other standards. They are also in other countries.

It addresses risks to people, equipment, and the environment, as we heard earlier. And because it’s an American standard, it’s about system safety. It’s very much about identifying requirements. What do we need to happen to get safety? To do that, it produces lots of requirements. It performs analyses on all those requirements and generates further requirements. And it produces requirements for test evidence. We then need to fulfill these requirements. It’s got several important advantages and disadvantages. We’re going to discuss these in the next few slides.

Comprehensive Analysis

Before we get to that, we need to look at the key feature of this standard. The strengths and weaknesses of this standard come from its comprehensive analysis. And the chart (see the slide) is meant to show how we are looking at the system from lots of different perspectives. (It’s not meant to be some arcane religious symbol!) So, we’re looking at a system from 10 different perspectives, in 10 different ways.

Going around clockwise, we’ve got these ten different hazard analysis tasks. First of all, we start off with preliminary hazard identification. Then preliminary hazard analysis. We do some system requirements hazard analysis. So, we identify the safety requirements that the system is going to meet so that we are safe. We look at subsystem and system hazard analysis, and at operating and support hazard analysis – people working with the system. Number seven, we look at health hazard analysis – can the system cause health problems for people? Functional hazard analysis, which is all about what it does. We’re thinking of software and data-driven functionality. Maybe there’s no physical system, but it does stuff. It delivers benefits or risks. System of systems hazard analysis – we could have lots of different and/or complex systems interacting. And then finally, the tenth one – environmental hazard analysis.

If we use all these perspectives to examine the system, we get a comprehensive analysis of the system. From this analysis, we should be confident that we have identified everything we need to. All the hazards and all the safety requirements that we need to identify. Then we can confidently deliver an appropriate safe system. We can do this even if the system is extremely complex. The standard is designed to deal with big, complex cutting-edge systems.
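As a rough aid to memory, the ten perspectives can be written down as a simple Python enumeration, numbered in the order the transcript walks round the chart. The helper missing_perspectives is a hypothetical name invented for this sketch; it just shows which viewpoints a planned program would leave out.

```python
from enum import Enum


class HazardAnalysisTask(Enum):
    """The ten tasks, numbered in the order given in the transcript."""
    PRELIMINARY_HAZARD_IDENTIFICATION = 1
    PRELIMINARY_HAZARD_ANALYSIS = 2
    SYSTEM_REQUIREMENTS_HAZARD_ANALYSIS = 3
    SUBSYSTEM_HAZARD_ANALYSIS = 4
    SYSTEM_HAZARD_ANALYSIS = 5
    OPERATING_AND_SUPPORT_HAZARD_ANALYSIS = 6
    HEALTH_HAZARD_ANALYSIS = 7
    FUNCTIONAL_HAZARD_ANALYSIS = 8
    SYSTEM_OF_SYSTEMS_HAZARD_ANALYSIS = 9
    ENVIRONMENTAL_HAZARD_ANALYSIS = 10


def missing_perspectives(planned):
    """Return the tasks that a planned program leaves out, in task order."""
    return [task for task in HazardAnalysisTask if task not in planned]


# A hypothetical program that only plans the two preliminary tasks.
planned = {
    HazardAnalysisTask.PRELIMINARY_HAZARD_IDENTIFICATION,
    HazardAnalysisTask.PRELIMINARY_HAZARD_ANALYSIS,
}
for task in missing_perspectives(planned):
    print(task.value, task.name)
```

Leaving tasks out is not wrong in itself; the sketch only makes the omissions visible so that they are a deliberate choice.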

Advantages #1

In fact, as we move on to advantages, that’s the number one advantage of this standard. If we use it and we use all 10 of those tasks, we can cope with the largest and the most demanding programs. I spent much of my career working on the Eurofighter Typhoon. It was a multi-billion-dollar program. It cost hundreds of billions of dollars, and four different nations worked together on it. We used a derivative of Mil. Standard 882 to look at safety and analyze it. And it coped. It was powerful enough to deal with that gigantic program. I spent 13 years of my life on and off on that program so I’d like to think that I know my stuff when we’re talking about this.

As we’ve already said, it’s a systematic approach to safety. System safety engineering. And we can start very early. We can start with early requirements discovery. We don’t even need a design – we know that we have a need. So we can think about those needs and analyze them.

And it can cover us right through until final disposal. And it covers all kinds of elements that you might find in a system. Remember our definition of ‘system’? It’s something that consists of hardware, software, data, human beings, etc. The standard can cope with all the elements of a system. In fact, it’s designed into the standard. It was specifically designed to look at all those different elements. Then to get different insights from those elements. It’s designed to get that comprehensive coverage. It’s really good at what it does. And it involves, not just engineers, but people from all kinds of other disciplines. Including operators, maintainers, etc, etc.

I came from a maintenance background. I was either directly or indirectly supporting operators. I was responsible for trying to help them get the best out of their system. Again, that’s a very familiar world to me. And rigorous standards like this can help us to think rigorously about what we’re doing. And so get results even in the presence of great complexity, which is not always a given, I must say.

So, we can be confident by applying the standard. We know that we’re going to get a comprehensive and thorough analysis. This assures us that what we’re doing is good.

Advantages #2

So, there’s another set of advantages. I’ve already mentioned that we get assurance. Assurance is ‘justified confidence’. So we can have high confidence that all reasonably foreseeable hazards will be identified and analyzed. And if you’re in a legal jurisdiction where you are required to hit a target, this is going to help you hit that target.

The standard was also designed for use in contracts. It’s designed to be applied to big programs. We’d define that as where we are doing the development of complex high-performance systems. So, there are a lot of risks. It’s designed to cope with those risks.

Finally, the standard also includes requirements for contracting, for interfaces with other systems, for interfaces with systems engineering. This is very important for a variety of disciplines. It’s important for other engineering and technical disciplines. It’s important for non-technical disciplines and for analysis and recordkeeping. Again, all these things are important, whether it is for legal reasons or not. We need to do recordkeeping. We need to liaise with other people and consult with them. There are legal requirements for that in many countries. This standard is going to help us do all those things.

But, of course, every standard has pros and cons, and Mil. Standard 882 is no exception. So, let’s look at some of the disadvantages.

Disadvantages #1

First of all, a full system safety program might be overkill for the system that you want to use, or that you want to analyze.  The Cold War, thank goodness, is over; generally speaking, we’re not in the business of developing cutting-edge high-performance killing machines that cost billions and billions of dollars and are very, very risky. These days, we tend to reduce program risk and cost by using off-the-shelf stuff and modifying it. Whether that be for military systems, infrastructure in the chemical industry, transportation, whatever it might be. Very much these days we have a family of products and we reuse them in different ways. We mix and match to get the results that we want.

And of course, all this comprehensive analysis is not cheap and it’s not quick. It may be that you’ve got a program that is schedule-constrained. Or you want to constrain the cost and you cannot afford the time and money to throw a full 882 program at it. So, that’s a disadvantage.

The second family of problems is that these kinds of safety standards have often been applied prescriptively. The customer would often say, ‘Go away and go and do this. I’m going to tell you what to do based on what I think reduces my risk’. Or at least it covers their backside. So, contractors got used to being told to do certain things by purchasers and customers. The customers didn’t understand the standards that they were applying and insisting upon. So, the customers did not understand how to tailor a safety standard to get the result that they wanted. So they asked for dumb things or things that didn’t add value. And the contractors got used to working in that kind of environment. They got used to being told what to do and doing it because they wouldn’t get paid if they didn’t. So, you can’t really blame them.

But that’s not great, OK? That can result in poor behaviors. You can waste a lot of time and money doing stuff that doesn’t actually add value. And everybody recognizes that it doesn’t add value. So you end up bringing the whole safety program into disrepute and people treat it cynically. They treat it as a box-ticking exercise. They don’t apply creativity and imagination to it. Much less determination and persistence. And that’s what you need for a good effective system safety program. You need creativity. You need imagination. You need people to be persistent and dedicated to doing a good job. You need that rigor so that you can have the confidence that you’re doing a good job because it’s intangible.

Disadvantages #2

Let’s move on to the second family of disadvantages. And this is the one that I’ve seen the most, actually, in the real world. If you do all 10 tasks, and even if you don’t do all 10, you can create too many hazards. If you recall the graphic from earlier, we have 10 tasks. Each task looks at the system from a different angle. What you can get is lots and lots of duplication in hazard identification. You can have essentially the same hazards identified over and over again in each task. And there’s a problem with that, in two ways.

First of all, quality suffers. We end up with a fragmented picture of hazards. We end up with lots and lots of hazards in the hazard log, but not only that. We get fragments of hazards rather than the real thing. Remember I described those tests for what a hazard really is? Very often you can get causes masquerading as hazards. Or other things that are exacerbating factors that make things worse. They’re not a hazard in their own right, but they get recorded as hazards. And that problem results in people being unable to see the big picture of risk. So that undermines what we’re trying to do. And as I say, we get lots of things misidentified and thrown into the pot. This also distracts people. You end up putting effort into managing things that don’t make a difference to safety. They don’t need to be managed. Those are the quality problems.

And then there are quantity problems. And from personal experience, having too many hazards is a problem in itself.  I’ve worked on large programs where we were managing 250 hazards or thereabouts. That is challenging even with a sizable, dedicated team. That is a lot of work in trying to manage that number of hazards effectively. And there’s always the danger that it will slide into becoming a box-ticking exercise. Superficial at best.

I’ve also seen projects that have two and a half thousand hazards or even 4000 hazards in the hazard log. Now, once you get up to that level, that is completely unmanageable. People who have thousands of hazards in a hazard log and think they’re managing safety are kidding themselves. They don’t understand what safety is if they think that’s going to work. So, you end up with all these items in your hazard log, which become a massive administrative burden. So people end up taking shortcuts and the real hazards are lost. The real issues that you want to focus on are lost in the sea of detail that nobody will ever understand. You won’t be able to control them.

Unfortunately, Mil. Standard 882 is good at generating these grotesque numbers of hazards. If you don’t know how to use the standard and don’t actively manage this issue, it gets to this stage. It can go, and does go, badly wrong. This is particularly true on very big programs. And you really need clarity on big projects.
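One way to picture the duplication problem is a small Python sketch that groups hazard log entries raised by more than one task. Everything here is an assumption made for illustration: the log format, the function name group_duplicates, and the example entries. Real duplicates rarely match word for word, so a text match like this could only ever be a first filter ahead of human review.

```python
from collections import defaultdict


def group_duplicates(hazard_log):
    """Group log entries whose descriptions match after simple normalization.

    hazard_log: list of (task_name, description) tuples.
    Returns {normalized_description: [task_name, ...]} for descriptions
    that were raised by more than one entry.
    """
    raised_by = defaultdict(list)
    for task, description in hazard_log:
        # Ignore case and repeated spaces when comparing descriptions.
        key = " ".join(description.lower().split())
        raised_by[key].append(task)
    return {key: tasks for key, tasks in raised_by.items() if len(tasks) > 1}


# Hypothetical hazard log entries.
log = [
    ("Preliminary hazard analysis", "Fuel leak near ignition source"),
    ("Subsystem hazard analysis", "fuel leak near  ignition source"),
    ("Operating and support hazard analysis", "Fuel leak near ignition source"),
    ("Health hazard analysis", "Exposure to fuel vapour"),
]

for description, tasks in group_duplicates(log).items():
    print(f"{description!r} raised {len(tasks)} times: {tasks}")
```

Merging the three matching entries into one hazard, with the three tasks recorded as sources, keeps the log smaller without losing where each insight came from.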

Summary of Module

Let’s summarize what we’ve done with this module. The aim was to help us understand whether we’re doing the right thing and whether we’ve done it right. And standards are terrific for helping us to do that. They help us to ensure we’re doing the right thing. That we’re looking at the right things. And they help us to ensure that we’re doing it rigorously and repeatedly. All the good quality things that we want. And Mil. Standard 882E that we’re looking at is a system safety engineering standard. So it’s designed to deal with complexity and high-performance and high-risk. And it’s got a great pedigree. It’s been around for a long time.

Now that gives advantages. So, we have a system safety program with this standard that helps us to deal with complexity. That can cope with big programs, with lots of risks. That’s great.

The disadvantages of this standard are that if we don’t know how to tailor or manage it properly, it can cost a lot of money. It can take a lot of time to give results which can cause problems for the program. And ultimately, you can accidentally ignore safety if you don’t deliver on time. And it can generate complexity. And it can generate a quantity of data that is so great that it actually undermines the quality of the data. It undermines what we’re trying to achieve. In that, we get a fragmented picture in which we can’t see the true risks. And so we can’t manage them effectively. If we get it wrong with this standard, we can get it really wrong. And that brings us to the end of this module.

This is Module 3 of SSRAP

This is Module 3 from the System Safety Risk Assessment Program (SSRAP) Course. Risk Analysis Programs – Design a System Safety Program for any system in any application. You can access the full course here.

You can find more introductory lessons at Start Here.

System Safety Risk Assessment

Learn about System Safety Risk Assessment with The Safety Artisan.

In this module, we’re going to look at how we deal with the complexity of the real world. We do a formal risk analysis because real-world scenarios are complex. The Analysis helps us to understand what we need to do to keep people safe. Usually, we have some moral and legal obligation to do it as well. We need to do it well to protect people and prevent harm to people.

You Will Learn to:

  • Explain what a system safety approach is and does; and
  • Define what a risk analysis program is.
System Safety Risk Analysis.

Topics: System Safety Risk Assessment

Aim: How do we deal with real-world complexity?

  • What is System Safety?
  • The Need for Process;
  • A Realistic, Useful, Powerful process:
    • Context, Communication & Consultation; and
    • Monitoring & Review, Risk Treatment.
  • Required Risk Reduction.

Transcript: System Safety Risk Assessment

In this module, on System Safety Risk Assessment, we’re going to look at how we deal with the complexity of the real world. We do a formal risk analysis because real-world scenarios are complex. The Analysis helps us to understand what we need to do to keep people safe. Usually, we have some moral and legal obligation to do it as well. We need to do it well to protect people and prevent harm to people.

What is System Safety?

To start with, here’s a little definition of system safety. System safety is the application of engineering and management principles, criteria, and techniques to achieve acceptable risk within a wider context. This wider context is operational effectiveness – We want our system to do something. That’s why we’re buying it or making it. The system has got to be suitable for its use. We’ve got some time and cost constraints and we’ve got a life cycle. We can imagine we are developing something from concept, from cradle to grave.

And what are we developing? We’re developing a system. An organization of hardware, software, material, facilities, people, data, and services. All these pieces will perform a designated function within the system. The system will work within a stated or defined operating environment. It will work with the intention to produce specified results.

We’ve got three things there. We’ve got a system. We’ve got the operating environment within which it works, or is designed to work. And we have the thing that it’s supposed to produce; its function or its application. Why did we buy it, or make it, in the first place? What’s it supposed to do? What benefits is it supposed to bring humankind? What does it mean in the context of the big picture?

That’s what a system is. I’m not going to elaborate on systems theory or anything like that. That’s a whole big subject on its own. But we’re talking about something complex. We’re not talking about a toaster. It’s not consumer goods. It’s something complicated that operates in the real world. And as I say, we need to understand those three things – system, environment, purpose – to work out Safety.

We Need A Process

We’ve sorted our context. How is all this going to happen? We need a process. In the standard that we’re going to look at in the next module, we have an eight-element process. As you can see there, we start with documenting our approach. Then we identify and document hazards. According to the standard, we document everything, so take that as read from here on.

We assess risk. We plan how we’re going to mitigate the risk. We identify risk mitigation measures, or controls as they are often known. Then we apply those controls to reduce risk. We verify and confirm the risk reduction that we have achieved, or that we believe we will achieve. And then we’ve got to get somebody to accept that risk. In other words, to say that it is an acceptable level of risk. That we can put up with this level of risk in exchange for the benefits that the system is going to give us. Finally, we need to manage risk through the entire lifecycle of the system until we finally get rid of it.

The key point about this is whatever process we follow, we need to approach it with rigor. We stick to a systematic process. We take a structured and rigorous approach to looking at our system.

And as you can see there from the arrows, every step in the eight-element sequence flows into the next step. Each step supports and enables the following steps. We document the results as we go. However, even this example is a little bit too simple.
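The once-through sequence just described can be sketched as an ordered list in Python. The element names below are paraphrased from this transcript rather than quoted from the standard, and next_element is a hypothetical helper added only to show that each step hands over to exactly one successor.

```python
# Element names paraphrased from the transcript, not quoted from the standard.
ELEMENTS = [
    "Document the approach",
    "Identify and document hazards",
    "Assess risk",
    "Identify risk mitigation measures",
    "Reduce risk",
    "Verify risk reduction",
    "Accept risk",
    "Manage risk through the life cycle",
]


def next_element(current):
    """Return the element that follows `current`, or None after the last one."""
    index = ELEMENTS.index(current)
    return ELEMENTS[index + 1] if index + 1 < len(ELEMENTS) else None


# Walk the sequence once, from the first element to the last.
step = ELEMENTS[0]
while step is not None:
    print(step)
    step = next_element(step)
```

A plain list is enough here because the simple model has no branches or loops; that is exactly the limitation the transcript goes on to address.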

A More Realistic Process

So, let’s get a more realistic process. What we’ve got here are the same things we’ve had before. We’ve established the context at the beginning. Next, there’s risk assessment. Risk assessment consists of risk identification, risk analysis, and risk evaluation. It asks ‘Where are we?’ in relation to a yardstick or framework that categorizes risk. The category determines whether a risk is acceptable or not.

After determining whether the risk is acceptable or not, we may need to apply some risk treatment. Risk Treatment will reduce the risk further. By then we should have the risk down to an acceptable level.

So, that’s the straight-through process, once through. In the real world, we may have to go around this path several times. Having treated the risk over a period of time, we need to monitor and review it. We need to make sure that the risk turns out, in reality, to be what we estimated it to be. Or at least no worse. If it turns out to be better, well, that’s great!

And on that monitoring and review cycle, maybe we even need to go back because the context has changed. These changes could include using the system to do something it was not designed to do. Or modifying the system to operate in a wider variety of environments. Whatever it might be, the context has changed. So, we need to look again at the risk assessment and go round that loop again.

And while we’re doing all that, we need to communicate with other people. These other people include end-users, stakeholders, other people who have safety responsibilities. We need to communicate with the people who we have to work with. And we have to consult people. We may have to consult workers. We may have to consult the public, people that we put at risk, other duty holders who hold a duty to manage risk. That’s our cycle. That’s more realistic. In my experience as a safety engineer, this is much more realistic. A once-through process often doesn’t cut it.
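The difference between a once-through process and this cycle can be sketched in Python. The numeric risk scale, the function names, and the figures are all assumptions made for illustration; in practice risk is rarely a single number and treatments do not simply subtract from it.

```python
def run_risk_cycle(estimated_risk, acceptable_risk, treatments):
    """Apply treatments one at a time until the risk is acceptable.

    estimated_risk, acceptable_risk: numbers on a hypothetical scale
    (higher is worse). treatments: risk reductions to try, in order.
    Returns (final_risk, passes_round_the_loop).
    """
    passes = 1  # first pass: assess the risk as it stands
    risk = estimated_risk
    for reduction in treatments:
        if risk <= acceptable_risk:  # risk evaluation
            break
        risk -= reduction            # risk treatment
        passes += 1                  # assess again after treating
    return risk, passes


def needs_reassessment(estimated_risk, observed_risk, context_changed):
    """Monitoring and review: go round again if the context has changed
    or the risk turns out worse than we estimated."""
    return context_changed or observed_risk > estimated_risk


final_risk, passes = run_risk_cycle(10, 4, [3, 3, 3])
print(final_risk, passes)
print(needs_reassessment(estimated_risk=4, observed_risk=4, context_changed=True))
```

In this made-up case the third treatment is never needed, because the risk already meets the target after the second one.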

Required Risk Reduction

We’re doing all this to drive risk down to an acceptable level. What do we mean by that? There are several different ways that we can do this, and I’ve got two illustrated here. On the left-hand side of the slide, we have what’s usually known as the ALARP triangle. It’s this thing that looks a bit like a carrot where the width of the triangle indicates the amount of risk. So, at the top of the triangle, we’ve got lots of risk. And if you’re in the UK or Australia, where I live, this is the way it’s done. So there will be some level of risk that is intolerable. Then if the risk isn’t intolerable, we can only tolerate it or accept it if it is ALARP or SFARP. ALARP means that we’ve reduced the risk as low as reasonably practicable. And SFARP means so far as is reasonably practicable. Essentially, they’re the same thing – reasonably practicable.

We must ensure that we have applied all reasonably practicable risk reduction measures. And once we’ve done so, if we’re in this tolerable or acceptable region, then we can live with the risk. The law allows us to do that.
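The logic of the triangle can be sketched as a small Python decision function. The numeric scale, the threshold, and the name risk_decision are assumptions for illustration only; deciding what is ‘reasonably practicable’ is a matter of judgment and law, not a boolean flag.

```python
def risk_decision(risk, intolerable_above, all_practicable_measures_applied):
    """Classify a risk under an ALARP/SFARP-style regime.

    risk and intolerable_above share a hypothetical numeric scale
    (higher is worse).
    """
    if risk > intolerable_above:
        return "intolerable"
    if all_practicable_measures_applied:
        return "tolerable (ALARP/SFARP)"
    return "reduce further"


print(risk_decision(9, intolerable_above=8, all_practicable_measures_applied=True))
print(risk_decision(5, intolerable_above=8, all_practicable_measures_applied=False))
print(risk_decision(5, intolerable_above=8, all_practicable_measures_applied=True))
```

Note the order of the checks: an intolerable risk stays intolerable however many measures have been applied.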

That’s how it’s done in the UK and Australia. But in other jurisdictions, like the USA, you might need to use a different approach. A risk matrix approach as we can see on the right-hand side of this slide. This particular risk matrix is from the standard we’re about to look at. And we could take that and say, ‘We’ve determined what the risk is. There is no absolute limit on how much risk we can accept. But the higher the risk, the more senior level of sign-off from management we need’. In effect, you are prioritizing the risk. So you only bring the worst risks to the attention of senior management. You are asking  ‘Will you accept this? Or are you prepared to spend the money? Or will you restrict the operational system to reduce the risk?’. This is good because it makes people with authority consider risks. They are responsible and need to make meaningful decisions.
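The matrix approach can be sketched as a lookup table in Python. To be clear, this 3x3 matrix, its category names, and the sign-off roles are all invented for illustration; it is not the matrix from the standard, which should be consulted directly.

```python
# Hypothetical 3x3 risk matrix: NOT the matrix from the standard.
# Outer key is severity, inner key is likelihood, value is the risk level.
RISK_MATRIX = {
    "minor":        {"unlikely": "low",    "possible": "low",    "likely": "medium"},
    "major":        {"unlikely": "low",    "possible": "medium", "likely": "high"},
    "catastrophic": {"unlikely": "medium", "possible": "high",   "likely": "high"},
}

# Hypothetical roles: the higher the risk, the more senior the sign-off.
SIGN_OFF = {
    "low": "safety lead",
    "medium": "program manager",
    "high": "senior executive",
}


def required_sign_off(severity, likelihood):
    """Look up the risk level and who must accept a risk at that level."""
    level = RISK_MATRIX[severity][likelihood]
    return level, SIGN_OFF[level]


print(required_sign_off("catastrophic", "possible"))
print(required_sign_off("minor", "possible"))
```

The second table is what does the prioritizing: only risks that land in the top level reach the most senior person.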

In short, different approaches are legal in different jurisdictions.

Summary of Module

In Module Two, we’ve asked ourselves, ‘How can we deal with real-world complexity?’. And one way that’s been developed to do that is System Safety. System Safety is where we take a systematic approach to safety. This approach applies to both the system itself – the product – and the process of System Safety.

We address product and process. We need that rigorous process to give us confidence that what we’ve done is good enough. We have a realistic, useful and powerful process that enables us to put things in context. It helps us to communicate with everyone we need to, and to consult with those that we have a duty to consult with. We also wrap monitoring and review around the basic risk process. And of course, we analyze risk to reduce it to acceptable levels. So we’ve got to treat the risk or reduce it or control it in some way to get it to those acceptable levels. In the end, it’s all about getting that required risk reduction to work. That reduction makes the risk acceptable to expose human beings to, for the benefit that it will give us.

This is Module 2 of SSRAP

This is Module 2 from the System Safety Risk Assessment Program (SSRAP) Course. Risk Analysis Programs – Design a System Safety Program for any system in any application. You can access the full course here.

You can find more introductory lessons at Start Here.

Risk Analysis Programs

Risk Analysis Programs – Design a System Safety Program for any system in any application.

Introduction to the System Safety Risk Analysis Programs Course.

Risk Analysis Programs: Learning Objectives

At the end of this course, you will be able to:

  • Describe fundamental risk concepts;
  • Explain what a system safety approach is and does;
  • Define what a risk analysis program is;
  • List the hazard analysis tasks that make up a program;
  • Select tasks to meet your needs;
  • Design a tailored analysis program for any application; and
  • Know how to get more information and resources.
Risk Analysis Programs: Transcript

Hello and welcome to this course on Systems Safety Risk Analysis Programs. I’m Simon Di Nucci, The Safety Artisan, and I’ve been a safety engineer and consultant for over 20 years. And I have worked on a wide range of safety programs doing risk analysis on all kinds of things. Ships, planes, trains, air traffic management systems, software systems, you name it. I’ve worked in the U.K., in Australia, and on many systems from the US. I have also spent hundreds of hours training hundreds of people on safety. And now I’ve got the opportunity to share some of that knowledge with you online.

So, what are the benefits of this course?

First of all, you will learn about basic concepts. About system safety, what it is and what it does. You will know how to apply a risk analysis program to a very complex system and how to manage that complexity. So, that’s what you’ll know.

At the end of the course, you will also be able to do things that you might not have been able to do before. You will be able to take the elements of a risk analysis program and the different tasks. You’ll be able to select the right tasks and form a program to suit your application, whatever it might be. Whether you might have a full, high-risk bespoke development system, or you’re taking a commercial system off the shelf and doing something new with it. You might be taking a product and using it in a new application or a new location. Whatever it might be, you will learn how to tailor your risk analysis program. This program will give you the analyses you need, and help you to meet your legal and regulatory requirements. Once you’ve learned how to do this, you can apply it to almost any system.

Finally, you will feel confident doing this. I will be interpreting the terminology used in the tasks and applying my experience. So, instead of reading the standard and being unsure of your interpretation, you can be sure of what you need to do. Also, I will show you how you can get good results and avoid some of the pitfalls.

So, these are the three benefits of the program.

1) You will know what to do.

2) You will be able to perform risk program tasks, and …

3) You’ll feel confident doing those tasks.

At the end of the course, I will also show you where to find further resources. There are free resources to choose from. But there are also paid resources for those who want to take their studies to the next level. I hope you enjoy the course.

Get the supporting safety analysis courses here.

Categories
Work Health and Safety

Risk Management Code of Practice

In this 40-minute session, we look at the Risk Management Code of Practice (CoP). We cover: who has WHS duties; the four-step process; keeping records, appendices & a summary of detailed requirements; and further commentary. This CoP is one of the two that are generally applicable.

The Risk Management Code of Practice (Demo of the full, 40-minute, video).

Risk Management Code of Practice: Topics

Risk Management Code of Practice (CoP):

  • Who has WHS duties;
  • The four-step process;
  • Keeping records, appendices & summary of detailed requirements;
  • Further commentary; and
  • Where to get more information.

Risk Management Code of Practice: Transcript

Hello, everyone, and welcome to the Safety Artisan. I’m Simon, your host, and today we’re going to be talking about the Risk Management Code of Practice.

It’s a code of practice that I’ve used myself. I’ve used it to guide my work and to guide other people to help them in their work. I’ve used it to simplify the whole practice of what we do, because once you know what you’re supposed to do, you can just do it; you don’t have to worry about working out what you need to do. And conversely, it gives you everything you need to do, so you can do more if you want to, but you don’t have to. So, it makes life a lot easier and simpler. And then finally, you can use it to justify what you’ve done: that it is correct, complete, and enough. So, it’s very useful, and that’s why I’m teaching it: because it makes life easier.

And I’m going to explain how to use it. You’ll still need to go away and read the Code of Practice, as you’ll see, to get all the details, but I’m going to go through the leading particulars and explain how to use it. And then finally, at the end of the session, I’m going to show you where you can get more help on this topic and indeed other related topics, because this Code of Practice is one of several. This Risk Management Code of Practice is one that you really can’t do without. There is one other that you must refer to, and then the rest are optional, depending on whether you’re working in their respective areas. Anyway, let’s get on with it.

Code of Practice: Risk Management

So we’re talking about the Risk Management Code of Practice, which is under Australian Work Health and Safety Law. Now, if you’re not operating in Australia, this is not a requirement for you but nevertheless, it does contain some very useful guidance. And I’ve seen similar requirements in the US and in the UK, and I suspect all across the English-speaking world.

Topics for this Session

So, what we’re going to cover today. First of all, who has WHS duties because it’s a wider group of people than you might think it is. There’s the four-step process for actually doing risk management. And then I think we’ve got a slide each on keeping records, the appendices in the Code of Practice, and a summary of the detailed requirements in the Code of Practice. Then I’ve provided some further commentary and, as I’ve said before, where to get more information.

Who has WHS Duties?

So, first of all, who has WHS duties? Well, it’s kind of everybody. First of all, if you are a person conducting a business or undertaking, or a PCBU for short, then you have duties. And it says business or undertaking, so it includes voluntary groups, non-profit, government, military, you name it. It doesn’t have to be a commercial business. Then you have duties if you are a designer, manufacturer, importer, or supplier, or if you install, test, or commission plant, substances, or structures. So again, a wide range of people.

And it’s not just about managing safety in a workplace. There are lots of duties on duty holders with upstream safety duties, like designers and manufacturers. Then finally, officers have additional duties. An officer is basically someone like a director of a company, that sort of level: senior management with control over resources. They have to exercise due diligence, so there’s a bunch of requirements on them as well. And then, of course, there are the workers and any visitors. They’ve got to cooperate and take reasonable care of themselves and look out for each other, which is all very important.

And as it says, and this is a quote from the CoP, “A person can have more than one duty at the same time, and more than one person can share the same duty”. So, you can’t go playing tag, as it were. A sort of responsibility tag. ‘It wasn’t me. It was him, Governor!’ The court ultimately decides who is responsible.

A Four-Step Process

So, in our four-step process, first of all, we have to identify hazards. Step two, we have to assess the risks, so we need to look at causes and consequences. And the CoP doesn’t say this, but exposure comes into it as well. So, a risk might be present, but if nobody is exposed to that risk, then it can’t hurt them. So, that’s an important point to remember. And controlling exposure is important to one degree or another in almost all areas, but very important in certain industries. Those are the industries that have got the real estate to be able to separate the risky thing from the human, and this is very useful. So step three, we have to control risks. And then step four, we have to review control measures because it’s recognized that these control measures will be in place for some time, for the lifetime of whatever it is we’re doing or undertaking. So, they need to be periodically reviewed and there’s guidance on that.

Now, I keep saying guidance – take a look at the introduction to Codes of Practice and you will see why Codes of Practice are a bit more than guidance. They are guidance that you cannot afford to ignore because if things go wrong, you will get hung out to dry based on what CoP said you should have done. So, if you are ignorant of what CoP said and haven’t done it, then you’re stuffed basically before you even start. That’s point one to note.

And secondly, you’ll notice in the diagram on the left, we’ve got management commitment at the centre and we’ve got consultation all the way around. And there’s another Code of Practice, the Code of Practice on Consultation, Cooperation and Coordination: the CC&C CoP. That is the other CoP that is essential. So, you must have a look at this one and the CC&C CoP because, in effect, they apply to everything. Let’s move on.

Step 1, Identify Hazards

So, first of all, we need to identify hazards. Now, the CoP is written for any Australian business or undertaking, so it’s pretty basic. It’s pragmatic, but it’s basic, and it’s got a workplace focus. So, it says inspect the workplace, look around, talk to your workers. Now, in my day job I work for a consultancy where, generally speaking, we are not looking at an existing workplace; we’re helping a customer buy or assure a complex product that’s going to come into service at some time in the future. So, there are no current workers to consult, but we always try to include end-user representatives in our safety workshops. So, you may not be able to consult workers directly, but you should try and include people who have relevant work experience.

Secondly, the CoP tells us to use good work design and safe design. Now that’s a whole topic in itself and I’ve got some guidance on safe design. If you go to the safetyartisan.com page on safe design (www.safetyartisan.com/welcome/safe-design), you will see it, and I’ll take you through the subject and refer you on to the source material itself.

Thirdly, we need to consult supply chains and networks. I think that works two ways. First of all, when you get people to supply you stuff, make sure that they supply the data that you need. The safety data, all the information that you need to take and use the product safely. And that’s part of the duty on all of these duty holders, on the designer, the manufacturer, the importer, the supplier. They all have duties to pass on the relevant safety information but make sure you ask for it in your contract. And secondly, suppliers, particularly if you’re buying an expensive piece of kit off them, suppliers can be an excellent source of information. If they’re the designers, then they know this kit better than anybody else. Make use of their expertise, contract them to do some work for you and take part of the load off you. They are best placed to do some of the work, so get them to do it.

And then fourthly, it says review available information. Now, this is very important. There’s historical information or there should be – it’s not always easy to come by sometimes. Do make the effort to get actual historical information for your piece of kit, maybe from the supplier. Or if you can’t do that, if it’s a new piece of kit, then try and get information on similar equipment, or services, or functionality, or go to a trade organization, or go to the regulator depending on what domain you’re in. Do look around for historical information. It is out there. It can be hard to find, but it is worth the effort because, again, the guidance requires it. So, if you don’t do it, if you don’t bother or you’ve not made reasonable efforts to do so, you’ll get clobbered if things go wrong.

And then it’s also advisable to complement that historical information with diverse approaches. One of them is a hazard checklist approach, and we talk about that in the session on preliminary hazard identification. There are lots of checklists freely available out there on the Internet. Some are general and some are more specific to different pieces of kit or different domains. Try and find the most relevant one for you and use it. And then maybe there are specific safety analysis techniques that you can use as well, so have a go at those. A lot of them are quite simple, so don’t be put off. You don’t necessarily have to get an expensive consultant in to do this for you. A lot of these techniques are really quite simple and just require a bit of imagination and a little bit of self-discipline in the way you go about it. And I talk about analysis methods for hazard identification in that same session on Preliminary Hazard Identification (PHI).

So, that’s identifying hazards.

Step 2, Assess Risks

Step two, we need to assess the risks. So, if we recall, risk is a combination of likelihood and severity. So, how likely is it that the harm could arise? And how severe is that harm? The way to do that, the CoP says, is to work out how hazards may cause harm. And as always, don’t be afraid to ask the dumb questions. That’s part of my job as a consultant. You’re allowed to turn up and ask dumb questions. Or maybe sensitive questions that nobody in the firm dares to ask because they think they’ll get fired. So, be brave and do try and work out how to ask the questions in a non-threatening way, but do ask the questions.

Work out how severe the harm could be. What is the worst credible consequence? And also, to keep it simple, what’s the worst direct consequence? Yes, you can come up with a fanciful chain of events that will lead to ‘it’s the end of the world as we know it’, but keep it direct would be my advice. At least to start with. It’s better to get a range of stuff than to work one scenario to the nth degree, I would suggest.

Then work out the likelihood of that harm occurring. Very often the most severe harm can only occur when there is a particular combination of circumstances. And if you read any kind of accident report, even in the press, you’ll very often see that this was happening, and it just so happened on this particular day that somebody wasn’t available to supervise, and then this went wrong and something else went wrong. And then the final result of this chain of events was that somebody got hurt. So, do factor in all of those things.
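To make the idea of combining severity and likelihood concrete, here is a minimal sketch in Python. The CoP does not prescribe these category names, scores, or thresholds; they are all assumptions chosen for illustration, and you would substitute whatever scheme your organisation or standard uses.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical severity categories; substitute your own scheme.
    MINOR = 1
    SERIOUS = 2
    MAJOR = 3
    FATAL = 4


class Likelihood(IntEnum):
    # Hypothetical likelihood categories; substitute your own scheme.
    RARE = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4


def risk_rating(severity: Severity, likelihood: Likelihood) -> str:
    """Combine severity and likelihood into a coarse risk rating.

    The score is a simple product and the thresholds are illustrative
    assumptions, not values taken from the Code of Practice.
    """
    score = int(severity) * int(likelihood)
    if score >= 12:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Rate the worst credible, direct consequence of one hazard.
    print(risk_rating(Severity.MAJOR, Likelihood.POSSIBLE))
```

A simple product like this is only a starting point. Many schemes use a lookup table instead, so that the most severe outcomes are never rated low on likelihood alone.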

There are probably lots of existing controls already unless you’re doing something very novel indeed, which is unusual. So, do look at what’s there and record it all. Conversely, do beware of the ‘it will never happen’ brigade. I’ve met several people who say, ‘Oh, that will never happen’, or was it, ‘No British pilot would be stupid enough to do that. Ho, ho, ho.’ I was foolish enough to believe that. Anyway, that’s another story. So, don’t believe the people who say, ‘It can never happen’. I say, ‘OK, what’s the justification? Why can it never happen? Where’s the evidence for that claim?’ So, do dig into those responses.

There’s more detail in the Code of Practice. There are some good questions to ask in the workplace. And with a bit of imagination, you can take your imaginary piece of kit and sort of think about it in the workplace and go, ‘Well, let’s think up a suitable question.’ So, there’s good guidance in there. Historical data can’t be beat as a reality check, and it shuts up the naysayers as well, because you can pull out information and say, ‘Well, this accident has happened and it’s happened lots of times to lots of good people who thought they were clever’. So, do work hard to get the historical data. It’s fantastic if you can get it.

And then, as I said before, there are multiple specialist cause and consequence analysis techniques available. I talk about some of them in other posts that I’ve already done, and I will talk about more in the future. But you may not need that level of sophistication. It’s always better to do some good basic work as early as you can. Then maybe if you come up against something and say, ‘We’re not cracking this. We suspect there’s a problem, but we can’t be sure’, then think about bringing out the big guns. But if you’ve done the basic work first, that will really help you zero in on the areas where you think you need to do more work.

Step 3, Control Risks

The third one, controlling risks. Really, this is what it’s all about because you can do all the analysis you like, but you don’t do analysis for the sake of it. You do analysis in order to inform your selection of risk controls. And we are required to use a hierarchy of control measures, and that’s a legal requirement in Australia. It’s also a requirement in other jurisdictions and in many other safety standards that you’ll see; it just may not be called this. But they will talk about more and less effective controls.

At the top of the control hierarchy, we’ve got the most effective control, which is to eliminate the risk entirely. And by that, I mean you get rid of it. Let’s say you’re working in an explosive atmosphere and you’ve decided you don’t want any electrical devices in that explosive atmosphere. So, if you need to have power for machinery, you’re going to do it with pneumatics, let’s say, or hydraulics. So, you’ve eliminated the electrical risk. Elimination does not mean massaging the probability figures to get them very low and then claiming you have eliminated the risk. You have not. You’ve just played games with probability figures. So first off, that’s what elimination really means.

The second level, you’ve got three choices. We can substitute something hazardous with a safer alternative. I’ve mentioned getting rid of electricity entirely. You could say, ‘Well, I’ve got hydraulics, but they can burst and cause damage so I’ll have something else’. Or let’s say there was a particular lubricant which is ideal, but actually it’s quite dangerous, so we’ll pick something safer. Maybe it doesn’t perform quite as well. Or a refrigerant, let’s say: an ideal refrigerant might be a potent greenhouse gas, so we go, ‘We’re going to have something else instead’.

You can isolate the hazard from people – I’ve spoken about that before. Some industries you’ve got a lot of real estate to play with. You can keep the hazard away from people. Or you can reduce the risk through engineering controls. And by engineering controls, I mean, you can build a safety feature or an interlock or something physically into the product. You’re not relying on a person to avoid the risk. It’s been done for them. It’s automatic or built-in.

At the third level, we can use admin controls. So we can give people procedures and rules and we can say, ‘Do this, don’t do that’. And most of the time they’ll probably do it and obey the rules, but sometimes they won’t. And sometimes for good reason, by the way, because people come up with ridiculous rules that can’t be obeyed or that make the task or the job so difficult that people break the rules all the time because that’s the only way to get the job done effectively. So, do beware of putting silly controls onto people because they won’t get obeyed. It’s your responsibility to consult the workers and come up with something practical.

And then finally, we can use personal protective equipment. Now that doesn’t do anything to the probability of the accident, but it reduces the severity. So, for example, if I’m wearing a hard hat and something falls on my head, it reduces the severity of the accident. If I’m wearing protective goggles and there’s a spark or a piece of debris flies out of the machine, it probably just bounces off and saves my eyes. So, there’s a couple of really good examples of where the PPE will help us. And of course, in this season of COVID, we’ve all gone PPE bonkers. It’s become headline news all over the world. So, we all now know what PPE is, which is great. Well, it’s not great. It’s terrible, but it’s good for knowledge.

So, we have to work through that hierarchy in that order. We have to start at the top with the most effective controls, see whether it’s feasible to eliminate the risk, and work our way down. We have to do that. And, as the subject of another lesson, we have to apply all reasonably practicable controls in order to say that we have eliminated or minimized risks SFARP: so far as is reasonably practicable. I’ll explain exactly what that means in a separate session.
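Working through the hierarchy in order can be sketched as a simple sort. This is only an illustration: the type names, the `CONSIDERATION_ORDER` numbers, and the example controls are my own assumptions that follow the order in which this lesson walks through the hierarchy, so check the CoP itself for the authoritative wording and levels.

```python
from dataclasses import dataclass
from enum import Enum


class ControlType(Enum):
    ELIMINATION = "elimination"
    SUBSTITUTION = "substitution"
    ISOLATION = "isolation"
    ENGINEERING = "engineering"
    ADMINISTRATIVE = "administrative"
    PPE = "ppe"


# Order of consideration as walked through in the lesson (1 = first).
# The numbers themselves are an illustrative assumption.
CONSIDERATION_ORDER = {
    ControlType.ELIMINATION: 1,
    ControlType.SUBSTITUTION: 2,
    ControlType.ISOLATION: 2,
    ControlType.ENGINEERING: 2,
    ControlType.ADMINISTRATIVE: 3,
    ControlType.PPE: 4,
}


@dataclass(frozen=True)
class Control:
    description: str
    control_type: ControlType


def order_by_hierarchy(controls: list[Control]) -> list[Control]:
    """Return candidate controls with the most effective types first."""
    return sorted(controls, key=lambda c: CONSIDERATION_ORDER[c.control_type])


if __name__ == "__main__":
    candidates = [
        Control("Hard hats for all staff", ControlType.PPE),
        Control("Permit-to-work procedure", ControlType.ADMINISTRATIVE),
        Control("Interlocked machine guard", ControlType.ENGINEERING),
        Control("Pneumatic power instead of electrical", ControlType.ELIMINATION),
    ]
    for control in order_by_hierarchy(candidates):
        print(control.control_type.value, "-", control.description)
```

Sorting only tells you what to consider first. Deciding which controls are reasonably practicable is still a judgement you have to make and record.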

Aside: Control Effectiveness

A quick aside: are controls effective? I’ve sort of hinted at this before about the admin stuff. How do we get effective controls? Well, the CoP says we need people to be accountable for health and safety. We need maintenance of plant and equipment. We need up-to-date training and competency for our people. We need up-to-date hazard information, and that’s a duty in its own right. And we need regular review and consultation. And you’ll find out about that in the CC&C CoP in my next lesson.

Now, these things are required everywhere, but they can be achieved informally. If you work in a high-risk industry, you’ll probably have a thing called a safety management system. And your safety management system will be documented in a safety management plan. And typically, the safety management system is the thing that delivers all five of these things and much more. So, that’s what you’ll probably end up doing.

First thing to say on that, of course, is that this information has got to be generated. You’ve got to get it from source and it’s usually the designer, the manufacturer, and the installer, and the testers who can provide this information. So, do make sure that you are imposing requirements on your suppliers, on your subcontractors to do this stuff and to provide you with the information. It is their duty to do so. It’s a legal duty, but you’re probably still going to have to pay for it and say when you want it and in what format that’s most useful to you and all the other good stuff.

Step 4, Reviewing Controls

Step four, which is maybe not so obvious. We’ve got some controls, we’re up and running, we need to review those controls. Well, why would we review them? First of all, if you’ve discovered that the control measure is not effective. So, you might have had some incident data, or you might’ve had some near misses. Or you might have some reliability data that says, ‘My control isn’t as reliable as I thought it was going to be’. But of course, to be aware of that, you’ve got to be collecting this information and you’ve got to be on the lookout for it.

So, you do need a workable incident reporting system and you do need to encourage people to use it and use it either anonymously or honestly. So, that’s where a good safety culture comes in, where you do not punish people for telling the truth, where you encourage and reward them for reporting stuff and making things better, and where you champion it. And that’s where management commitment comes in.

The other point where the guidance says you have to do it is if you’re making any kind of change that’s likely to alter or give rise to new risks and you suspect that the existing control measures may not be effective. So, you’re going to make some kind of change – you’ve got to review what you’re doing. But of course, how would the PCBU know that unless they’d actually sort of basically documented the baseline situation? So, you’ve got to have some kind of control over your workplace or over your product or functionality to know what your current situation is and to know that a change is coming. You’ve got to have some kind of baseline control and change control to be able to do that. As I say, it doesn’t have to be that complicated, you just control what goes on at the workplace.

You’ve got to do it if you’ve identified a new hazard or risk. Once you’ve identified something, you’ve got to kind of start from scratch. But that’s okay because hopefully, you’ve already got all of the background analysis that you’ve done. So, you know what you’ve done in the past and therefore you can spot what the delta is. I’m anticipating the record-keeping, but this is where good record keeping really helps you when it comes to managing change. Because if you’ve documented the baseline and understand it, change is relatively straightforward.

Another reason, maybe you’ve consulted with workers or health and safety representatives and you’ve discovered those consultations suggest that a review is necessary. Or maybe a health and safety representative requests a review. In that case, you need to do one.

So those are the five cases where you must conduct a review of controls in order to keep things safe. And very often that’s how accidents occur. We start pretty well and then over a period of time, maybe years or decades, slowly our performance degrades over time or we get a bit blasé about stuff because we’ve never had a problem or so we think. If you’ve got poor incident and near-miss reporting, you won’t be aware of the problems that are happening. So, things slide over time so maybe it’s a good idea to have a periodic review even if you haven’t had any of these triggers. So, that’s a good idea as well. I don’t think it’s in the Code of Practice, but it’s sensible.
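The five review triggers, plus the optional periodic review, can be captured in a small checklist structure. This is a sketch under stated assumptions: the field names are hypothetical, and the 12-month default interval is an arbitrary illustration of the periodic review suggested above, not a figure from the Code of Practice.

```python
from dataclasses import dataclass


@dataclass
class ReviewTriggers:
    """The five review triggers described above (field names are hypothetical)."""
    control_not_effective: bool = False
    change_may_alter_risk: bool = False
    new_hazard_or_risk_identified: bool = False
    consultation_indicates_review: bool = False
    hsr_requests_review: bool = False

    def active(self) -> list[str]:
        """Return the names of the triggers that are currently set."""
        return [name for name, value in vars(self).items() if value]


def review_required(
    triggers: ReviewTriggers,
    months_since_last_review: int,
    periodic_interval_months: int = 12,  # assumed interval, not from the CoP
) -> bool:
    """Decide whether a review of control measures is due."""
    if triggers.active():
        return True
    # No trigger has fired, so fall back on the periodic review.
    return months_since_last_review >= periodic_interval_months


if __name__ == "__main__":
    triggers = ReviewTriggers(change_may_alter_risk=True)
    print(triggers.active(), review_required(triggers, months_since_last_review=2))
```

Note that the triggers are only as good as the information feeding them, which is why the incident reporting system and the documented baseline matter.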

Keeping Records

Those are the four steps. Now let’s talk about these three other things, the first of which is keeping records. As it says, keeping records demonstrates what you have done. So, if you have a problem and the regulator comes round to inspect you or maybe even consider shutting you down or issuing an improvement or prohibition notice, then the fact that you’ve got some documentation is going to help you. And it also helps you with downstream risk management activities, as I’ve just said.

Then also, there are some specific recordkeeping requirements for particular hazards. So, if you’re exposing people to noise or certain chemicals that may accumulate in the body, then you’re almost certainly going to have to have a monitoring program and a tracking program to keep an eye on this stuff and monitor people’s exposure. So, if you’ve got those particular hazards, then there are going to be some very specific requirements on you that you have to meet, and you must keep the records for the time periods required. In general, I would advise keeping the records for at least the life of the system, equipment, service, whatever it is, and then a few years afterwards. Just in case there’s an issue that emerges later on. Exactly what you do is up to you.

And from a pragmatic point of view, I would say from experience that precision and clarity in record-keeping are so important. Work hard on precision. It might sound like you’re being a bit anal about the way you record stuff, but if you feel you’re overdoing it, believe me, you are not. Make it simple. Make it crystal clear what you mean. Be as specific and precise as you can, and then your records will be a lot more use. I put my hand up and say I’ve written stuff down and then a couple of years or even a few months later, I’ve gone back to it and thought, ‘What did I mean by that?’ Ambiguity is very easy to achieve, so write some stuff down, then get somebody else to independently look at it for you and say, ‘What do you understand that to mean?’ Because English, unfortunately, is a very ambiguous language, very flexible.

Appendices

So, going back to the CoP, in particular, there are four appendices to the CoP. First of all, in Appendix A there’s a glossary of terms, which is very useful. In Appendix B, we’ve got some examples of a risk management process. In Appendix C, there’s some help and guidance on assessing how things can go wrong. And then in Appendix D, there is a sample blank risk register for you to use if you haven’t got anything else. And all of these examples and appendices are simple. They are workplace focused. As I say, if you work in a high-risk domain, maritime, aviation, you work with flammable chemicals or a big industrial plant, the CoP is not going to be sophisticated enough for your use. You’re going to have to meet and exceed it, but you’re probably going to be using a standard that requires far more than what the CoP asks for. And that’s okay.
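If you want to keep a simple risk register electronically, a sketch like the following may help. To be clear, the column names here are hypothetical and are not copied from Appendix D; check the CoP for the actual sample format and adapt the fields to match it.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class RiskRegisterEntry:
    # All column names are hypothetical; see Appendix D for the real format.
    hazard: str
    possible_harm: str
    existing_controls: str
    further_controls: str
    action_by: str
    recorded_on: date


def to_csv(entries: list[RiskRegisterEntry]) -> str:
    """Serialise register entries to CSV text with a header row."""
    buffer = io.StringIO()
    column_names = [f.name for f in fields(RiskRegisterEntry)]
    writer = csv.DictWriter(buffer, fieldnames=column_names)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


if __name__ == "__main__":
    entry = RiskRegisterEntry(
        hazard="Debris ejected from machine",
        possible_harm="Eye injury to operator",
        existing_controls="Fixed guard",
        further_controls="Protective goggles",
        action_by="Workshop supervisor",
        recorded_on=date(2021, 3, 1),
    )
    print(to_csv([entry]))
```

Keeping the register in a plain, dated format like this makes it easier to show what the baseline was when a change comes along.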

Detailed Requirements

But looking at it the other way around, the CoP is where everybody needs to start and there are some detailed requirements in each Code of Practice. And in this one, the words ‘must’, ‘requires’ or ‘mandatory’ tell you that there is a legal requirement that must be complied with. There are 35 ‘musts’, 39 ‘required’ of various kinds, and three instances of ‘mandatory’ in this Code of Practice. So, you’ve got to obey them.

Then there’s the word ‘should’, which indicates a recommended course of action and ‘may’ is an option. There are 43 ‘shoulds’ in this document and 82 ‘mays’. Again, my advice would be if it’s a ‘should’, I would do it unless you’ve got a reason not to. In which case you should probably write down why you’re not doing it. And that’s perfectly okay. If it isn’t going to work in your circumstances, or you don’t think it’s reasonable to do something, or you’ve got another way of doing it, which is better. Great. Do that, write it down.

And then the ‘mays’ are options so if you think they’re going to be useful and helpful, do it. If not, you don’t have to. There’re the different levels of compliance that you’ve got in the Code of Practice. And those three levels are in all the Codes of Practice.
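If you want to run the same kind of count on your own copy of a Code of Practice, a rough sketch follows. The grouping of terms into three levels follows the lesson, but the level names and the function are my own illustration, and a plain word count will over-count in places (for example, ‘May’ used as a month), so treat the totals as a guide only.

```python
import re
from collections import Counter

# Terms grouped by the three compliance levels described in the lesson.
# The level names are illustrative labels, not terms from the CoP.
COMPLIANCE_TERMS = {
    "mandatory": ("must", "requires", "required", "mandatory"),
    "recommended": ("should",),
    "optional": ("may",),
}


def count_compliance_terms(text: str) -> Counter:
    """Count whole-word occurrences of each compliance term, by level."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    totals = Counter()
    for level, terms in COMPLIANCE_TERMS.items():
        totals[level] = sum(words[term] for term in terms)
    return totals


if __name__ == "__main__":
    sample = (
        "You must identify hazards. You should consult workers. "
        "You may use a checklist. Records must be kept."
    )
    print(count_compliance_terms(sample))
```

The counts tell you where to look, not what to do. You still need to read each ‘must’ in context to understand the requirement.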

Commentary

So, I’ve gone through what’s in the Code of Practice. I’m just going to give you a brief résumé of what I think is good advice based on personal and practical experience. I’ve said it already, but a quick reminder: Codes of Practice provide minimum requirements. So, you do need to start with the CoP and, as the risk gets higher in whatever industry you’re in, you probably need to do more to manage that higher risk.

It does have a workplace focus, so it isn’t a lot of use if you’re a designer and you’re trying to work out, ‘What safety margins do I need? I need to do a design trade-off’. I know I’ve sort of leaked into the final point. The CoP won’t help you do that. You’ll need a more sophisticated approach, probably based on standards and tolerability. So, the CoP won’t help you with these sophisticated design decisions and trade-offs, and how much margin is enough. You’re probably going to have to go to standards and industry good practice for that.

And, really, what we’re now talking about is: are the risks SFARP? Have we done everything that’s reasonably practicable? So first of all, have we done enough? Look at the definition of reasonably practicable, which is in Section 18 of the WHS Act. And if you look at that definition, you’ll find that it is a risk assessment process. So, by following the risk management CoP, the risk assessment process, you will have inherently begun to address SFARP. And you need to do that to demonstrate that you have reduced risks SFARP. Then deciding how much is enough, well, that depends on the particular risk. A simple approach may suffice for most instances; for some risks you’re going to have to do some more sophisticated work, which will take you beyond the bounds of the CoP.

And then the last point I’m going to make is the Codes of Practice, not just this one but all of them will repay careful reading. There are some detailed requirements in there and they contain lots of good, sensible, pragmatic advice. And if you have to write a safety management plan or a hazard management plan, then do go to CoP and steal the wording. Don’t make stuff up when you don’t have to. If the CoP tells you what to do and that’s part of your solution just copy and paste it. Use it – you’re allowed to!

Do pay attention to the copyright, and do make sure you get the right version of the CoP for your jurisdiction. So, if it’s a federal workplace, you need the Commonwealth version of the CoP. If it’s commercial, then you probably need the state or territory version. So, go to the correct regulator’s website and find the right CoP. You will probably find that the copyright allows you to copy and paste absolutely everything out of the CoP. So, do that and save yourself some work. And also, if you’ve done that, it’s very easy to demonstrate that you’ve met the requirements of the CoP because you’ve copied them. What could be easier? Save yourself some hassle.

As a consultant, I never make up anything unless I can’t possibly avoid it. So, do use the stuff out there because CoP has been developed for you by a bunch of people in consultation. Lots of people have put a lot of hard work into coming up with a good CoP, which is authorised by the relevant government minister. So, use it, don’t ignore it. It’s there to help you.

Copyright & Attribution

Now, I’ve mentioned that you can dig this stuff out of the right website, and that’s exactly what I’ve done. So, any words that you see in italics, in speech marks, I have lifted from the Federal Register of Legislation and I’m allowed to do so under the terms of the Creative Commons license. And as part of the terms of that license, I’m required to tell you that I got this stuff on the 15th of August 2020. But you should always go to the www.legislation.gov.au website to check that you’re using the latest version. Don’t rely on what I’ve said; go and check you’re using the latest version. And for more information on what you can and can’t do with this Creative Commons license, I’ve got a page at the Safety Artisan that sets out what my obligations are and you’ll be able to see that I’ve met them.

For More…

And then for more information, if you’d like to get free video lessons on safety and free previews of paid content, do please go look at the Safety Artisan channel on YouTube and hit that subscribe button. Yes, please! And you will then be informed whenever a new video comes out, which I believe you will find very helpful. And then for all lessons and resources, you can go to www.safetyartisan.com. And as you can see, it’s a secure website, so you’re safe to browse there. Go and have a look at the stuff that’s on there. This lesson is there, as are many others.

End

So that’s the end of our lesson for today, and we’ve gone on for almost 40 minutes. That’s because there’s a lot of good stuff out there to talk about. So it just remains for me to say thanks very much for tuning in and bothering to listen to this. Thank you for supporting the Safety Artisan. Your subscription, your money, enables me to carry on doing this stuff, and I hope you and many others will find it helpful. So, thanks very much. Bye-bye.

End: Risk Management Code of Practice

You can find the Model Code of Practice here.  Back to the Topics Page.

System of Systems Hazard Analysis

In this full-length (38-minute) session, The Safety Artisan looks at System of Systems Hazard Analysis, or SoSHA, which is Task 209 in Mil-Std-882E. SoSHA analyses collections of systems, which are often put together to create a new capability, which is enabled by human brokering between the different systems. We explore the aim, description, and contracting requirements of this Task, and an extended example to illustrate SoSHA. (We refer to other lessons for special techniques for Human Factors analysis.)

This is the seven-minute demo version of the full 38-minute video.

System of Systems Hazard Analysis: Topics

  • System of Systems (SoS) HA Purpose;
  • Task Description (2 slides);
  • Documentation (2 slides);
  • Contracting (2 slides);
  • Example (7 slides); and
  • Summary.

Transcript: System of Systems Hazard Analysis

Introduction

Hello everyone and welcome to the Safety Artisan. I’m Simon and today we’re going to be talking about System of Systems Hazard Analysis – a bit of a mouthful that. What does it actually mean? Well, we shall see.

System of Systems Hazard Analysis

So, for Systems of Systems Hazard Analysis, we’re using task 209 as the description of what to do taken from a military standard, 882E. But to be honest, it doesn’t really matter whether you’re doing a military system or a civil system, whatever it might be – if you’ve got a system of systems, then this will help you to do it.

Topics for this Session

Looking at what we’ve got coming up.

So, we look at the purpose of system of systems – and by the way, if you’re wondering what that is, what I’m talking about is when we take different things that we’ve developed elsewhere, e.g. platforms, electronic systems, whatever it might be, and we put them together. Usually, with humans gluing the system together somewhere, it must be said, to make it all tick and fit together. Then we want this collection of systems to do something new, to give us some new capability, that we didn’t have before. So, that’s what I’m talking about when I say a system of systems. I’ll show you an example – it’s the best way. So, we’ve got a couple of slides on task description, a couple of slides on documentation, and a couple of slides on contracting. Task 209 is a very short task, and therefore I’ve decided to go through an example.

So, we’ve got seven slides of an example of a system of systems, safety case, and safety case report that I wrote. And hopefully, that will illustrate far better than just reading out the description. And that will also give us some issues that can emerge with systems of systems and I’ll summarize those at the end.

SOSHA Purpose

So, let’s get on. I’m going to call it the SOSHA for short; Systems of Systems Hazard Analysis. The purpose of the SOSHA, task 209, is to document or perform and document the analysis of the system of systems and identify unique system of systems hazards. So, things we don’t get from each system in isolation. This task is going to produce special requirements to deal with these hazards, which otherwise would not exist. Because until we put the things together and start using them for something new – We’ve not done this before.

Task Description (T209) #1

Task description: As in all of these tasks, the contractor shall perform and document an analysis of the system of systems to identify hazards and mitigation requirements. A big part of this, as I said earlier, is that we tend to use people to glue these collections, these portfolios, of systems together and humans are fantastic at doing that. It’s not always the ideal way of doing it, but sometimes it’s the only way of doing it within the constraints that we’ve got. The human is very important. The human will receive inputs from one or more systems and initiate outputs within the analysis and in fact within the real world, to be honest, which is what we’re trying to analyse. That’s probably a better way of looking at it.

And we’ve got to provide traceability of all those hazards to – it says – architecture locations, interfaces, data and stakeholders associated with the hazard. This is particularly important because with a system of systems each system tends to come with its own set of stakeholders, its own physical location, its own interfaces, etc, etc. The issue of managing all of those extraneous things and getting the traceability, it goes up. It is multiplied with every system you’ve got. In fact, I would say it was the square of. The example we’ll see: we’ve got three systems being put together in a system of systems and, in effect, we had nine times the amount of work in that area, I would say. I think that’s a reasonable approximation.
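As a rough illustration of that “square of” remark, here is a minimal sketch in Python. It assumes one reading of the remark that the transcript does not spell out: each system has to be traced against itself and against every other system, in both directions. The function name and the system labels are made up for the illustration.

```python
from itertools import product


def traceability_links(systems):
    """Return every ordered (source, target) pair of systems.

    Assumption for illustration: a pair of a system with itself stands for
    that system's own stakeholders, locations and interfaces; a pair of two
    different systems stands for one direction of interaction between them.
    """
    return list(product(systems, repeat=2))


systems = ["ship", "radar", "aircraft"]
links = traceability_links(systems)
print(len(links))  # 9 pairs for three systems
```

On this reading, a fourth system would take the count from nine to sixteen, which is why the traceability effort climbs so quickly.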

Task Description (T209) #2

Part two of the task description: The contractor will assess the risk of each hazard and recommend mitigation measures to eliminate the hazards. Or, where we can’t eliminate the hazards, which is very often the case, to reduce the associated risks. Then, as always with this standard, it says we’re going to use tables one, two and three, which are the severity, probability and the risk matrix that come with the standard. Unless, of course, we have created or tailored our own matrix. Which we very often should do but it isn’t often done – I’ll have to do a session on how to tailor a matrix.
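A risk matrix lookup can be sketched in a few lines of Python. The category names and cell values below are written from memory of the standard’s tables, so treat them as an assumption and check them against Mil-Std-882E itself, or against your own tailored matrix, before relying on them. The function name is invented for the illustration.

```python
# Severity categories, most severe first (assumed to match the standard's Table I).
SEVERITIES = ["Catastrophic", "Critical", "Marginal", "Negligible"]

# Risk level by probability level (row) and severity (column).
# Assumed to match the standard's Table III; verify before use.
RISK_MATRIX = {
    "Frequent":   ["High", "High", "Serious", "Medium"],
    "Probable":   ["High", "High", "Serious", "Medium"],
    "Occasional": ["High", "Serious", "Medium", "Low"],
    "Remote":     ["Serious", "Medium", "Medium", "Low"],
    "Improbable": ["Medium", "Medium", "Medium", "Low"],
}


def assess_risk(probability, severity):
    """Look up the risk level for one hazard."""
    if probability == "Eliminated":
        # An eliminated hazard carries no residual risk level.
        return "Eliminated"
    column = SEVERITIES.index(severity)
    return RISK_MATRIX[probability][column]


print(assess_risk("Occasional", "Critical"))
```

A tailored matrix would swap in different rows, columns or cell values, while the lookup itself stays the same.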

Then the contractor has got to verify and validate the effectiveness of those recommended mitigation measures. Now, that’s a really good point and I often see that missed. People come up with control measures or mitigation measures but don’t always assess how effective they’re going to be. Sometimes you can’t so we just have to be conservative but it’s not always done as well as it could be.

Documentation (T209) #1

So, let’s move on. Documentation: So, whoever does the analysis- the standard assumes it’s a contractor – shall document the results to include: you’ve got to describe the system of systems, the physical and functional characteristics of the system of systems, which is very important. Capturing these things is not a given. It’s not easy when you’ve got one system, but when you’ve got multiple systems, some of which are being misused to do something they’ve never done before, perhaps, then you’ve got to take extra care.

Then basically it says when you get more detail of the individual systems you need to supply that when it becomes available. Again, that’s important. And not only that: if the contractor supplies it, who’s going to check it? Who’s going to verify it? Etc., etc.

Documentation (T209) #2

Slide two on documentation: We’ve got to describe the hazard analysis methods and techniques used, providing a description of each method and technique used, and the assumptions and the data used in support. This is important because I’ve seen lots of times where you get a hazard analysis’ results and you only get the results. It’s impossible to verify those results or validate them to say whether they’ve been done in the correct context. And it’s impossible to say whether the results are complete or whether they’re up to date or even whether they were analysing the correct system because often systems come in different versions. So, how do you know that the version being analysed was the version you’re actually going to use? Without that description, you don’t know. So, it’s important to contract for these things.

And then hazard analysis results. What contents and formats do you want? It’s important to say. Also, we’re going to be looking to put the key items, the leading particulars, from the results into the hazard tracking system. The top-level results are going to go into the hazard tracking system, which is more commonly known as a hazard log or a risk register, whatever it might be. Might be an Excel spreadsheet, might be a very fancy database, but whatever it’s going to be you’re going to have to standardize your fields and what things mean. Otherwise, the data is going to be a mess, of poor quality and not very usable. So, again, you’ve got to contract for these things upfront and make sure you make clear definitions and say what you want.
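To make the point about standardized fields concrete, here is a minimal sketch of one hazard log record in Python. Every field name is hypothetical and is not taken from the standard. The idea is only that severity and probability come from a fixed list rather than free text, and that the analysed system version, method and assumptions travel with the result.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    CATASTROPHIC = 1
    CRITICAL = 2
    MARGINAL = 3
    NEGLIGIBLE = 4


class Probability(Enum):
    FREQUENT = "A"
    PROBABLE = "B"
    OCCASIONAL = "C"
    REMOTE = "D"
    IMPROBABLE = "E"
    ELIMINATED = "F"


@dataclass
class HazardRecord:
    """One row of a hazard log; all field names are illustrative."""
    hazard_id: str
    description: str
    system_version: str      # which version of the system was analysed
    severity: Severity
    probability: Probability
    analysis_method: str     # the technique that produced this result
    assumptions: list = field(default_factory=list)
    mitigations: list = field(default_factory=list)

    def __post_init__(self):
        # Reject free-text entries so every record uses the agreed definitions.
        if not isinstance(self.severity, Severity):
            raise TypeError("severity must be a Severity value")
        if not isinstance(self.probability, Probability):
            raise TypeError("probability must be a Probability value")


record = HazardRecord(
    hazard_id="H-001",
    description="Example hazard text",
    system_version="radar build 2",
    severity=Severity.CRITICAL,
    probability=Probability.REMOTE,
    analysis_method="SoSHA workshop",
)
print(record.severity.name, record.probability.name)
```

The same discipline applies whether the log lives in a spreadsheet or a database: agree the permitted values first, then hold every entry to them.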

Contracting #1

Contracting; implicitly, we’ve been talking about contracting already, but this is what the standard says. So, the request for proposal or statement of work has got to include the following. Typically we have an RFP before we’ve got a contract, so we need to have worked out what we need really early in the program or project, which isn’t always done very well. To work out what you need, the customer, the purchaser, has probably got to do some analysis of their own in order to work all this stuff out. And I know I say this every time with these tasks, but it is so important. You can’t just dump everything on the contractor and expect them to produce good results because often the contractor is hamstrung. If you haven’t done your homework to help them do their work, then you’re going to get poor results and it’s not their fault.

So, we’ve got to impose the requirement for the task if we want it or need it. We’ve got to identify the functional disciplines. So, which specialists are going to do this work? Because very often the safety team are generalists. They do not have specialist technical knowledge in some of these areas. Or maybe they are not human factors specialists. We need somebody in, some human factors specialists, some user representatives, people who understand how the system will be used in real life and what the real-world constraints are. We need those stakeholders involved – that’s very important. We’ve got to identify those architectures and systems which make up the SOS – very important. The concept of operations. SOS is very much about giving capability. So, it’s all about what are you going to do with the whole thing when you put it together? How’s all that going to work?

Contracting #2

Interesting one, E, which is unique, I think, to task 209, what are the locations of the different systems and how far apart are they? We might be dealing with systems where the distance between them is so great that transmission time becomes an issue for energy or communications. Let’s say you’re bouncing a signal from an aircraft or a drone around the world via a couple of satellites back to home base. There could be a significant lag in communications. So, we need to understand all of these things because they might give rise to hazards or reduce the effectiveness of controls.
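To get a feel for the size of that lag, here is a back-of-envelope sketch in Python. It assumes geostationary relay satellites, with each hop going straight up to the satellite and straight back down. Neither assumption comes from the transcript, and real paths are longer and add processing time, so this is a lower bound only.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786.0  # assumed: geostationary altitude above the equator


def min_one_way_delay_s(satellite_hops, altitude_km=GEO_ALTITUDE_KM):
    """Lower bound on one-way signal delay, in seconds.

    Each hop is counted as one trip up to the satellite and one trip down,
    with the satellite directly overhead and no processing delay.
    """
    path_km = 2 * altitude_km * satellite_hops
    return path_km / SPEED_OF_LIGHT_KM_S


delay = min_one_way_delay_s(satellite_hops=2)
print(f"{delay:.2f} s one way, {2 * delay:.2f} s round trip")
```

Even on these optimistic assumptions, two hops cost roughly half a second each way, which is enough to matter for anything a human is controlling in real time.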

Part F; what analysis, methods, techniques do you want to use? And any special data to be used? Again, with these collections of systems that becomes more difficult to specify and more important. And then do we have any specific hazard management requirements? For example, are we using standard definitions and risk matrix from a standard or have we got our own? That all needs to be communicated.

Example #1

So, that is the totality of the task. As you can see, there’s not much to Task 209, so I thought it would be much more helpful to use an example, an illustration, and as they used to say in children’s TV, “Here’s one I made earlier” because a few years ago I had to produce a safety case report. I was the safety case report writer, and there was a small team of us generating the evidence, doing the analysis for the safety case itself.

What we were asked to do is to assure the safety of a system and – in fact, it was two systems but I just treat it as one – of a system for guiding aircraft onto ships in bad weather. So, all of these things existed beforehand. The aircraft were already in service. The ships were already in service. Some of the systems were already in service, but we were putting them together in a new combination. So, we had to take into account human factors. That was very important. We’ll see why in just a moment.

The operating environment, which was quite demanding. So, the whole point is to get the aircraft safely back to the ships in bad weather. In good weather they could do it visually, but in bad weather, visual wasn’t going to cut it. So, the operating environment – we were being asked to operate in a much more difficult environment. So, that changed everything and drove everything.

We’ve got to consider operating procedures because, as we’re about to see, people are gluing the systems together. So, how do they make it work? And we’ve also got to think about maintenance and management. Although in actual fact, we didn’t really consider maintenance and management that much. As an ex-maintainer, this annoys me, but the truth is people are much more focused on getting their capability into service. Often, they think about support as an afterthought. We’ll talk about that one day.

Example #2

Here’s a little demonstration of our system of systems. Bottom right-hand corner, we’ve got the ship with lots of people on the ship. So, if the aircraft crashes into it that could be bad news, not just for the people in the aircraft, but for the people on the ship – big risks there!

We’ve got our radar mounted on the ship so the ship is supplying the radar with power and control and data, telling it where to point for example. Also, the ship might be inadvertently interfering with the radar. There are lots of other electronic systems on the ship. There are bits of the ship getting in the way of the radar, depending on where you’ve put it, and so on and so forth. So, the ship interacts with the radar, the radar interacts with the ship, and the radar’s producing radiation. Could that be doing anything to the ship systems?

And then the radar is being operated. Now, I think that symbol is meant to indicate a DJ, but we’ve got the DJ wearing headphones and we got a disk there but it looks like a radar scope to me. So, I’ve just hijacked that. That’s the radar operator who is going to talk to the pilot and give the pilot verbal commands to guide them safely back to the ship. So, that’s how the system works.

In an ideal world, the ship would use the radar and then talk electronically direct to the aircraft and guide it – maybe automatically? That would be a much more sensible setup. In fact, that’s often the way it’s done. But in this particular case, we had to produce a bit of a – I hesitate to call it a lash-up because it was a multi-million-dollar project, but it was a bit of a lash-up.

So, there are the human factors. We’ve got a radar operator doing quite a difficult job and a pilot doing a very difficult job trying to guide their aircraft back onto the ship in bad weather. How are they going to interact and perform? And then lastly, as I alluded to earlier, the aircraft and the ship do actually interact in a limited way. But of course, it’s a physical interaction, so you can actually hurt people and of course, if we get it wrong, the aircraft interacts with the surface of the ocean, which is very bad indeed for the aircraft. So, we’ve got to be careful there. So, there’s a little illustration of our system of systems.

Example #3

And – this is the top-level argument that we came up with – it’s in goal structuring notation. But don’t worry too much about that – We’ll have a session on how to do GSN another time.

So, our top goal, or claim if you like, is that our system of systems is adequately safe for the aircraft to locate and approach the ship. So, that’s a very basic, very simple statement, but of course, the devil is in the detail and all of that detail we call the context. So, surrounding that top goal or claim, we’ve got descriptions of the system, of the aircraft and the ship. We’ve got a definition of what we mean by adequately safe and we’ve got safety targets and reporting requirements.

So, what supports the top goal? We’ve got a strategy and after a lot of consultation and designing the safety argument, we came up with a strategy where we said, “We are going to show that all elements of the system of systems are safe and all the interactions are safe”. To do that, we had to come up with a scope and some assumptions to underpin that as well to simplify things. Again, they sit in the context, we just keep the essence of the argument down the middle.

And then underneath, we’ve got four subgoals. We aim to show that each system equipment is safe to operate, so it’s ready to be operated safely. Then each one is safe in operation so it can be operated safely with real people, etc. And then we’ve got all system safety requirements are satisfied for the whole collection of stuff and then finally that all interactions are safe. So, if we can argue all four of those, we should have covered everything. Now, I suspect if I did this again today, I might do it slightly differently. Maybe a little bit more elegantly, but that’s not the point. The point is, we came up with this and it worked.
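The shape of that argument can be sketched as a small tree in Python. This is not a GSN tool and it leaves out the strategy, context and assumption elements. It only shows the idea that the top goal holds when all four subgoals hold. The class, its fields and the accepted flags are invented for the illustration, and the goal wording is paraphrased from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A claim supported either by its subgoals or, for a leaf, by evidence."""
    claim: str
    evidence_accepted: bool = False
    subgoals: list = field(default_factory=list)

    def is_supported(self):
        # A goal with subgoals holds only if every one of its subgoals holds.
        if self.subgoals:
            return all(goal.is_supported() for goal in self.subgoals)
        return self.evidence_accepted


top_goal = Goal(
    claim="System of systems is adequately safe to locate and approach the ship",
    subgoals=[
        Goal("Each system is safe to operate", evidence_accepted=True),
        Goal("Each system is safe in operation", evidence_accepted=True),
        Goal("All system safety requirements are satisfied", evidence_accepted=True),
        Goal("All interactions are safe", evidence_accepted=False),
    ],
)
print(top_goal.is_supported())  # False until the last subgoal is evidenced
```

Flipping the last flag to True makes the top goal hold, which mirrors the claim that arguing all four subgoals covers everything.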

Example #4

So, I’m going to unpack each one very briefly, just to illustrate some points. First of all, each component system is safe to operate. Each of these systems, bar one, had all been purchased already, sometimes a long time ago. They all came with their own safety targets, their own risk matrices, etc, etc. So, we had to make sure that when an individual system said, “This is what we’ve got to achieve” that that was good enough for the overall system of systems. So, we had to make sure that each system met its own safety requirements and targets and that they were valid in context.

Now, you would think that double-checking existing systems would be a foregone conclusion. In reality, we discovered that the ship’s communication system and its combat data system were not as robust as assumed. Some practical issues were reported by stakeholders and we also discovered some flaws in previous analysis that had been accepted a long time ago. Now, in the end, those problems didn’t change the price of fish, as we say. It didn’t make a difference to the overall system of systems.

The frailty of the ship’s comms got sorted out and we discovered it didn’t actually matter about the combat system. So, we just assumed that the data coming out of the combat system was garbage and it didn’t make any difference. However, we did upset a few stakeholders along the way. So beware, people don’t like discovering that a system that they thought was “tickety-boo” was not as good as they thought.

Example #5

The second goal was to show that the system of systems is safe in operation. So, we looked at the actual performance. We looked at test results of the radar and then also we were very fortunate that trials of the radar on the ship with aircraft were carried out and we were able to look at those trials reports. And once again, it emerged that the system in the real world wasn’t operating quite as intended, or quite as people had assumed that it would. It wasn’t performing as well. So, that was an issue. I can’t say any more about that but these things happen.

Also, a big part of the project was we included the human element. So, as I’ve said before, we had pilots and we had radar air traffic talk-down operators. So, we brought in some human factors specialists. They captured the procedures and tasks that the pilots and the radar operators had to perform. They captured them with what’s called a Hierarchical Task Analysis, they did some analysis of the tasks and what could go wrong. Then they created a model of what the humans were doing and ran it through a simulation several thousand times. So in that way, they did some performance modelling.

Now, they couldn’t give us an absolute figure on workload or anything like that but what they could do – fortunately, our new system was replacing an older system which was even more informally cobbled together than the one that we were bringing in. And so, the human factors specialists were able to compare human performance in the old system vs. human performance with the new system. Very fortunately, we were pleased to find out that the predicted performance was far better with the new system. The new system was much easier to operate for both the pilots and the talk-down radar operators. So, that was terrific.
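The transcript does not describe the specialists’ model, so the following Python sketch is purely illustrative of the general idea: run a simple task model several thousand times and compare two designs relatively rather than absolutely. The task steps, the error probabilities and the function name are all made up.

```python
import random


def simulated_failure_rate(step_error_probs, runs, seed):
    """Fraction of simulated task runs in which at least one step goes wrong.

    Each entry in step_error_probs is an assumed, independent chance that
    the operator makes an error on that step of the task.
    """
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    failures = 0
    for _ in range(runs):
        if any(rng.random() < p for p in step_error_probs):
            failures += 1
    return failures / runs


# Invented figures: three task steps under the old and the new arrangement.
old_system_steps = [0.02, 0.05, 0.03]
new_system_steps = [0.01, 0.01, 0.02]

old_rate = simulated_failure_rate(old_system_steps, runs=5000, seed=1)
new_rate = simulated_failure_rate(new_system_steps, runs=5000, seed=1)
print(f"old: {old_rate:.3f}  new: {new_rate:.3f}")
```

Neither number means much on its own, because the inputs are guesses, but the difference between the two is still informative, and that is the kind of relative comparison described above.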

Example #6

So, the third one; All system of systems safety requirements are satisfied. Now, this is a bit more nebulous, this goal, but what it really came down to was when you put things together, very often you get what’s called emergent behaviour. As in things start to happen that you didn’t expect or you didn’t predict based on the individual pieces. It’s the saying, two plus two equals five. You get more out of a system – you get synergy, for good or ill, when you start putting different things together.

So, does the whole thing actually work? And broadly speaking, the answer was yes, it works very well. There were some issues. A good example: the old radar that they used to use to talk the planes down was a search radar, so the operator could see other traffic apart from the plane they were guiding in. Now, the operator being able to see other things is both good and bad, because on the one hand it gives them improved situational awareness so they can warn off traffic if a collision situation develops. But also, it’s bad because it’s a distraction for the operator. So, it could have gone either way.

So, the new radar was specialized. It focused only on the aircraft being talked down. So, the operator was blind to other traffic. So that was great in terms of decreasing operator workload and ultimately pilot workload as well. But would this increase the collision risk with other traffic? And I’ll talk about that in the summary briefly.

Example #7

And then our final goal is to show that all interactions are safe between the guidance system, the aircraft and the ship. This was a non-trivial exercise because ships have large numbers of electronic systems and there’s a very involved process to go through to check that a new piece of kit doesn’t interfere with anything else or vice versa.

And also, of course, does the radiation from the new electronic system, the new radar, affect the ship? Because you’ve got weapons on the ship and some of the explosive devices that those weapons use are electrically initiated. So, could the radiation set off an explosion? So, all of those things had to be checked. And that’s a very specialized area.

And then we’ve got, does the system interfere with the aircraft and the aircraft with the system? What about the integration of the ship and the aircraft and the aircraft to the ship? Yet another specialized area where there’s a particular way of doing things. And of course, the aircraft people want to protect the aircraft and the ship people want to protect the ship. So, getting those two to marry up is also another one of those non-trivial exercises I keep referring to but it all worked out in the end.

Summary

Points to note: When we’re doing system of systems – I’ve got five points here, you can probably work some more points out from what I’ve said for yourself – but we’re putting together disparate systems. They’re different systems. They’ve been procured by different organizations, possibly, to do different things. The stakeholders who bought them and care about them have got different aims and objectives. They’ve got different agendas to each other. So, getting everyone to play nicely in the zoo can be challenging. And even with somebody pulling it all together at the top to say “This has got to work. Get with the program, folks!” there’s still some friction.

Particularly, you end up with large numbers of stakeholders. For example, we would have regular safety meetings, but I don’t think we ever had two meetings in a row with exactly the same attendees because with a large group of people, people are always changing over and things move up. And that can be a challenge in itself. We need to include the human in the loop in systems of systems because typically that’s how we get them all to play together. We rely on human beings to do a lot of translation work, in effect. So, how do the systems cope?

A classic mistake really with systems design is to design a difficult-to-operate system and then just expect the operator to cope. That can be from things as seemingly trivial as amusement park rides – I did a lesson on learning lessons from an amusement park ride accident only a month or two ago and even there it was a very complex system for two operators, neither of whom had total authority over the system or to be honest, really had the full picture of what was going on. As a result, there were several dead bodies. So, how did the operators cope, and have we done enough to support them? That’s a big issue with a system of systems.

Thirdly, this is always true with safety analysis, but especially so with system of systems. The real-world performance is important. You can do all the analysis in the world making certain assumptions and the analysis can look fine, but in the real world, it’s not so simple. We have to do analysis that assumes the kit works as advertised because you’ve got nothing else to go on until you get the test results and you don’t get them until towards the end of the program. So, you’re going down a path, assuming that things work, that they do what they say on the tin, and perhaps you then discover they don’t do what they say on the tin. Or they don’t do everything they say on the tin. Or they do what they say and they do some other things that you weren’t expecting as well and then you’ve got to deal with those issues.

And then fourthly, somewhat related to what I’ve just talked about, but you put systems together in an informal way, perhaps, and then you discover how they actually get on – what really happens. In reality, once you get above a certain level of complexity, you’re not really going to discover all the emergent behaviours and consequences until you get things into service and it’s clocked up a bit of time in service under different conditions in the real world. In fact, that was the case with this and I think with a system of systems, you’ve just got to assume that it’s sufficiently complex that that is the case.

Now, that’s not an unsolvable problem but, of course, how do you contract for that? You’ve got your contractors wanting you to accept their kit and pay them at a certain date or a certain point in the program, but you’re not going to find out whether it all truly works until it’s got into service and been in service for a while. So, how do you incentivize the contractor to do a good job or indeed to correct defects in a timely manner? That’s quite a challenge for system of systems and it’s something that needs thinking about upfront.

And then finally, I’ve said, remember the bigger picture. It’s very easy when you’re doing analysis and you’ve made certain assumptions and you set the scope, it’s very easy to get fixated on that scope and on those assumptions and forget the real world is out there and is unpredictable. We had lots of examples of that on this program. We had the ship’s comms that didn’t always work properly, we couldn’t rely on the combat system, the radar in the real world didn’t operate as well as it said in the spec, etc, etc. There were lots of these things.

And, one example I mentioned was that with the new radar, the radar operator does not see any traffic other than the aircraft that is being guided in. So, there’s a loss of situational awareness there and there’s a risk, maybe an increased risk, of collision with other traffic. And that actually led to a disagreement in our team because some people had got quite fixated on the analysis and didn’t like the suggestion that maybe they’d missed something. Although it was never put in those terms, that’s the way they took it. So, we need to be careful of egos. We might think we’ve done a fantastic analysis and we’ve produced hundreds of pages of data and fault trees or whatever it might be but that doesn’t mean that our analysis has captured everything or that it’s completely captured what goes on in the real world because that’s very difficult to do with such a complex system of systems.

So, we need to be aware of the bigger picture, even if it’s only qualitatively. Somebody, a little voice, piping up somewhere saying, “What about this? Have we thought about that? I know we’re ignoring this because we’ve been told to but is that the right thing to do?” And sometimes it’s good to be reminded of those things and we need to remember the big picture.

Copyright Statement

Anyway, I’ve talked for long enough. It just remains for me to point out that all the text in quotations, in italics, is from the military standard, which is copyright free but this presentation is copyright of the Safety Artisan. As I’m recording this, it’s the 5th of September 2020.

For More …

And so if you want more, please do subscribe to the Safety Artisan channel on YouTube and you can see the link there, but just search for Safety Artisan in YouTube and you’ll find us. So, subscribe there to get free video lessons and also free previews of paid content. And then for all lessons, both paid and free, and other resources on safety topics please visit the Safety Artisan at www.safetyartisan.com/  where I hope you’ll find much more good stuff that you find helpful and enjoyable.

End: System of Systems Hazard Analysis

So, that is the end of the presentation and it just remains for me to say thanks very much for watching and listening. It’s been good to spend some time with you and I look forward to talking to you next time about environmental analysis, which is Task 210 in the military standard. That’ll be next month, but until then, goodbye.

Introduction to WHS Codes of Practice

In this 30-minute session, we introduce Australian WHS Codes of Practice (CoP). We cover: What they are and how to use them; their Limitations; we List (Federal) codes; provide Further commentary; and Where to get more information. This session is a useful prerequisite to all the other sessions on CoP.

Codes of Practice: Topics

  • What they are and how to use them;
  • Limitations;
  • List of CoP (Federal);
  • Further commentary; and
  • Where to get more information.

Codes of Practice: Transcript

Hello and welcome to the Safety Artisan, where you will find professional, pragmatic, and impartial teaching and resources on all things safety. I’m Simon and today is the 16th of August 2020. Welcome to the show.

Introduction

So, today we’re going to be talking about Codes of Practice. In fact, we’re going to be introducing Codes of Practice and the whole concept of what they are and what they do.

Topics for this Session

What we’re going to cover is what Codes of Practice are and how to use them – several slides on that; a brief word on their limitations; a list of federal codes of practice – and I’ll explain why I’m emphasizing it’s the list of federal ones; some further commentary and where to get more information. So, all useful stuff I hope.

CoP are Guidance

So, Codes of Practice come in the work, health and safety hierarchy below the act and regulations. So, at the top you’ve got the WHS Act, then you’ve got the WHS regulations, which the act calls up. And then you’ve got the Codes of Practice, which the act also calls up. We’ll see that in a moment. And what Codes of Practice do is provide practical guidance on how to achieve the standards of work, health and safety required under the WHS act and regulations, and some effective ways to identify and manage risks. So, they’re guidance but as we’ll see in a moment, they’re much more than guidance. So, as I said, the Codes of Practice are called up by the act and they’re approved and signed off by the relevant minister. So, they are a legislative instrument.

Now, a quick footnote. These words, by the way, are in the introduction to every Code of Practice. There’s a little note here that says we’re required to consider all risks associated with work, not just for those risks that have associated codes of practice. So, we can’t hide behind that. We’ve got to think about everything. There are codes of practice for several things, but not everything. Not by a long way.

…Guidance We Should Follow

Now, there are three reasons why Codes of Practice are a bit more than just guidance. So, first of all, they are admissible in court proceedings. Secondly, they are evidence of what is known about a hazard, risk, risk assessment, risk control. And thirdly, courts may rely, or regulators may rely, on Codes of Practice to determine what is reasonably practicable in the circumstances to which the code applies. So, what’s the significance of that?

So first of all, the issue about being admissible. If you’re unfortunate enough to go to court and be accused of failing under WHS law, then you will be able to appeal to a Code of Practice in your defence and say, “I complied with the Code of Practice”. They are admissible in court proceedings. However, beyond that, all bets are off. It’s the court that decides what is an admissible defence, and that means lawyers decide, not engineers. Now, given that you’re in court and the incident has already happened, a lot of the engineering stuff that we do about predicting the probability of things is no longer relevant. The accident has happened. Somebody has got hurt. All these probability arguments are dust in the wake of the accident. So, Codes of Practice are a reliable defence.

Secondly, the bit about evidence of what is known is significant, because when we’re talking about what is reasonably practicable, the definition of reasonably practicable in Section 18 of the WHS Act talks about what was known, or what should reasonably have been known, when people were anticipating the risk and managing it. Now, given that Codes of Practice were published back in 2012, there’s no excuse for not having read them. So, they’re pre-existing, they’re clearly relevant, the law has said that they’re admissible in court. We should have read them, and we should have acted upon them. And there’ll be no wriggling out of that. So, if we haven’t done something that a CoP guided us to do, we’re going to look very vulnerable in court. Or in whatever court of judgment we’re up against, whether it be public opinion or trial by media or whatever it is.

And thirdly, some CoP can be used to help determine what is SFARP. So in some circumstances, if you’re dealing with a risk that’s described in a CoP, and that CoP is applicable, then if you followed everything in the CoP, you might be able to claim that just doing that means that you’ve managed the risk SFARP. Why is that important? Because the only way we are legally allowed to expose people to risk is if we have eliminated or minimized that risk so far as is reasonably practicable, SFARP. That is the key test, the acid test, of “Have we met our risk management obligations?” And CoP are useful, maybe crucial, in two different ways for determining what is SFARP. So yes, they’re guidance but it’s guidance that we ignore at our peril.

Standards & Good Practice

So, moving on. Codes of Practice recognize, and I reemphasize this is in the introduction to every Code of Practice, that they’re not the only way of doing things. There isn’t a CoP for everything under the sun. So, codes recognize that you can achieve compliance with WHS obligations by using another method as long as it provides an equivalent or higher standard of work health and safety than the code. It’s important to recognize that Codes of Practice are basic. They apply to every business and undertaking in Australia potentially. So, if you’re doing something more sophisticated, then probably CoP on their own are not enough. They’re not good enough.

And in my day job as a consultant, that’s the kind of stuff we do. We do planes, trains and automobiles. We do ships and submarines. We do nuclear. We do infrastructure. We do all kinds of complex stuff for which there are standards and recognized good practice which go way beyond the requirements of basic Codes of Practice. And many I would say, probably most, technical and industry safety standards and practices are more demanding than Codes of Practice. So, if you’re following an industry or technical standard that says “Here’s a risk management process”, then it’s likely that that will be far more detailed than the requirements that are in Codes of Practice.

And just a little note to say that for those of us who love numbers and quantitative safety analysis, what this statement about equivalent or higher standards of health and safety is talking about is this: we want requirements that are more demanding and more rigorous or more detailed than CoP. Not that the end result in the predicted probability of something happening is better than what you would get with CoP, because nobody knows what you would get with CoP. That calculation hasn’t been done. So, don’t go down the rabbit hole of thinking “I’ve got to quantitatively demonstrate that what we’re doing is better than CoP.” You haven’t. It’s all about demonstrating the input requirements are more demanding rather than the output because that’s never been done for CoP. So, you’ve got no benchmark to measure against in output terms.

The primacy of WHS & Regulations

A quick point to note that Codes of Practice are only guidance. They do refer to the relevant WHS Act and regulations, the hard obligations, and we should not be relying solely on codes in place of what it says in the WHS Act or the regulations. So, we need to remember that codes are not a substitute for the Act or the regs. Rather they are a useful introduction. The WHS Act and regulations are actually surprisingly clear and easy to read. But even so, there are 600 regulations. There are hundreds of sections of the WHS Act. It’s a big read and not all of it is going to be relevant to every business, by a long way. So, if you see a CoP that clearly applies to something that you’re doing, start with the CoP. It will lead you into the relevant parts of the WHS Act and regulations. If you don’t know them, have a read around in there. You’ve been given the pointer in the CoP, so follow it up.

But also, CoP do represent a minimum level of knowledge that you should have. Again, start with CoP, don’t stop with them. So, go on a bit. Look at the authoritative information in the act and the regs and then see if there’s anything else that you need to do or need to consider. The CoP will get you started.

And then finally, it’s a reference for determining SFARP. You won’t see anything other than the definition of reasonably practicable in the Act. You won’t see any practical guidance in the Act or the regulations on how to achieve SFARP. Whereas CoP does give you a narrative that you can follow and understand and maybe even paraphrase if you need to in some safety documentation. So, they are useful for that. There’s also guidance on reasonably practicable, but we’ll come to that at the end.

Detailed Requirements

It’s worth mentioning that there are some detailed requirements in codes. Now, when I did this, I think I was looking at the risk management Code of Practice, which we’ll go through later in another session. But in this example, there are this many requirements. So, every CoP has the statement “The words ‘must’, ‘requires’, or ‘mandatory’ indicate a legal requirement exists that must be complied with.” So, if you see ‘must’, ‘requires’, or ‘mandatory’, you’ve got to do it. And in this example CoP that I was looking at, there are 35 ‘must’s, 39 ‘required’ or ‘requirement’ – that kind of wording – and three instances of ‘mandatory’. Now, bear in mind that the sentence that introduces those things contains two instances of ‘must’ and one of ‘requires’ and one of ‘mandatory’. So, straight away you can ignore those four instances. But clearly, there are lots of instances here of ‘must’ and ‘require’ and a couple of ‘mandatory’.

Then we’ve got the word ‘should’ is used in this code to indicate a recommended course of action, while ‘may’ is used to indicate an optional course of action. So, the way I would suggest interpreting that and this is just my personal opinion – I have never seen any good guidance on this. If it says ‘recommended’, then personally I would do it unless I can justify there’s a good reason for not doing it. And if it said ‘optional’, then I would consider it. But I might discard it if I felt it wasn’t helpful or I felt there was a better way to do it. So, that would be my personal interpretation of how to approach those words. So, ‘recommended’ – do it unless you can justify not doing it. ‘Optional’ – Consider it, but you don’t have to do it.

And in this particular one, we’ve got 43 instances of ‘should’ and 82 of ‘may’. So, there’s a lot of detailed information in each CoP to consider. So, read them carefully and comply with them where you have to, and that will repay you. So, a positive way to look at it: CoP are there to help you. They’re there to make life easy for you. Read them, follow them. The negative way to look at them is, “I don’t need to do all that it says in the CoP because it’s only guidance”. You can have that attitude if you want. If you’re in the dock or in the witness box in court, that’s not going to be a good look. Let’s move on.
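As a rough illustration of the word count described above, here is a minimal Python sketch. The function name, the pattern table, and the sample text are assumptions made for this example, and a simple word match cannot tell an obligation from an incidental use (for instance ‘May’ the month), so treat the counts as a prompt for careful reading and nothing more.

```python
import re

# Hypothetical search patterns for the signal words discussed above.
# "require*" groups 'requires', 'required' and 'requirement(s)' together.
TERM_PATTERNS = {
    "must": r"\bmust\b",
    "require*": r"\brequire(?:s|d|ments?)?\b",
    "mandatory": r"\bmandatory\b",
    "should": r"\bshould\b",
    "may": r"\bmay\b",
}


def count_signal_words(text: str) -> dict:
    """Count whole-word, case-insensitive matches of each signal word."""
    lowered = text.lower()
    return {
        label: len(re.findall(pattern, lowered))
        for label, pattern in TERM_PATTERNS.items()
    }


# Small sample: the introductory sentence quoted in the transcript,
# followed by two made-up sentences (not from any real Code of Practice).
sample = (
    "The words 'must', 'requires', or 'mandatory' indicate a legal "
    "requirement exists that must be complied with. "
    "You should review control measures. You may use a checklist."
)
counts = count_signal_words(sample)
print(counts)
```

Run against the full text of a CoP, the same function would give figures comparable to those quoted above, minus or plus whatever the introductory sentence itself contributes.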

Limitations of CoP

So, I’ve talked CoP up quite a lot; as you can tell, I’m a fan because I like anything that helps us do the job, but they do have limitations. I’ve said before that there’s a limited number of them and they’re pretty basic. It’s worth noting that there are two really generic Codes of Practice. First of all, there’s the one on risk management. And then secondly, there’s the one on consultation, cooperation and coordination. And I’ll be doing sessions on both of those. Now, those apply to pretty much everything we do in the safety world. So, it’s essential that you read them no matter what you’re doing and comply with them where you have to.

Then there are other Codes of Practice that apply to specific activities or hazards, and some of them are very, very specific, like getting rid of asbestos, or welding, or spray painting, or shot blasting, or whatever it might be. Those have clearly got a very narrow focus. So, you will know if you’re doing that stuff. So, if you are doing welding, then clearly you need to read the welding CoP. If welding isn’t part of your business or undertaking, you can forget it.

However, overall, there are fewer than 25 Codes of Practice. I can’t be more precise for reasons that we will come to in a moment. So, there’s a relatively small number of CoP and they don’t cover complex things. They’re not going to help you design a super-duper widget or some software or anything like that. They’re not going to help you do anything complicated. Also, Codes of Practice tend to focus on the workplace, which is understandable. They’re not much help when it comes to design trade-offs. They’re great for the sort of foundational stuff. Yes, we have to do all of this stuff regardless. When you get to questions of “How much is enough?”, or, as we sometimes say in safety, “How much margin do I need?”, “How many layers of protection do I need?”, “Have I done enough?”, CoP aren’t going to be a lot of use helping you with that kind of determination. But you do need to have made sure you’ve done everything in the CoP first and then start thinking about those trade-offs, would be my advice. You’re less likely to go wrong that way. So, start with your firm basis of what you have to do to comply and then think “What else could I do?”

List of CoP (Federal) #1

Now for information, you’ve got three slides here where we’ve got a list of the Codes of Practice that apply at the federal or Commonwealth level of government in Australia. So, at the top, highlighted, are the ones I’ve already mentioned: the ‘How to Manage WHS Risks’ code and the consultation, cooperation, and coordination code. Then we get into stuff like abrasive blasting, confined spaces, construction and demolition and excavation, first aid. So, quite a range of stuff covered.

List of CoP (Federal) #2

Hazardous manual tasks – so basically human beings carrying and moving stuff. Managing and controlling asbestos, and removing it. Then we’ve got a couple on hazardous chemicals on this page, electrical risks, managing noise, preventing hearing loss, and stevedoring. There you go. So, if you’re into stevedoring, then this CoP is for you. The highlighted ones we’re going to cover in later sessions.

List of CoP (Federal) #3

Then we’ve got managing risk of Plant in the workplace. There was going to be a Code of Practice for the design of Plant, but that never saw the light of day so we’ve only got guidance on that. We’ve got falls, and the work environment and facilities. We’ve got another one on hazardous chemicals, on safety data sheets; preventing falls in housing – I guess because that’s a very common accident – safe design of structures, spray painting and powder coating, and welding processes. So, that’s the list of – I think it’s 24 – Codes of Practice that are applied by Comcare, the federal regulator.

Commentary #1

Now, I’m being explicit about which regulator and which set of CoP, because they vary around Australia. Basically, the background was the model Codes of Practice were developed by Safe Work Australia, which is a national body. But those model Codes of Practice do not apply. Safe Work Australia is not a regulator. Codes of Practice are implemented or enforced by the federal government and by most states and territories. And it says with variations for a reason. Not all states and territories impose all codes of practice. For example, I live in South Australia and if you go and look at the WorkSafe South Australia website or Safe Work – whatever it’s called – you will see that there’s a couple of CoP that for some reason we don’t enforce in South Australia. Why? I do not know. But you do need to think about these things depending on where you’re operating.

It’s also worth saying that WHS is not implemented in every state in Australia. Western Australia currently have plans to implement WHS, but as of 2020 I don’t believe they’ve done so yet. Hopefully, it’s coming soon. And Victoria, for some unknown reason, have decided they’re just not going to play ball with everybody else. They’ve got no plans to implement WHS that I can find online. They’re still using their old OHS legislation. It’s not a universal picture in Australia, thanks to our rather silly version of government that we have here in Australia – forget I said that. So, if it’s a Commonwealth workplace, then we apply the federal version of WHS and Codes of Practice. Otherwise, we use state or territory versions and you need to see the local regulator’s web page to find out what is applied where. And the definition of a Commonwealth workplace is in the WHS Act, but also go and have a look at the Comcare website to see who Comcare police. Because there are some nationalised industries that count as a Commonwealth workplace and it can get a bit messy.

So, sometimes you may have to ask for advice from the regulator but go and see what they say. Don’t rely on what consultants say or what you’ve heard on the grapevine. Go and see what the regulator actually says and make sure it’s the right regulator for where you’re operating.
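The decision described above (Commonwealth workplace or not, then which state or territory) can be sketched as a small lookup. This is only an illustration of the reasoning in the transcript as it stood in 2020: the function name and the two-letter and three-letter jurisdiction codes are assumptions, and the regulator’s own website remains the authority on what applies where.

```python
# Jurisdictions that, according to the transcript, had not implemented
# WHS as of 2020. The short codes are an assumption for this sketch.
NOT_ON_WHS_AS_OF_2020 = {"WA", "VIC"}


def applicable_regime(is_commonwealth_workplace: bool, state_or_territory: str) -> str:
    """Return a short description of which WHS regime to look up first."""
    if is_commonwealth_workplace:
        # Commonwealth workplaces use the federal Act and CoP, regulated by Comcare.
        return "Commonwealth WHS Act and CoP (regulator: Comcare)"
    code = state_or_territory.strip().upper()
    if code in NOT_ON_WHS_AS_OF_2020:
        return f"{code}: WHS not implemented as of 2020; check the local regulator"
    return f"{code}: state or territory WHS and CoP, with variations; check the local regulator"


print(applicable_regime(True, "SA"))
print(applicable_regime(False, "sa"))
print(applicable_regime(False, "Vic"))
```

Every branch ends with the same advice as the text: confirm with the right regulator for where you are operating.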

Commentary #2

What’s to come? I’m going to do a session on the Risk Management Code of Practice, and I’m also, associated with that, going to do a session on the guidance on what is reasonably practicable. Now that’s guidance, it’s not a Code of Practice. But again, it’s been published so we need to be aware of it and it’s also very simple and very helpful. I would strongly recommend looking at that guidance if you’re struggling with SFARP and what it means; it’s very good. I’ll be talking about that soon. Also, I’m going to do a session on tolerability of risk, because you remember when I said CoP aren’t much good for helping you do trade-offs in design and that kind of thing. They’re really only good for simple stuff and compliance. Well, what you need to understand to deal with the more sophisticated problems is the concept of tolerability of risk. That’ll help us do those things. So, I’m going to do a session on that.

I’m also going to do a session on consultation, cooperation, and coordination, because, as I said before, that’s universally applicable. If we’re doing anything at a workplace, or with stuff that’s going to a workplace, then we need to be aware of what’s in that code. And then I’m also going to do sessions on plant, structures and substances (or hazardous chemicals) because those are the absolute bread and butter of the WHS Act. If you look at the duties of designers, manufacturers, importers, suppliers, and installers, et cetera, you will find requirements on plant, substances and structures all the way through those clauses in the WHS Act. Those three things are key so we’re going to be talking about that.

Now, I mentioned before that there was going to be a Code of Practice on plant design, but it never made it. It’s just guidance. So, we’ll have a look at that if we can as well, copyright permitting. And then I want to look at electrical risks because I think the electrical risks code is very useful. It’s useful for electrical risks, but it’s also a useful teaching vehicle for designers and manufacturers to understand their obligations, especially if you operate abroad or you’re importing stuff and you want to know “Well, how do I know that my kit can be safely used in Australia?” So, if your piece of kit won’t support the things that the electrical risk CoP requires in the workplace, then it’s going to be difficult for your customers to comply. So, probably there’s a hint there that if you want to sell your stuff successfully, here’s what you need to be aware of. And that applies not just to electrical; I think it’s a good vehicle for understanding how CoP can help us with our upstream obligations, even though CoP applies to a workplace. That session will really be about the imaginative use of a Code of Practice in order to help designers and manufacturers, etc.

And then I want to also talk about noise Code of Practice, because noise brings in the concept of exposure standards. Now, generally, Codes of Practice don’t quote many standards. They’re certainly not mandatory, but noise is one of those areas where you have to have standards to say, “this is how we’re going to measure the noise”. This is the exposure standard. So, you’re not allowed to expose people to more than this. That brings in some very important concepts about health monitoring and exposure to certain things. Again, it’ll be useful if you’re managing noise but I think that session will be useful to anybody who wants to understand how exposure standards work and the requirements for monitoring exposure of workers to certain things. Not just noise, but chemicals as well. We will be covering a lot of that in the session(s) on HAZCHEM.

Copyright & Attribution

I just want to mention that everything in quotes/in italics is downloaded from the Federal Register of Legislation, and I’ve gone to the federal legislation because I’m allowed to reproduce it under the license, under which it’s published. So, the middle paragraph there – I’m required to point that out that I sourced it from the Federal Register of legislation, the website on that date. And for the latest information, you should always go to the website to double–check that the version that you’re looking at is still in force and is still relevant. And then for more information on the terms of the license, you can go and see my page at the www.SafetyArtisan.com because I go through everything that’s required and you can check for yourself in detail.

For More…

Also, on the website, there’s a lot more lessons and resources, some of them free, some of them you have to pay to access, but they’re all there at www.safetyartisan.com. Also, there’s the Safety Artisan page at www.patreon.com/SafetyArtisan where you will see the paid videos. And also, I’ve got a channel on YouTube where the free videos are all there. So, please go to the Safety Artisan channel on YouTube and subscribe and you will automatically get a notification when a new free video pops up.

End

And that brings me to the end of the presentation, so thanks very much for listening. I’m just going to stop sharing that now. It just remains for me to say thank you very much for tuning in and I look forward to sharing some more useful information on Codes of Practice with you in the next session in about a month’s time. Cheers now, everybody. Goodbye.

There’s more!

You can find the Model WHS Codes of Practice here.

Health Hazard Analysis

In this full-length (55-minute) session, The Safety Artisan looks at Health Hazard Analysis, or HHA, which is Task 207 in Mil-Std-882E. We explore the aim, description, and contracting requirements of this complex Task, which covers: physical, chemical & biological hazards; Hazardous Materials (HAZMAT); ergonomics, aka Human Factors; the Operational Environment; and ionizing and non-ionizing radiation. We outline how to implement Task 207 in compliance with Australian WHS. (We refer to other lessons for specific tools and techniques, such as Human Factors analysis methods.)

This is the seven-minute-long demo. The full version is a 55-minute-long whopper!

Health Hazard Analysis: Topics

  • Task 207 Purpose;
  • Task Description;
  • ‘A Health Hazard is…’;
  • ‘HHA Shall provide Information…’;
  • HAZMAT;
  • Ergonomics;
  • Operating Environment;
  • Radiation; and
  • Commentary.

Health Hazard Analysis: Transcript

Introduction

Hello, everyone, and welcome to the Safety Artisan. I’m Simon, your host, and today we are going to be talking about health hazard analysis.

Task 207: Health Hazard Analysis

This is Task 207 in the Mil-Std-882E approach, which is targeted at defense systems, but you will see it used elsewhere. The principles that we’re going to talk about today are widely applicable. So, you could use this standard for other things if you wish.

Topics for this Session

We’ve got a big session today so I’m going to plough straight on. We’re going to cover the purpose of the task; the description; the task helpfully defines what a health hazard is; says what health hazard analysis, or HHA, shall provide in terms of information. We talk about three specialist subjects: Hazardous materials or hazmat, ergonomics, and operating environment. Also, radiation is covered, another specialist area. Then we’ll have some commentary from myself.

Now the requirements of the standard for this task are so extensive that for the first time I won’t be quoting all of them, word for word. I’ve actually had to chop out some material, but I’ll explain that when we come to it. We can work with that but it is quite a demanding task, as we’ll see.

Task Purpose

Let’s look at the task purpose. We are to perform and document a health hazard analysis, to identify human health hazards and evaluate, as it says, materials and processes using materials, etc, that might cause harm to people, and to propose measures to eliminate the hazards or reduce the associated risks. In many respects, it’s a standard 882-type approach. We’re going to do all the usual things. However, as we shall see, we’re going to do quite a lot more on this one.

Task Description #1

So, task description. We need to evaluate the potential effects resulting from exposure to hazards, and this is something I will come back to again and again. It’s very easy when dealing in this area, particularly with hazardous materials, to get hung up on every little tiny amount of potentially hazardous material that is in the system or in a particular environment, and I’ve seen this done to death so many times. I’ve seen it overdone in the UK when COSHH, the Control of Substances Hazardous to Health regulations, came in in the military. We went bonkers about this. We did risk assessments up the ying-yang for stuff that we just did not need to worry about. Stuff that was in every office up and down the land. So, we need to be sensible about doing this, and I’ll keep coming back to that.

So, we need to do as it says: identification, assessment, characterisation, control, and communication of hazards in the workplace or environment. And we need to follow a systems approach, considering “What’s the total impact of all these potential stressors on the human operator or maintainer?” Again, I come from a maintenance background. The operator often gets lots of attention because if the operator stuffs up, you very often end up with a very nasty accident where lots of people get hurt. So, that’s a legitimate focus for a human operator of a system.

But also, in a lot of organizations, the executive management tend to be operators because that’s how the organization evolves. So, sometimes you can have an emphasis on operations, and maintenance and support and other things get ignored because they’re not sexy enough to the senior management. That’s a bad reason for not looking at stuff. We need to think about the big picture, not just the people who are in control.

Task Description #2

Moving on with task description. We need to do all of this good stuff and we’re thinking about materials and components and so forth, and whether they cause or contribute to adverse effects in organisms or offspring. We’re talking about genetic effects as well. Or pose a substantial present or future danger to the environment. So in 882, we are talking about environmental impact as well as human health impact. There is an environmental task as well that is explicitly so.

Personally, I would tend to keep the human impact and the environmental impact separate because there are very often different laws that apply to the two. If you try and mix them together or do a sort of one size fits all analysis, you’ll frequently make life more difficult for yourself than you need to. So, I would tend to keep them separate. However, that’s not quite how the standard is written.

A Health Hazard is …

So what is a health hazard? As it says, a health hazard is a condition and it’s got to be inherent to the operation, etc, through to disposal of the system. So, it’s cradle to grave. That’s important. That’s consistent with a lot of Western law. It’s got to be capable of causing death, injury, illness, disability, or even, in this standard, just reduced job performance of personnel by exposure to physiological stresses.

Now I’m getting ahead of myself because, in Australia, health hazards can include psychological impacts as well, not just impacts on physical health. Now reduced job performance? – Are we really interested in minor stuff? Maybe not. Maybe we need to define what we mean by that. Particularly when it comes to operators or maintainers making mistakes, perhaps through fatigue that can have very serious consequences.

So, this analysis task is going to address lots of causes or factors that we typically find in big accidents and relate them to effects on human performance. Then it goes on to specify that certain specific hazards must be included: chemical, physical, biological, ergonomic. For ergonomic, I would say human factors, because when you look at the standard, what it calls ergonomics is much wider than the narrow definition of ergonomics that I’m used to.

Now, this is the first area where I’ve chopped some material, because where it says ‘e.g.’ in items a to d, those examples are in effect a checklist of chemical, physical, biological, and ergonomic hazards that you need to look at. This task has its own checklist. You might recall when we talked about preliminary hazard identification, a hazard checklist is a very good method for getting broad coverage in general. Now, in this task, we have further checklists that are specific to human health. That’s something to note.

We’ve also got to think about hazardous materials that may be formed by test, operation, maintenance, disposal, or recycling. That’s very important; we’ll come back to that later. We’re also thinking about crashworthiness and survivability issues. We’ve also got to think about, it says, non-ionizing radiation hazards, but in reality, we’ve got to consider ionizing as well if we have any radioactive elements in our system, and it does say that in item g. So, we’ve got to do both non-ionizing and ionizing.

HHA Shall Provide Info #1

What categories of information should this health hazard analysis generate? Well, first of all, it’s got to identify hazards and as I’ve said or hinted at before, we’ve got to think about how could human beings be exposed? What is the pathway, or the conditions, or mode of operations by which a hazardous agent could come into contact with a person? I will focus on people. So, just because there is a potentially hazardous chemical present doesn’t mean that someone’s going to get hurt. I suspect if I looked around in the computer in front of me that I’m recording this on or at the objects on my desk, there are lots of materials that if I was to eat them or swallow them or ingest them in some other way would probably not do me a lot of good. But it’s highly unlikely that I’m going to start eating them so maybe we don’t need to worry about that.

HHA Shall Provide Info #2

We also need to think about the characterization of the exposure. Describing the assessment process: names of the tools or any models used; how did we estimate intensities of energy or substances at the concentrations and so on and so forth? This is one of those analyses that are particularly sensitive to the way we go about doing stuff. Indeed, in lots of jurisdictions, you will be directed as to how you should do some of these analyses and we’ll talk about that in the commentary later. So, we’ve got to include that. We’ve got to “show our working” as our teachers used to tell us when preparing us for exams.

HHA Shall Provide Info #3

We’ve got to think about severity and probability. Here the task directs us to use the standard definition tables that are found in 882. I talked about those under Task 202 so I’m not going to talk about them further here. Now, of course, we can, and maybe should, tailor these matrices. Again, I’ve talked about that elsewhere, but if we’re not using the standard matrices and tables, then we should set out what we’ve done and why that’s appropriate as well.
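For readers who have not seen the standard tables, the sketch below shows how a severity category and a probability level combine into a risk level. The category names and the cell assignments are written from memory of the untailored Mil-Std-882E risk assessment matrix and are an assumption of this example, so check them against the standard itself, or against your project’s tailored matrix, before relying on them. The function and variable names are hypothetical.

```python
# Severity categories (1-4) and probability levels (A-F), as recalled
# from Mil-Std-882E. Verify against the standard before use.
SEVERITY = {"catastrophic": 1, "critical": 2, "marginal": 3, "negligible": 4}
PROBABILITY = {
    "frequent": "A", "probable": "B", "occasional": "C",
    "remote": "D", "improbable": "E", "eliminated": "F",
}

# Risk assessment codes for each level; anything not listed is "Medium".
HIGH = {"1A", "1B", "1C", "2A", "2B"}
SERIOUS = {"1D", "2C", "3A", "3B"}
LOW = {"4C", "4D", "4E"}


def risk_level(severity: str, probability: str) -> str:
    """Look up the risk level for a severity/probability pair."""
    sev = SEVERITY[severity.lower()]
    prob = PROBABILITY[probability.lower()]
    if prob == "F":
        return "Eliminated"
    code = f"{sev}{prob}"
    if code in HIGH:
        return "High"
    if code in SERIOUS:
        return "Serious"
    if code in LOW:
        return "Low"
    return "Medium"


print(risk_level("Critical", "Occasional"))
print(risk_level("Negligible", "Remote"))
```

If the matrix is tailored, the three sets above are the only things that would need to change, which also makes it easy to document what was changed and why.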

HHA Shall Provide Info #4

Then finally, the mitigation strategy. We shouldn’t be doing analysis for the sake of analysis. We should be doing it to say, “How can we make things better?” And in particular for health, “How can we make things acceptable?” Because health hazards very often attract absolute limits on exposure. So, questions of SFARP or ALARP or cost-benefit analysis simply may not enter into the equation. We may simply be directed: “This is the upper limit of what you can expose a human being to. This is not negotiable.” So, that’s another important difference with this task.
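The idea of an absolute limit can be shown in a few lines: the check is a plain comparison, with no cost-benefit term anywhere in it. This is a minimal sketch only. The class, the function names, the agent labels, and the numbers are all made up for illustration; real limits come from the applicable exposure standards.

```python
from dataclasses import dataclass


@dataclass
class Exposure:
    agent: str       # hypothetical label for the hazardous agent
    measured: float  # assessed exposure, in the same units as the limit
    limit: float     # absolute exposure limit (illustrative value only)


def exceeds_limit(exposure: Exposure) -> bool:
    """True if the assessed exposure is above the absolute limit."""
    return exposure.measured > exposure.limit


def non_negotiable(exposures: list) -> list:
    """Return the agents whose exposure must be reduced, whatever the cost."""
    return [e.agent for e in exposures if exceeds_limit(e)]


exposures = [
    Exposure("agent A", measured=12.0, limit=10.0),
    Exposure("agent B", measured=3.5, limit=5.0),
]
over = non_negotiable(exposures)
print(over)
```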

Three More Topics

Now, at this point, I am just foreshadowing. We’re about to move on to talk about some different topics. First of all, in this section, we’re going to talk about three particular topics: hazardous material, or HAZMAT for short; ergonomics; and the operational environment. When we say the operational environment, it’s mainly about the people aspects of the system, and the environment that they experience. Then after these three, we will go on to talk about radiation. There are special requirements in these three areas for HAZMAT, ergonomics, and operational environment.

HAZMAT (T207) #1

First of all, we have to deal with HAZMAT. If it’s going to appear in our system, or in the support system, we’ve got to identify the HAZMAT and characterize it. There are lots of international and national standards about how this is to be done. There’s a UN convention on hazardous materials, which most countries follow. And then there will usually be national standards as well that direct what we shall do. More on that later. So, we’ve got to think about the HAZMAT.

A word of caution on that. Certainly in Australian defence, we do HAZMAT to death because of a recent historical example of a big national scandal about people being exposed to hazardous materials while doing defence work. So, the Australian Defence Department is ultrasensitive about HAZMAT and will almost certainly mandate very onerous requirements on performing this. And whilst we might look at that and go “This is nuts! This is totally over the top!” Unfortunately, we just have to get on with it because no one is going to make, I’m afraid, a sensible decision about the level of risk that we don’t have to worry about because it’s just too sensitive a topic.

So, this is one of those areas where learning from experience has actually gone a bit wrong and we now find ourselves doing far too much work looking at tiny risks. Possibly at the expense of looking at the big picture. That’s just something to bear in mind.

HAZMAT (T207) #2

So, lots of requirements for HAZMAT. In particular, we need to think about what are we going to do with it when it comes to disposal? Either disposal of consumables, worn components or final disposal of the system. And very often, the hazardous material may have become more hazardous. In that, let’s say engine or lubricating oil will probably have metal fragments in it once it’s been used and other chemical contamination, which may render it carcinogenic. So, very often we start with a material that is relatively harmless, but use – particularly over a long period of time – can alter those chemicals or introduce contaminants and make them more dangerous. So, we need to think about the full life of the system.

Ergonomics (T207) #1

Moving on to ergonomics, and this is another big topic. Now, Mil. Standard 882 doesn’t address human factors, in my view, particularly well. The human factors stuff gets buried in various tasks and we don’t identify a separate human factors program with all of the interconnections that you need in order to make it fully effective. But this is one task where human factors do come in, very much so, but they are called ergonomics rather than human factors. Under this task description, we need to think about mission scenarios. We need to think about the staff who will be exposed as operators or maintainers, whatever they might be doing. We’ve got to start to characterize the population at risk.

Ergonomics (T207) #2

We’ve got to think about the physical properties of things that personnel will handle or wear and the implications that has on body weight. So, for example, there is a saying that the “Air Force and the Navy man their equipment and the army equip their men”. Apologies for the gendered language but that’s the saying. So, we’re putting human beings – very often – inside ships and planes and tanks and trucks. And we’re also asking soldiers to carry – very often – lots of heavy equipment. Their rations, their weapons, their ammunition, water, various tools and stuff that they need to survive and fight on the battlefield. And all that stuff weighs and all of that stuff, if you’re running about carrying it, bangs into the body and can hurt people. So, we need to address that stuff.

Secondly, we need to look at physical and cognitive actions that operators will take. So, this is really very broad once we get into the cognitive arena thinking about what are the operators going to be doing. And exposures to mechanical stress while performing work. So, maybe more of a focus on the maintainer in part three. Now, for all of this stuff, we need to identify characteristics of the design of the system or the design of the work that could degrade performance or increase the likelihood of erroneous action that could result in mishaps or accidents.

This is classic human factors stuff. How might the designed work or the designed equipment induce human error? So, that’s a huge area of study for a lot of systems and very important. And this will be typically a very large contributor to serious accidents and, in fact, accidents of all kinds. So, it should be an area of great focus. Often it is not. We just tend to focus on the so-called technical risks and overdo that while ignoring the human in the system. Or just assuming that the human will cope, which is worse.

Ergonomics (T207) #3

Continuing with ergonomics. How many staff do we need to operate and maintain the system and what demands are we placing on them? Also, if we overdo these demands, what are we going to do about that? Now, this can be a big problem in certain systems. I come from an aviation background and fatigue and crew duty time tend to be very heavily policed in aviation. But I was actually quite shocked when I sort of began looking at naval surface ships, submarines, where it seemed that fatigue and crew duty time was not well policed. In fact, there even seemed to be, in some places, quite a macho attitude to forcing the crew into working long hours. I say macho attitude because the feeling seemed to be “Well if you can’t take it, you shouldn’t have joined.”

So, it seems to me to be quite a negative culture in those areas potentially, and it’s something that we need to think about. In particular, I’ve noticed on certain projects that you have a large crew who seem to be doing an extraordinary amount of work and becoming very fatigued. That’s concerning because, of course, you could end up with a level of fatigue where the crew are making mistakes to the same level as a drunk driver. So, this is something that needs to be considered carefully and given the attention it deserves.

Operating Environment #1

Moving on to the operating environment. How will these systems be used and maintained? And what does that imply for human exposure? This is another opportunity where we need to learn from legacy systems and go back and look at historical material and say “What were people exposed to in the past? And what could happen again?”

Now, that’s important. It’s often not very systematically done. We might go and talk to a few old, bold operators and maintainers and ask their advice on the things that can go wrong but we don’t always do it very systematically. We don’t always survey past hazard and accident data in order to learn from it. Or if we do there is sometimes a tendency to say, “That happened in the past, but we will never make those mistakes. We’re far too clever to stuff up like that – like our predecessors did.” Forgetting that our predecessors were just as clever as we are and just as well-meaning as we are, but they were human and so are we.

I think pride can get in the way of a lot of these analyses as well. And there may be occasions where we’re getting close to exposure limits, where regulations say we simply cannot expose people to a certain level of noise, or whatever, and then “How are we going to deal with that? How are we going to prevent people from being overexposed?” Again, this can be a problem area.
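As a worked example of checking how close we are to an exposure limit, the sketch below combines several noise exposures during a shift into a single eight-hour equivalent level using the usual equal-energy sum, L = 10·log10((1/8)·Σ tᵢ·10^(Lᵢ/10)). The 85 dB(A) default limit is an assumption, used here because it is a figure commonly seen in workplace noise regulation; the applicable exposure standard for your jurisdiction must be confirmed. The function names and the shift profile are hypothetical.

```python
import math


def laeq_8h(segments):
    """Combine (level_dBA, hours) segments into an 8-hour equivalent level."""
    # Sum the acoustic energy of each segment, weighted by its duration.
    energy = sum(hours * 10 ** (level / 10) for level, hours in segments)
    if energy <= 0:
        raise ValueError("at least one segment with a positive duration is needed")
    # Normalise the total energy to a standard 8-hour working day.
    return 10 * math.log10(energy / 8.0)


def exceeds_limit(segments, limit_dba=85.0):
    """True if the combined exposure is above the (assumed) limit."""
    return laeq_8h(segments) > limit_dba


if __name__ == "__main__":
    # Hypothetical shift: 4 hours in a noisy compartment, 4 hours elsewhere.
    shift = [(88.0, 4.0), (80.0, 4.0)]
    print(round(laeq_8h(shift), 1), exceeds_limit(shift))
```

Note how the four noisy hours dominate the result: the quieter half of the shift barely helps, which is why keeping a record of how the figure was calculated matters as much as the figure itself.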

Operating Environment #2

This next bit of operating environment is really – I said about putting people in the equipment. Well, this is this bit. This is part A and B. So, we’re thinking about “If we stick people in a vehicle – whether it be a land vehicle, marine vehicle, an air vehicle, whatever it might be – what is that vehicle going to do to their bodies?” In terms of noise, of vibration and stresses like G forces, for example, and shock, shock loading? Could we expose them to blast overpressure or some other sudden changes of pressure or noise that’s going to damage their ears, temporarily or permanently? Again, remarkably easy to do. So, that’s that aspect.

Operating Environment #3

Moving on, we continue to talk about noise and vibration in general. In this particular standard, we’ve got some quite stringent guidance on what needs to be looked at. Now, these requirements, of course, are assuming a particular way of doing things, which we will come to later. There are a lot of standards referenced by task 207. This task is assuming that we’re going to do things the American government or the American military way, which may not be appropriate for what we’re doing or the jurisdiction we’re in. So, we’ll just move on.

Operating Environment #4

Then again, talking about noise, blast, vibration, how are we going to do it? Some quite specific requirements in here. And again, you’ll notice, two-thirds of the way down in the paragraph, I’ve had to chop out some examples. There are some more, in effect, hazard checklists in here saying we must consider X, Y, Z. Now, again, this seems to be requiring a particular way of doing things that may not be appropriate in a non-American defence environment.

However, the principle I think, to take away from this is that this is a very demanding task. If we consider human health effects properly, it’s going to require a lot of work by some very specialist and skilled people. In fact, we may even get in some specialist medical people. If you work in aviation or medicine, you may be aware that there is a specialist branch of medicine called aviation medicine where these things are specifically considered. And similarly, there are medical specialists in diving operations and other things where we expose human beings to strange effects. So, this can be a very, very demanding task to follow.

Operating Environment #5

So, when we’re going to equip people with protective equipment or we’re going to make engineering changes to the system to protect them, how effective are these things going to be? And given that most of these things have a finite effectiveness – they’re rarely perfect unless you can take the human out of the system entirely – then we’re going to be exposing people to some level of hazard and there will be some risk that it might cause injury.

So, how many individuals are we going to expose per platform or over the total population exposed over the life of the system? Now, bearing in mind we’re talking sometimes about very large military systems that are in service for decades. This can be thousands and thousands of people. So, we may need to think about that and certainly in Australia, if we expose people to certain potential contaminants and noise, we may have to run a monitoring program to monitor the health and exposure of some of this exposed population or all of them. So, that can be a major task and we would need to identify the requirements to do that quite early on, hopefully.
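The “thousands and thousands of people” figure is easy to sanity-check with a rough calculation. The sketch below assumes every crew position is refilled by a new person at the end of each posting; that assumption, the function name, and all of the numbers in the example are hypothetical and only show the shape of the arithmetic.

```python
import math


def total_population_exposed(platforms, crew_per_platform, service_years, posting_years):
    """Rough count of distinct individuals exposed over the life of the system.

    ASSUMPTION: each position is filled by a new person every posting,
    so nobody is counted twice. Real personnel data would refine this.
    """
    rotations = math.ceil(service_years / posting_years)
    return platforms * crew_per_platform * rotations


if __name__ == "__main__":
    # Hypothetical fleet: 12 platforms, 180 crew each, 30 years in service,
    # 3-year postings, giving 10 rotations per position.
    print(total_population_exposed(12, 180, 30, 3))  # 21600
```

Even a modest fleet reaches five figures, which is why the requirement for a health monitoring program needs to be identified early.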

And then, of course, again, we’re not doing this for the sake of it. How can we optimize the design and effectively reduce noise exposure and vibration exposure to humans? And how did we calculate it? How did we come to those conclusions? Because we’re going to have to keep those records for a long, long time. So, again, very demanding recording requirements for this task.

Operating Environment #6

And then I think this is the final one on operating environment. What are the limitations of this protective equipment and what burden do they impose? Because, of course, if we load people up with protective equipment that may introduce further hazards. Maybe we’re making the individual more likely to suffer a musculoskeletal disorder.

Or maybe we are making them less agile or reducing their sensitivity to noise? Maybe if we give people hearing protection, if somebody else has assumed that they will hear a hazard coming, well, they’re not going to anymore, are they? If they’re wearing lots of protective equipment, they may not be as aware of the environment around them as they once were. So, we can introduce secondary hazards with some of this stuff. And then we need to look at the trade-offs. When and where? Is it better to equip people or not to equip people and limit their exposure or just keep them away altogether?

Radiation (T207)

So moving on briefly, we’re just going to talk about radiation. Now in this task – again, I’ve had to chop a lot of stuff out – you’ll see that in square brackets this task refers to certain US standards for radiation. Both ionizing and non-ionizing, lasers and so forth. That’s appropriate for the original domain, which this standard was targeted at. It may be wholly inappropriate for what you and I are doing.

So, we need to look at the principles of this task, but we may need to tailor the task substantially in order to make it appropriate for the jurisdiction we’re working in. Again, we’re going to have to keep these records for a long time. Radiation is always going to be dreaded by humans so it’s a controversial topic. We’re going to have to monitor people’s exposure and protect them and show that we have done so, potentially decades into the future. So, we should be looking for the very highest standards of documentation and recording in these areas because they will come under scrutiny.

Contracting #1

Moving onto contracting, this is more of a standard part of this task or part of the standard, I should say. These words or very similar words exist in every task. So, I’m not going to go through all of these things in any great detail. It’s worth noting, and I’ll come back to this in part B, we may need to direct whoever is doing the analyses to consider or exclude certain areas because it’s quite possible to fritter away a lot of resources doing a wide but shallow analysis that fails to get to the things that can really hurt people.

So, we might be doing a superficial analysis or we might go overboard on a particular area and I’ve mentioned HAZMAT but there are many things that people can get overexcited about. So, we might see people spending a lot of time and effort and money in a particular area and ignoring others that can still hurt people. Even though they might be mundane, not as sexy. Maybe the analysts don’t understand them or don’t want to know. So, the customer who is paying for this may need to direct the analysis. I will come on to how you do that later.

Also the customer or client may need to specify certain sources of information, certain standards, certain exposure standards, certain assumptions, certain historical sets of data and statistics to be used. Or some statistics about the population, because, of course, for example, the military systems, the people who operate military systems tend to be quite a narrow subset of the population. So, there are very often age limits. Frontline infantry soldiers tend to be young and fit. In certain professions, you may not be allowed to work if you are colour-blind or have certain disabilities. So, it may be that a broad analysis of the general population is not appropriate for certain tasks.

It may be perfectly reasonable to assume certain things about the target population. So, we need to think about all of these things and ensure that we don’t have an unfocused analysis that as a result is ineffective or wastes a lot of money looking at things that don’t really matter, that are irrelevant.

Contracting #2

Standards and criteria. In part F, there are 29 references which the standard lists, which are all US military standards or US legal standards. Now, probably a lot of those will be inappropriate for a lot of jurisdictions and a lot of applications. So, there’s going to be quite a lot of work there to identify what are the appropriate and mandatory references and standards to use. And as I said, in the health hazard area, there are often a lot. So, we will often be quite tightly constrained on what to do.

And Part H: if the customer knows or has some idea of the staff numbers and profile of the people who are going to be exposed to this system while operating and maintaining it, that’s very useful information and needs to be shared. We don’t want to make the analyst, the contractor, guess. We want them to use appropriate information. So, tell them, and make sure you’ve done your homework so that you tell them the right thing.

Commentary #1

So, that’s all of the standard. I’ve got four slides now of commentary. And the first one, I just want to really summarize what we’ve talked about and think about the complexity of what we’re being asked to do. First bullet point, we are considering cradle to grave operation and maintenance and disposal. Everything associated with, potentially, quite a complex system. Now, this lines up very nicely with the requirements of Australian law, which require us to do all of this stuff. So, it’s got to be comprehensive.

Second bullet point, we’ve got to think about a lot of things. Death and injury, illness, disability, and could we infect somebody or contaminate somebody with something that will cause birth defects in their offspring? There’s a wide range of potential vectors of harm that we’re talking about here, and for some systems, we will need to bring in some very specialist knowledge in order to do this effectively. And also thinking about reduced job performance – this is one aspect of human factors. This task is going to link very strongly to whatever human factors program we might have.

Thirdly, we’ve got to think about chemical, physical, and biological hazards. So, again, there’s a wide range of stuff to think about there. An example of that is HAZMAT and the requirements on HAZMAT, in most jurisdictions, tend to be very stringent. So, that is going to be done and we need to be prepared to do a thorough job and demonstrate that we’ve done a thorough job and provide all the evidence.

Then we’ve also got ergonomics. Actually, strictly speaking, we’re talking human factors here because it’s a much wider definition than the definition of ergonomics that I’m used to, which tends to be purely physical effects on a human. Because we’re talking about cognitive and perception and job performance as well and also we’ve got vibration and acoustics. So, again, particular medical effects and stringent requirements. So, a whole heap of other specialist work there.

And operating environment, thinking about the humans that will be exposed. How are we going to manage that? What do we need to specify in order to set up whatever medical monitoring program of the workforce we might have to bring in in the future through life? So, again, potentially a very big, expensive program. We need to plan that properly.

Then finally, radiation. Another controversial topic which gets lots of attention. Very stringent requirements, both in terms of exposure levels and indeed we will often be directed as to how we are to calculate and estimate stuff. It’s another specialist area and it has to be done properly and thoroughly.

Overall, every one of those seven bullet points shows how complex and how comprehensive a good health hazard analysis needs to be. So, to specify this well, to understand what is required and what is needed through life, for the program to meet our legal and regulatory obligations, this is a big task and it needs a lot of attention and potentially a lot of different specialist knowledge to make it work. I flogged that one to death, so I’ll move on.

Commentary #2

Now, as I’ve said before, too, this is an American military standard, so it’s been written to conform to that world. Now in Australia, the requirements of Australian work health and safety (WHS) law are quite different to the American way of doing things. Whilst we tend to buy a lot of American equipment and there’s a lot of American-style thinking in our military and in our defence industry, actually, Australian law is much more closely linked to English law. It’s a different legal basis to what the Americans do. So Australian practitioners take note.

It’s very easy to go down the path of following this standard and doing something that will not really meet Australian requirements. It’ll be, “We’ll do some work” and it may be very good work, but when we come to the end and we have to demonstrate compliance with Australian requirements, if we haven’t thought about that explicitly upfront, we’re probably in for a nasty shock and a lot of expensive rework that will delay the program. And that means we’re going to become very, very unpopular very quickly. So, that’s one to avoid in my experience.

So, we will need to tailor task 207 requirements upfront in order to achieve WHS compliance. And the client or customer needs to do that and understand that. The contractor needs to as well, and the analysts need to understand it. But the customer needs to understand it first, otherwise, it won’t happen.

Commentary #3

Let’s talk a bit more about tailoring for WHS. For example, there are several WHS codes of practice which are relevant. And just to let you know, these codes of practice cover not only requirements of what you have to achieve, but also, to a degree, how you are to achieve them. So, they mandate certain approaches. They mandate certain exposure standards. Some of them also list a lot of other standards that are not mandated but are useful and informative.

So, we’ve got codes of practice on hazardous manual tasks so avoiding musculoskeletal injuries. We’ve got several codes of practice on hazardous chemicals. So, we’ve got a COP specifically on risk management and risk assessment of hazardous chemicals, on safety data sheets, on labelling of HAZCHEM in a workplace. We’ve got a COP on noise and hearing loss and also, we have other COPs on specific risks, such as asbestos, electricity and others, depending on what you’re doing. So, potentially there is a lot of regulation and codes of practice that we need to follow.

And remember that COPs are, while they contain regulations, they also are a standard that a court will look to enforce if you get prosecuted. If you wind up in court, the prosecution will be asking questions to determine whether you’ve met the requirements of COP or not. If you can’t demonstrate that you’ve met them, you might have done a whole heap of work and you might be the greatest expert in the world on a certain kind of risk, but if you can’t demonstrate that you’ve met at minimum the requirements of COP – because they are minimum requirements – then you’re going to be in trouble. So, you need to be aware of what those things are.

Then on radiation, we have separate laws outside the WHS. So, we have the Australian Radiation Protection and Nuclear Safety Agency, ARPANSA, and there is an associated act and associated regulations and some COP as well. So, for the radiation side, there’s a whole other world that you’ve got to be aware of and associated with all of this stuff are exposure standards.

Commentary #4

Finally, how do we do all of this without spending every dollar in the defence budget and taking 100 years to do it? Well, first of all, we need to set our scope and priorities. So, before we get to Task 207, the client/the customer should be involving end-users and doing a preliminary hazard identification exercise. That should be broad and as thorough as possible. They should also be doing a preliminary hazard analysis exercise, Task 202, to think about those hazards and risks further.

Also, you should be doing Task 203, which is system requirements hazard analysis. We need to be thinking about what are the applicable requirements for my system from the law all the way down to what specific standards? What codes of practice? What historical norms do we expect for this type of equipment? Maybe there is industry good practice on the way things are done. Maybe as we work through the specifications for the equipment, we will derive further requirements for hazard controls or a safety management system or whatever it might be. That’s a big job in itself.

So, we need to do all three of those tasks, 201, 202, 203, in order to be prepared and ready to focus on those things that we think might hurt us. Might hurt people physically, but also might hurt us in terms of the amount of effort we’re going to have to make in order to demonstrate compliance and assurance. So, that will focus our efforts.

Secondly, there are the specialist analyses, and we may not always need to do them. This is where 201, 202, and 203 come in. But where we need to do specialist analyses, we may need to find specialist staff who are competent to do this kind of unusual or specialist work and do it well. Now, typically, these people are not cheap, and they tend to be in short supply. So, if you can think about this early and engage people early, then you’re going to get better support.

You’re probably going to get a better deal because in my experience if you call in the experts and ask their opinion early on, they’re more likely to come back and help you later. As opposed to, if you ignore them or disregard their advice and then ask them for help because you’re in trouble, they may just ignore you because they’ve got so much work on. They don’t need your work. They don’t need you as a client. You may find yourself high and dry without the specialists you need or you may find yourself paying through the nose to get them because you’re not a priority in their eyes. So do think about this stuff early, I would suggest, and do cultivate the specialists. If you get them in early and listen to them and they feel involved, you’re much more likely to get a good service out of them.

So thirdly, try not to do huge amounts of work on stuff that doesn’t really have a credible impact on health. Now, I know that sounds like a statement of the blinking obvious, but because people get so het up about health issues, particularly things like radiation and other hazards that humans can’t see: we dread them. We get very emotional about this stuff and therefore, management tends to get very, very worried about this stuff. And I’ve seen lots of programs spend literally millions of dollars analyzing stuff to death, which really doesn’t make any difference to the safety of people in the real world. Now, obviously, that’s wasted money, but also it diverts attention from those areas that really are going to cause or could cause harm to people through the life of the system.

So, we need to use that risk matrix to understand what is the real level of risk exposure to human beings and therefore, how much money should we be spending? How much effort and priority should we be spending on analyzing this stuff? If the risk is genuinely very low, then probably we just take some standard precautions, follow industry best practice, and leave it at that and we keep our pennies for where they can really make a difference.

Now, having said that, there are some exceptions. We do need to think about accident survivability. So, what stresses are people going to be exposed to if their vehicle is in an accident? How do we protect them? How do they escape afterward? Hopefully. How do we get them to safety and treat the injured? And so on and so forth. That may be a very significant thing for your system.

Also post-accident scenarios in terms of – very often a lot of hazardous materials are safely locked away inside components and systems but if the system catches fire or is smashed to pieces and then catches fire, then potentially a lot of that HAZMAT is going to become exposed. Very often materials that pose a very low level of risk, if you set them on fire and then you look at the toxic residue left behind after the fire, it becomes far more serious. So, that is something to consider. What do we do after we’ve had an accident and we need to sort of clean up the site afterward? And so on and so forth.

Again, this tends to be a very specialist job so maybe we need to get in some specialists to give us advice on that. Or we need to look to some standards if it’s a commonplace thing in our industry, as it often is. We learn from bitter experience. Well, hopefully, we learn from bitter experience.

Copyright Statement

So, that’s it from me. I appreciate it’s been a long session, but this is a very complex task and I’ve really only skimmed the surface on this and pointed you at sort of further reading and maybe some principles to look at in more depth. So, all the quotations are from the Mil. Standard, which is copyright free. But this presentation is copyright of the Safety Artisan.

For More…

And for more information on this topic and others, and for more resources, do please visit www.safetyartisan.com. There are lots of free resources on the website as well, and there’s plenty of free videos to look at.

End: Health Hazard Analysis

So, that is the end of the session. Thank you very much for listening. And all that remains for me to say is thanks very much for supporting the work of the Safety Artisan and tuning into this video. And I wish you every success in your work now and in the future. Goodbye.

Categories
Mil-Std-882E Safety Analysis

Operating & Support Hazard Analysis

In this full-length session, The Safety Artisan looks at Operating & Support Hazard Analysis, or O&SHA, which is Task 206 in Mil-Std-882E. We explore Task 206’s aim, description, scope, and contracting requirements. We also provide value-adding commentary, which explains O&SHA: how to use it with other tasks; how to apply it effectively on different products; and some of the pitfalls to avoid. We refer to other lessons for specific tools and techniques, such as Human Factors analysis methods.

This is the seven-minute-long demo. The full version is about 35 minutes long.

Operating & Support Hazard Analysis: Topics

  • Task 206 Purpose:
    • To identify and assess hazards introduced by O&S activities and procedures;
    • To evaluate the adequacy of O&S procedures, facilities, processes, and equipment used to mitigate risks associated with identified hazards.
  • Task Description (six slides);
  • Reporting (two slides);
  • Contracting (two slides); and
  • Commentary (four slides).

Operating & Support Hazard Analysis: Transcript


Introduction

Hello everyone and welcome to the Safety Artisan; home of safety engineering training. I’m Simon and today we’re going to be carrying on with our series on Mil. Standard 882E system safety engineering.

Operating & Support Hazard Analysis

Today, we’re going to be moving on to the subject of operating and support hazard analysis. This is, as it says, task 206 under the standard. Operating and support hazard analysis, I’ll just call it O&S or OSHA (also O&SHA) for short. Unfortunately, that will confuse people if I call it OSHA. Let’s call it O&S.

Topics for this Session

The purpose of O&S hazard analysis is to identify and assess hazards introduced by those activities and procedures and also to evaluate the adequacy of O&S procedures, processes, equipment, facilities, etc, to mitigate risks that have been already identified. A twofold task but a very big task. And as we’ll see, we’ve got lots of slides today on task description, and reporting, contracting, and commentary. As always, I present the full text as is of the task, which is copyright free, but I’m only going to talk about the things that are important. So, we’re not going to go through every little clause of the standard; that would be pointless.

O&S Hazard Analysis (T206)

Let’s get started with the purpose. As we’ve already said, it’s to identify and assess those hazards which are introduced by operational and support activities and procedures and evaluate their adequacy. So, we’re looking at operating the system, whatever it may be. And of course, this is a military standard, so we assume a military system, but not all military systems are weapon systems by any means. Not all are physical systems. So, there may be inventory management systems, management information systems, all kinds of stuff. So, does operating those systems and just supporting them (maintaining them, resupplying them, disposing of them, etc.) create any hazards or introduce any hazards? And how do we mitigate? That’s the purpose of the task.

Task Description (T206) #1

Let’s move on to the task description. Again, we’re assuming a contractor is performing the analysis, but that’s not necessarily the case. For this task, the standard actually says that it typically begins during engineering and manufacturing development, or EMD. So, we’re assuming an American style lifecycle for a big system and EMD comes after concept and requirements development. So, we are beginning to move into the very expensive stage of development for a system where we begin to commit serious money. It’s suggesting that O&SHA can wait until then, which is fine in general unless you’ve identified any particularly novel hazards that will need to be dealt with earlier on. As it says, it should build on design hazard analyses, but we’ll also talk about the case later on when there are no design hazard analyses. And the O&SHA shall identify requirements or alternatives for eliminating hazards, mitigating risks, etc. This is one of those tasks where the human is very important – in fact, dominant to be honest – both as a source of hazards and as the potential victim of the associated risks. A lot of human-centric stuff going on here.

Task Description (T206) #2

As always, we’re going to think about the system configurations. We’re going to think about what we’re going to do with the system and the environment that we’re going to do it in. So, a familiar triad and I know I keep banging on about this, but this really is fundamental to bounding and therefore evaluating safety. We’ve got to know what the system is, what we’re doing with it, and the environment in which we’re doing it. Let’s move on.

Task Description (T206) #3

Again, Human Factors, regulatory requirements, and particularly specified personnel requirements need to be thought of. Particularly for operating and support, we need to take into account the staffing and personnel concept that we have. It’s frighteningly easy to produce a system that needs so much maintenance, for example, or support activity that it is unaffordable. And lots and lots of military systems and, it must be said, government and commercial systems in the past have come in that required enormous amounts of support, which soon proved to be unaffordable or no one would sign up to the commitment required. So, lots of projects have simply died because the system was going to be too expensive to sustain. That’s a key point of what we’re doing with O&S here. It’s not just about health and safety. It’s about health and safety, which is affordable.

We also need to look at unplanned events. So, not just designed-in things, but things introduced by what the standard calls human errors. Again, I’m going to re-emphasize it’s erroneous human action because human error makes it sound like a human is at fault. Whereas very often it’s the design or the concept or the requirements that are at fault and place unacceptable burdens on the human being. Again, lots of messy systems seen in the past, which didn’t quite work and we just kind of expected the operator to cope. And most of the time they cope and then every so often they have a bad day at the office or a bunch of factors come together and lots of people die. And then we blame the human. Well, it’s not the human’s fault at all. We put them in that position. And as always, we need to look at past evaluations of related legacy systems and support operations. If you have good data about legacy systems or about similar systems that your organization or another organization has operated, then that’s gold dust. So, do make an effort to get hold of that information if you can. Maybe a trade association or some wider pan-organization body can help you there.

Task Description (T206) #4

At a minimum, we’ve got to identify activities involving known hazards. This assumes that we’ve done some hazard analysis in the past, which is very important. We always need to do that. I’ll come back to that in the commentary. Secondly, changes needed in requirements, be they functional requirements – what we want the system to do – or design requirements, if we put constraints on how the system may do it, for whatever it may be (hardware, software, support equipment, whatever), to make those hazards and risks more manageable. Requirements for safety features – so requirements for engineered features and devices, equipment, because always, in almost any jurisdiction, we will have a hierarchy of control that recognizes that designed and engineered-in safety features are more effective than just relying on people to get it right. And then we’ve also got to communicate to people the hazards associated with the system. Warnings, cautions, and whatever special emergency procedures might be required associated with the system. Again, that’s something that we see reinforced in law and regulations in many parts of the world. This is all good stuff. It’s accepted good practice all across the world.

Task Description (T206) #5

Moving on, we also need to think about how we are going to move the system around, and the associated spares and supplies. How are we going to package them, handle them, store them, transport them? Particularly if there are hazardous materials, etc., involved. That’s the next part, G. Again, training requirements. We’re thinking about a human-centric approach. Whatever we expect people to do, they’ve got to be trained in how to do it. Point I, we’ve got to include everything, whether they’re developmental or non-developmental items. We can’t just ignore stuff because it’s GFE or it’s off the shelf. It doesn’t mean it can never go wrong. Far from it. Particularly if we are putting stuff together that’s never been put together before in a novel combination or in a novel environment. Something that might be perfectly safe and stable in an air-conditioned office might start to do odd things in a much more corrosive and uncontrolled environment, let’s say.

We need to think about the modes in which the system might be potentially hazardous when under operator control. Particularly, we might think about degraded modes of operation. So, for whatever reason, a part of the system has gone wrong or the system has got into an operating environment within which it doesn’t operate as well as it could. It’s not in an optimal operating environment or state. The human being in control of it, we’re assuming, has still got to be able to operate the system, even if it’s only to shut it down or to get it back into a safer state or safer environment. We’ve got to think about all of those nuances.

Then because we’re talking about support as well, we need to think about related legacy systems, facilities and processes which may provide background information. Also, of course, the system presumably will very often be operating alongside other systems, or it will be supported by other systems that maybe exist already or are being procured separately. So, we’ve got to think about all those interactions as well and all those potential contributions. As you can see, this is quite a wide-ranging, broadly scoped task.

Task Description (T206) #6

Finally, on this section, the customer/the end-user/or whoever may specify some specific analysis techniques. Very often they will not. So, whoever is doing the analysis, be they a contractor or third party outside agency, needs to make sure that whatever they propose to do is going to be acceptable to the program manager. In the sense that it is going to be compatible and relevant and useful. And then finally, the contractor has got to do some O&SHA at the appropriate time but maybe more detailed data will come along later. In which case that needs to be incorporated and also operational changes.

An absolute classic [situation] with military and non-military systems is: the system gets designed, it goes into test and evaluation and we discover that assumptions that were made during development don’t actually hold up. The real world isn’t like that, or whatever it might be, and we find we’re making changes in assumptions. Those need to be factored in which, sadly, is often not done very well. So, that’s an important point to think about. What’s my change control mechanism and how will the people doing the O&SHA find out about these changes? Because very often it’s easy to assume that everybody knows about this stuff but when you start making assumptions, the truth is that it very often goes adrift.

Reporting (T206) #1

Let’s talk about reporting. Just a couple of slides here. In the reporting, there’s some fairly standard stuff in here: the physical and functional characteristics of the system – that’s important. Again, we might assume that everybody knows what they are, but it’s important to put them in. It may be that the people doing the analysis were given a different system description to the people developing the system, to the people doing the personnel planning, etc. All the different things that have to be brought together, we need to make sure that they join up again. It’s too easy to get that wrong. Reinforcing the point I made on the previous slide, as more detailed descriptions and specifications come in, they need to be provided when they become available.

Hazard analysis methods and techniques. What techniques are we using? Give a description. If you’re doing it to a particular standard, so much the better. Great, that saves a lot of paper. What assumptions have we made? What data, both qualitative and quantitative, have we used to support the analysis? That all needs to be declared. By the way, one of the reasons it needs to be declared is that when things change – not if – that’s when these assumptions and the data and the techniques get exposed. So, if there are changes and we don’t have this kind of information declared, we can’t assess the impact of those changes. And it gets even more difficult to keep up with what’s going on.
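To make the point about declared assumptions a bit more concrete, here is a minimal Python sketch. It is not taken from the standard; it simply assumes that each analysis result records the technique used and the assumptions it relied on, so that when an assumption changes we can list the results that need revisiting. The names (`AnalysisResult`, `impacted_by`) and all the hazard and assumption entries are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisResult:
    """One O&SHA finding plus the declared basis it rests on (illustrative only)."""
    hazard_id: str
    description: str
    technique: str
    assumptions: set[str] = field(default_factory=set)


def impacted_by(changed_assumption: str, results: list[AnalysisResult]) -> list[str]:
    """Return the IDs of results whose analysis relied on the changed assumption."""
    return [r.hazard_id for r in results if changed_assumption in r.assumptions]


# Hypothetical example data, invented purely for illustration.
results = [
    AnalysisResult("H-001", "Manual lift of spare unit", "Job Hazard Analysis",
                   {"A1: two-person maintenance crew", "A2: indoor workshop"}),
    AnalysisResult("H-002", "Fuel spill during refuelling", "Job Hazard Analysis",
                   {"A3: refuelling on hardstanding"}),
    AnalysisResult("H-003", "Access panel dropped during servicing", "Job Hazard Analysis",
                   {"A1: two-person maintenance crew"}),
]

# Suppose test and evaluation shows the crew-size assumption no longer holds.
to_review = impacted_by("A1: two-person maintenance crew", results)
print(to_review)
```

Without the assumptions recorded against each result, the same question ("what do we need to look at again?") can only be answered by re-reading the whole analysis.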

Reporting (T206) #2

And then hazard analysis results. Again, the leading particulars of the results should be recorded in the hazard tracking system, the HTS, or hazard log, or risk register – whatever you want to call it. But there will be more detailed information that we wouldn’t want to clutter up the risk register with, and we also need to provide warnings, cautions, and procedures to be included in maintenance manuals, training courses, operator manuals, etc. So, we’re probably going to generate an awful lot of data out of this task and that needs to be provided in a suitable format. Again, whoever is the program manager on the client side, or the end-user representative, needs to think about this stuff quite early on.

Contracting #1

That leads us neatly on to contracting. Now, this task, in theory, can be specified a little bit down the track, after the program has started. In practice, what you find is that program managers try to specify everything up front in a single contract for various reasons.

There are good reasons for doing that sometimes. Also, there are bad reasons but I’m not going to talk about them in this session. We’ll have a talk about planning your system safety program in another session. There are a lot of nuances in there to be considered.

Just sticking to this task, identification of functional disciplines – who do we need to get involved in order to do this work properly? It’s likely that the safety team, if you have one, may not have relevant operating experience or relevant sustainment experience for this kind of system. If they do, that’s fantastic, but that doesn’t negate the requirement to get the end-user represented and involved. In fact, that’s a near legal requirement in Australia, for example, and in some other jurisdictions. We need to get the end-users involved. We need the discipline specialists to get involved. Typically, your integrated logistic support team, your reliability people, your maintainability, and your testability people, if you have those disciplines. Or maybe you’re calling them something else, it doesn’t really matter.

We need to know what the reporting requirements are. What, if any, analysis methods and techniques do we desire to be used? Maybe the client or end-user has got to jump through some regulatory hoops and therefore they need specific analysis work and safety results to be done and produced. If that’s the case, then that needs to be specified in the contract. And what data is to be generated, and in what format? And how is it to be reported, and when, considering the hazard tracking system, etc.? And then the client may also select or specify known hazards, known hazardous areas, or other specific items to be examined or excluded because maybe it’s being covered elsewhere or we don’t expect the contractor to be able to do this stuff. Maybe we need to use a specialist organization. Again, maybe a regulator has directed us to do so. So, all of these things need to be thought about when we’re putting together the contract requirements for task 206.

Contracting #2

Again, I say this every time, we need to include all items within the scope of the system and the environment, not just developmental stuff. In fact, these days, maybe the majority of programs that I am seeing are mostly non-developmental. So, we’re taking lots of COTS stuff, GFE components, and putting it all together. That’s all going to be included, particularly integration.

We need to think about legacy and related processes and the hazard analysis associated with them if we can get them. They should be supplied to whoever is doing the work and an analyst should be directed to review them and include lessons learned.

Then, reinforcing the previous point about the hazard tracking system: how will information reported in this task be correlated with tasks and analyses that are being done maybe elsewhere or by different teams? And the example here is 207, health hazard analysis. I’ll talk a little bit about the linkages between the two later. But it’s quite likely in this sort of area there will be large groups of people thinking about operations and maintenance and support. Very often those groups are very different. Sometimes they don’t even talk to each other. That’s the culture in different organizations. You don’t see airline pilots hanging around with baggage handlers very much, do you, down the pub for whatever reason? Different sets of people – they don’t always mix very much. And again, you may also have different specialist disciplines, especially the Human Factors people. Again, you’ve got to tie everything in there. So, there are going to be lots of interfaces in this kind of task that have got to be managed.

Point I – concept of operations. Yes, that’s in every task. You’ve got to understand what we intend to do with this system or what the end-user intends to do with the system in order to have some context for the analysis.

And then finally, what risk definitions and what risk matrix are we using? If we’re not using the standard 882 matrix, then what are we doing?
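For reference, here is how a risk matrix can be written down as a simple lookup. This is a sketch of the standard 882E matrix as I understand it from Tables I to III of the standard (four severity categories, six probability levels), so do check the cell values against the standard itself, or against your program’s tailored matrix, before relying on them. The function name `risk_level` is my own and not something the standard defines.

```python
# Severity categories and probability levels, labelled as in Mil-Std-882E.
SEVERITY = {1: "Catastrophic", 2: "Critical", 3: "Marginal", 4: "Negligible"}
PROBABILITY = {"A": "Frequent", "B": "Probable", "C": "Occasional",
               "D": "Remote", "E": "Improbable", "F": "Eliminated"}

# Rows are probability levels; columns are severity categories 1 to 4.
# Assumption: values transcribed from the standard matrix, to be verified.
_MATRIX = {
    "A": ("High", "High", "Serious", "Medium"),
    "B": ("High", "High", "Serious", "Medium"),
    "C": ("High", "Serious", "Medium", "Low"),
    "D": ("Serious", "Medium", "Medium", "Low"),
    "E": ("Medium", "Medium", "Medium", "Low"),
}


def risk_level(severity: int, probability: str) -> str:
    """Look up the risk level for a severity category and probability level."""
    if severity not in SEVERITY:
        raise ValueError(f"unknown severity category: {severity!r}")
    if probability not in PROBABILITY:
        raise ValueError(f"unknown probability level: {probability!r}")
    if probability == "F":
        # An eliminated hazard has no residual risk, whatever the severity.
        return "Eliminated"
    return _MATRIX[probability][severity - 1]


print(risk_level(1, "D"))  # Catastrophic severity, Remote probability
```

If the program uses a tailored matrix instead, only the `_MATRIX` table would need to change, which is one reason to agree and declare the matrix up front.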

Commentary #1

I’ve got four slides of commentary now – a number of things to say about Task 206.

Now, I’ve picked an Australian example. So, Task 206 ties in very neatly with Australian WHS requirements. I suspect Australian WHS requirements have been strongly influenced by American OSHA and system safety practices. In Australia, we are heavily influenced by the US approach. The legal requirements in Australia, and in many other states and territories let’s be honest, do tie in nicely with this standard. Although not always perfectly, you’ve got to remember that. So, we do need to focus on operations and support activities. That’s a big part of WHS, thinking about all relevant activities and cradle to grave – the whole life of the system. We need to think about the working environment, the workplace. We need to think about humans as an integral part of the system, be they operators or maintainers, suppliers, other kinds of sustainers. And we need to be providing relevant information on hazards, risks, warnings, training, and procedures, and requirements for PPE, and so on and so forth to workers.

So, task 206 is going to be absolutely vital to achieving WHS compliance in Australia and compliance with health and safety legislation and regulations in many parts of the world. In the US and UK and I would say in virtually all developed nations. So, this is a very important task for achieving compliance with the law and regulations. It needs to get the requisite amount of attention, and it doesn’t always. So often on a program, during procurement, acquisition, and development, the technical system is the sexy thing. That’s the thing that gets all the attention, especially early on. The operating and particularly the support side tends to get neglected because it’s not so sexy. We don’t buy a system to support it after all, do we? We buy a system to do a job. So, we get the operators in and we get their input on how to optimize the system to do the job most cost-effectively and with the most mission effectiveness that we can get out of it. We don’t often think about support effectiveness. But to achieve WHS compliance or the equivalent this is a very important task so we will almost always need to do it.

Commentary #2

The second item to think about – what is going to be key for the maintenance support side is a technique called Job Safety Analysis or Job Hazard Analysis. I’ve highlighted a couple of sources of information there; in particular, I would recommend going to the American www.OSHA.gov site and the guidance that they provide on how to do a job hazard analysis. So, use that, or if something different is specified in the jurisdiction you’re working in, then go ahead and use that. But if you don’t have any [guidance] on what to do, this will help you.

This is all about – I’ve got a task to do, whatever it might be, how do I do it? Let’s analyse this step-by-step, or at least in reasonable size chunks, thinking about how we do the tasks that need to be done. Now, there’s the operator side, and then, of course, we’re always dealing with human beings working on the system or working with the system. So, we’re going to be seeing potentially a lot of Human Factors type techniques being relevant. And there are lots of techniques that we can think about; Hierarchical Task Analysis and that kind of approach is going to fit in with the Job Hazard Analysis as well. Those are going to link together quite well. There will also be things like workload analysis. Particularly for the operators, if we’re asking the operator to do a lot and to maintain a particular level of concentration or respond rapidly, we need to think about workload, and both too much workload and too little workload can make things worse.
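As a rough illustration of that step-by-step idea, here is a minimal Python sketch of a job hazard analysis worksheet: the job is broken into steps, each step lists its hazards, and each hazard lists its controls. The structure and names (`JobStep`, `Hazard`, `uncontrolled`) are my own assumptions for illustration, not a format prescribed by OSHA or by Task 206, and the example job is invented.

```python
from dataclasses import dataclass, field


@dataclass
class Hazard:
    description: str
    controls: list[str] = field(default_factory=list)


@dataclass
class JobStep:
    number: int
    action: str
    hazards: list[Hazard] = field(default_factory=list)


def uncontrolled(steps: list[JobStep]) -> list[tuple[int, str]]:
    """Return (step number, hazard) pairs that have no control recorded yet."""
    return [(s.number, h.description)
            for s in steps
            for h in s.hazards
            if not h.controls]


# Hypothetical maintenance job, broken into reasonable-size chunks.
job = [
    JobStep(1, "Isolate electrical power",
            [Hazard("Electric shock", ["Lock-out/tag-out procedure"])]),
    JobStep(2, "Remove access panel",
            [Hazard("Panel falls on foot")]),
    JobStep(3, "Replace filter",
            [Hazard("Skin contact with hydraulic fluid", ["Gloves (PPE)"]),
             Hazard("Awkward posture")]),
]

gaps = uncontrolled(job)
print(gaps)
```

The list of gaps is where the analysis effort goes next: deciding whether a design change, an engineered feature, a warning, a procedure, or training is the right control for each one.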

There are lots of techniques out there, I’m not going to talk about Human Factors here. I’m going to be putting on a series on Human Factors techniques in cooperation with a specialist in that area. So, I’m not going to say more here.

For certain kinds of operators, let’s say, pilots, people navigating a ship and so on, drivers, there will be well-established ways in which those operators are trained and in which they have to operate. There will often be a legal framework and a regulatory framework that says how they have to operate. And then that may direct a particular kind of analysis to be done or a particular approach to be taken for how operators do their jobs. But equally, there is a vast range of operator roles in industry, in chemical plants. Various specialist operating roles where there’s an industry-specific approach to doing things. Or indeed the general approach may be left up to whoever is developing the system. So, there’s a huge range of approaches here that are going to be largely dictated by the concept of operations and also an awareness of what is relevant law, regulation, and good practice in a particular industry, in a particular situation. That’s where doing your Task 203, your System Requirements Hazard Analysis, really kicks in. It’s a very broad subject we’re covering here. You’ve got to get the specialists in to do it well.

Commentary #3

Now, I mentioned that these days we’re seeing more and more legacy and COTS systems being used and repurposed. Partly to save time and money. We’re not developing mega systems as often as we used to, particularly in defence, but also in many other walks of life as well. So, we may find ourselves evaluating a system where very little technical hazard analysis has been done because there are no developmental items, and it’s even difficult to do analysis on a legacy or COTS system because we cannot get the data to do so. Perhaps we can’t get the data for commercial reasons, contractual reasons.

Or maybe we’ve got a legacy system that was developed in a different jurisdiction and whatever information is available with it just doesn’t fit the jurisdictional regulatory system that we’ve got to work in where we want to operate the system. This is very common. Australia, for example, [acquires] a lot of systems from abroad, which have not been developed in line with how we normally do things.

We could in theory just do Task 206 if there was no developmental hazard analysis to do, but that’s not quite true. At a minimum, we will always need to do a Preliminary Hazard List and a Preliminary Hazard Analysis – that’s Tasks 201 and 202 respectively. And we will very definitely need to do some System Requirements Hazard Analysis, Task 203, to understand what we need to do for a particular system in a particular application, operating environment, and regulatory jurisdiction. So, we’re always going to have to do those and we may well have to look at the integration of COTS things and do some system-level analysis. That’s Task 205, System Hazard Analysis. We’re definitely going to need to do the early analyses. In fact, the client and the end-user representatives should be doing 201, 202 and 203 and then we may be in a position to finish things off with 206 for certain systems.

Commentary #4

Now, having said that, I’ve mentioned already that Task 206 can be very broad in scope and very wide-ranging. There’s a danger that we will turn Task 206 into a bottomless pit into which we pour money and effort and time without end. So, for most systems, we cannot afford to just do O&SHA across the board without any discernment or any prioritization.

So, we need to look at those other hazard analyses and prioritize those areas where people could get hurt. Particularly we should be using legacy and historical data here to say “What, in reality, does hurt people when looking after these systems or operating these systems?” Again, as I’ve said before, in many industries there is a standard industry approach or good practice to how certain systems are operated, and maintained, and supported. So, if there is a standard industry approach available – particularly if we can justify that by available historical data – if that [is as good] as doing analysis, then why not just use the standard approach? It’s going to be easier to make an SFARP or an ALARP argument that way anyway. And why spend the money on analysis when we don’t have to? We could just spend the money on actually making the system safer. So, let’s not do analysis for the sake of doing analysis.
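One simple way to use historical data for that prioritization is to rank activities by how often they have actually hurt people per unit of exposure. The Python sketch below shows the idea under some loud assumptions: the function name `rank_by_injury_rate` is hypothetical, the measure (injuries per 1,000 exposure hours) is just one plausible choice that ignores severity, and the figures are made up for illustration, not real data.

```python
def rank_by_injury_rate(history: dict[str, tuple[int, int]]) -> list[tuple[str, float]]:
    """Rank activities by recorded injuries per 1,000 exposure hours, highest first.

    history maps an activity name to (recorded injuries, exposure hours).
    Activities with no recorded exposure are skipped rather than guessed at.
    """
    rates = {
        activity: injuries * 1000 / hours
        for activity, (injuries, hours) in history.items()
        if hours > 0
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Invented legacy-system figures, purely for illustration.
history = {
    "Manual handling of spares": (12, 4000),
    "Refuelling": (2, 2000),
    "Software updates": (0, 1500),
    "Working at height": (5, 1000),
}

ranking = rank_by_injury_rate(history)
for activity, rate in ranking:
    print(f"{activity}: {rate:.1f} injuries per 1,000 hours")
```

A ranking like this only tells us where to look first. A rare activity with catastrophic potential would still need attention even if it scores low on frequency.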

Also, there’s a strong synergy between the later tasks in the 200 series. There’s a strong linkage between this Task 206 and 207, which is Health Hazard Analysis. Also, there can be a strong linkage with Task 210, which is the Environmental Hazard Analysis. So, this trio of tasks focuses on the impact on living things, whether they be human beings or animals and plants and ecosystems, and very often there’s a lot of overlap between them. For example, hazardous chemicals that are dangerous for humans are often dangerous for animals and plants and watercourses and so on and so forth. I’ll be talking about that more in the next session on Task 207.

One word of warning, however. Certainly, in Australia, we have got fixated on hazardous chemicals because we’ve had some very high-profile scandals involving HAZCHEM in the past. Now, there’s nothing wrong, of course, with learning from experience and applying rigorous standards when we know things have gone wrong in the past. But sometimes we go into a mindset of analysis for analysis’ sake. Dare I say, to cover people’s backsides rather than to do something useful. So, we need to focus on whether the presence of a HAZCHEM could be a problem. Whether people get exposed to it, not just that it’s there.

Certain chemicals may be quite benign in certain circumstances, and they only become dangerous after an emergency, for example. There are lots of things in the system that are perfectly safe until the system catches fire. Then when you’re trying to dispose of or repair a fire-damaged system, that can be very dangerous, for example. So, we need to be sensible about how we go about these things. Anyway, more on that in the next session.

Copyright Statement

That’s the commentary that I have on Task 206. As we said, it links very tightly with other things and we will talk about those in later sessions. I’d just like to point out that the “italic text in quotations” is from the Mil. standard. That is copyright free as most American government standards are. However, this presentation and my commentary, etc. are copyright of the Safety Artisan 2020.

For More …

Now, for all lessons and resources, please do visit www.safetyartisan.com. As you’ll notice, it’s an https – it’s a secure website.

End: Operating & Support Hazard Analysis

So, that is the end of the lesson and it just remains for me to say thank you very much for your time and for listening. And I look forward to seeing you again soon. Cheers.