Functional Safety

The following is a short, but excellent, introduction to the topic of ‘Functional Safety’ by the United Kingdom Health and Safety Executive (UK HSE). It is equally applicable outside the UK, and the British Standards (‘BS EN’) are versions of international ISO/IEC standards – the Australian version (‘AS/NZS’) is often identical to the British standard.

My comments and explanations are shown [thus].

[Functional Safety]

“Functional safety is the part of the overall safety of plant and equipment that depends on the correct functioning of safety-related systems and other risk reduction measures such as safety instrumented systems (SIS), alarm systems and basic process control systems (BPCS).

[Functional Safety is popular, in fact almost ubiquitous, in the process industry, where large amounts of flammable liquids and gases are handled. That said, the systems and techniques developed by and for the process industry have been so successful that they are found in many other industrial, transport and defence applications.]

SIS [Safety Instrumented Systems]

SIS are instrumented systems that provide a significant level of risk reduction against accident hazards.  They typically consist of sensors and logic functions that detect a dangerous condition and final elements, such as valves, that are manipulated to achieve a safe state.

The general benchmark of good practice is BS EN 61508, Functional safety of electrical/electronic/programmable electronic safety related systems. BS EN 61508 has been used as the basis for application-specific standards such as:

  • BS EN 61511: process industry
  • BS EN 62061: machinery
  • BS EN 61513: nuclear power plants

BS EN 61511, Functional safety – Safety instrumented systems for the process industry sector, is the benchmark standard for the management of functional safety in the process industries. It defines the safety lifecycle and describes how functional safety should be managed throughout that lifecycle. It sets out many engineering and management requirements, however, the key principles of the safety lifecycle are to:

  • use hazard and risk assessment to identify requirements for risk reduction
  • allocate risk reduction to SIS or to other risk reduction measures (including instrumented systems providing safety functions of low / undefined safety integrity)
  • specify the required function, integrity and other requirements of the SIS
  • design and implement the SIS to satisfy the safety requirements specification
  • install, commission and validate the SIS
  • operate, maintain and periodically proof-test the SIS
  • manage modifications to the SIS
  • decommission the SIS

BS EN 61511 also defines requirements for management processes (plan, assess, verify, monitor and audit) and for the competence of people and organisations engaged in functional safety.  An important management process is Functional Safety Assessment (FSA) which is used to make a judgement as to the functional safety and safety integrity achieved by the safety instrumented system.

Alarm Systems

Alarm systems are instrumented systems designed to notify an operator that a process is moving out of its normal operating envelope to allow them to take corrective action.  Where these systems reduce the risk of accidents, they need to be designed to good practice requirements considering both the E,C&I design and human factors issues to ensure they provide the necessary risk reduction.

In certain limited cases, alarm systems may provide significant accident risk reduction, where they also might be considered as a SIS. The general benchmark of good practice for management of alarm systems is BS EN 62682.

BPCS [Basic Process Control Systems]

BPCS are instrumented systems that provide the normal, everyday control of the process.  They typically consist of field instrumentation such as sensors and control elements like valves which are connected to a control system, interfaced and could be operated by a plant operator.  A control system may consist of simple electronic devices like relays or complicated programmable systems like DCS (Distributed Control System) or PLCs (Programmable Logic Controllers).

BPCS are normally designed for flexible and complex operation and to maximise production rather than to prevent accidents.  However, it is often their failure that can lead to accidents and therefore they should be designed to good practice requirements. The general benchmark of good practice for instrumentation in process control systems is BS 6739.”

[To be honest, I would have put this the other way around. The BPCS came first, although they were just called ‘control systems’, and some had alarms to get the operators’ attention. As the complexity of these control systems increased, cascading alarms became a problem and alarms had to be managed as a ‘thing’. Finally, when the control system/alarm system combo became inadequate, the process industry added further systems, and thus the terms SIS and BPCS were born.]

[It’s worth noting that for very rapid processes where a human either cannot exercise control fast enough or lacks the data to do so reliably enough, the SIS becomes an automatic protection system, as found in rail signalling systems, or ‘autonomous’ vehicles. Also, for domains where there is no ‘fail-safe’ state, for example in aircraft flight control systems, the tendency has been to engineer multiple, redundant, high-integrity control systems, rather than use a BPCS/SIS combo.]
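As a rough illustration of how the three layers described above relate to each other, here is a minimal Python sketch. It assumes a single tank-level reading and two set points; the names `ALARM_LEVEL`, `TRIP_LEVEL` and `respond_to_level` are hypothetical and do not come from the HSE text or any standard. A real SIS would normally use its own sensors and logic, separate from the BPCS, so folding all three layers into one function is a simplification for teaching only.

```python
from enum import Enum


class Action(Enum):
    NORMAL_CONTROL = "BPCS keeps controlling the process"
    ALARM = "alarm system notifies the operator"
    TRIP = "SIS drives the process to a safe state"


# Hypothetical set points for a tank level, in percent full.
ALARM_LEVEL = 80.0
TRIP_LEVEL = 95.0


def respond_to_level(level_percent: float) -> Action:
    """Return which layer acts for a given tank level reading."""
    if level_percent >= TRIP_LEVEL:
        return Action.TRIP
    if level_percent >= ALARM_LEVEL:
        return Action.ALARM
    return Action.NORMAL_CONTROL


for reading in (50.0, 85.0, 97.0):
    print(reading, respond_to_level(reading).name)
```

The ordering of the checks reflects the idea that the SIS is the last line of defence: it acts only once the normal control and the operator's response to the alarm have failed to keep the process inside its envelope.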

Copyright

The above text is reproduced under Creative Commons Licence from the UK HSE’s webpage. The Safety Artisan complies with such licensing conditions in full – for details see here.

[Functional Safety – END]

Preliminary Hazard Identification & Analysis, PHIA Guide

Get your free Preliminary Hazard Identification & Analysis, PHIA Guide here!

Introduction

Hazard Identification is sometimes defined as: “The process of identifying and listing the hazards and accidents associated with a system.”

Hazard Analysis is sometimes defined as: “The process of describing in detail the hazards and accidents associated with a system and defining accident sequences.”

Preliminary Hazard Identification and Analysis (PHIA) helps you determine the scope of safety activities and requirements. You can identify the main hazards likely to arise from the capability and functionality being provided. Perform it as early as possible in the project life cycle. Thus, you will provide important early input to setting Safety requirements and refining the Project Safety Plan.

PHIA seeks to answer, at an early stage of the project, the question: “What Hazards and Accidents might affect this system and how could they happen?”

Aim

The aim of the PHIA is to identify, as early as possible, the main Hazards and Accidents that may arise during the life of the system. It provides input to:

  1. Scoping the subsequent Safety activities required in any Safety Plan. A successful PHIA will help to gauge the effort likely to be required to produce an effective Safety Case, proportionate to the risks.
  2. Selecting or eliminating options for subsequent assessment.
  3. Setting the initial Safety requirements and criteria.
  4. Subsequent Hazard Analyses.
  5. Initiating the Hazard Log.

Description

Perform a PHIA as early as possible in order to obtain maximum benefit. Use it to understand what the Hazards and Accidents are, why, and how they might be realized. A PHIA is an important part of Risk Management, project planning, and requirements definition. It helps you to identify the main system hazards and helps target where a more thorough analysis should be undertaken.

Usually, PHIA is based on a structured brainstorming exercise, supported by hazard checklists. A structured approach helps to minimize the possibility of missing an important hazard. It also demonstrates that a thorough and comprehensive approach has been applied.
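One way to picture the structured, checklist-driven approach is as a grid: every system function is considered against every checklist prompt, so no combination is skipped. The Python sketch below assumes a made-up system and a made-up checklist; `HazardLogEntry`, `brainstorm_agenda` and the example entries are hypothetical names for illustration, not part of any PHIA standard.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class HazardLogEntry:
    function: str          # the system function being examined
    prompt: str            # the checklist prompt applied to it
    description: str = ""  # filled in by the workshop team
    status: str = "open"


# Hypothetical system functions and hazard checklist prompts.
functions = ["store fuel", "start engine"]
checklist = ["fire", "toxic release", "electric shock"]


def brainstorm_agenda(functions, checklist):
    """Pair every function with every prompt so that nothing is skipped."""
    return [HazardLogEntry(f, p) for f, p in product(functions, checklist)]


agenda = brainstorm_agenda(functions, checklist)
for entry in agenda:
    print(f"{entry.function}: could this lead to {entry.prompt}?")
```

The resulting list can seed the initial Hazard Log: entries judged not credible are closed with a reason, which also helps to show that the approach was thorough.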

My CISSP Exam Journey

Here is a video about my CISSP exam journey.

I’ve just passed the Certified Information Systems Security Professional (CISSP) Exam, which was significantly updated on 1st May 2021. In this 30-minute video I will cover:

  • The official CISSP course and course guide;
  • The 8 Domains of CISSP, and how to take stock of your knowledge of them;
  • The official practice questions and the Study Guide;
  • The CISSP Exam itself; and
  • Lessons learned from my journey.

I wish you every success in your CISSP journey: it’s tough, but you can do it!

To get a full course on what’s new in all eight Domains of the CISSP Exam outline Click Here.

Transcript: My CISSP Exam Journey

The transcript will be added when available…

To get regular updates from The Safety Artisan, Click Here. For more introductory lessons then Start Here.

Risk Management 101

Welcome to Risk Management 101, where we’re going to go through these basic concepts of risk management. We’re going to break it down into the constituent parts and then we’re going to build it up again and show you how it’s done. I’ve been involved in risk management, in project risk management, safety risk management, etc., for a long, long time.  I hope that I can put my experience to good use, helping you in whatever you want to do with this information.

Maybe you’re getting an interview. Maybe you want to learn some basics and decide whether you want to know more about risk management or not.  Whatever it might be, I think you’ll find this short session really useful. I hope you enjoy it and thanks for watching.

Risk Management 101, Topics

  • Hazard Identification;
  • Hazard Analysis;
  • Risk Estimation;
  • Risk [and ALARP] Evaluation;
  • Risk Reduction; and
  • Risk Acceptance.

Risk Management 101, Transcript

Introduction

Hi everyone and welcome to Risk Management 101. We’re going to go through these basic concepts of risk management. We’re going to break it down into the constituent parts. Then we’re going to build it up again and show you how it’s done.

My name is Simon Di Nucci and I have a lot of experience working in risk management, project risk management, safety risk management, etc.  I’m hoping that I can put my experience to good use, helping you in whatever you want to do with this information. Whether you’re going for an interview or you want to learn some basics. You can watch this video and decide if you want to know more about risk management or you don’t need to.  Whatever it might be, you’ll find this short session useful. I hope you enjoy it and thanks for watching.

Topics For This Session

Risk Management 101. So what does it all mean? We’re going to break risk management down into six constituent parts. I’m using a particular standard that breaks it down this way. Other standards will do this in different ways. We’ll talk about that later. Here we’ve got risk management broken down into: hazard identification, hazard analysis, risk estimation, risk evaluation (and ALARP), risk reduction, and risk acceptance.

Risk Management

Let’s get right on to that. Risk management – what is it? It’s defined as “the systematic application of management policies, procedures and practices to the tasks of hazard identification, hazard analysis, risk estimation, risk and ALARP evaluation, risk reduction, and risk acceptance”.

There are a couple of things to note here. We’re talking about management policies, procedures and practices. The ‘how’ we do it. Whether it’s a high-level policy or low-level common practice. E.g. how things are done in our organisation vs how the day-to-day tasks are done. And it’s also worth saying that when we talk about ‘hazards’, that’s a safety ‘ism’. If we were doing security risk management, we would be talking about ‘threats’. We could also talk about ‘causes’ in day-to-day language. So, we can be talking about something causing a risk or leading to a risk. More on that later, but that’s an overview of what risk management is.

Part 1

Let’s look at it in a different way. For those of you who like a visual representation, here is a graph of the hierarchical breakdown. The six parts need to happen in order, more-or-less, left to right. And as you can see, there’s a link between risk evaluation and risk reduction. We’ll come on to that. So, it’s not a menu of options to choose from; it’s a series of steps: ‘this is what you have to do’. Sometimes they’re linked together more intimately.

Hazard Identification

First of all, hazard identification. So, this is the process where we identify and list hazards and accidents associated with the system. You may notice that some words here are in bold. Where a word is in bold, we are going to give the definition of what it is later.

These hazards could lead to an accident, but we only consider those associated with the system. That’s the scope. If we were talking about a system that was an aeroplane, or a ship, or a computer, we would have a very different scope in each case. The ways that accidents might happen would also be different.

On a more practical level, how do we do hazard identification? I’m not going to go into any depth here, but there are certain classic approaches. We can consult with our workers and inspect the workplace where they’re operating. And in some countries, that’s a legal requirement (including in Australia, where I live). Another option is we can look at historical data. And indeed, in some countries and in some industries, that’s a requirement. A requirement means we have to do that. And we can use special analysis techniques. Now, I’m not going to talk about any of those analysis techniques today. You can watch some other sessions on The Safety Artisan to see that.

Hazard Analysis

Having done hazard identification, we’ve asked ourselves ‘What could go wrong?’. We can put some more detail on and ask, ‘How could it go wrong? And how often?’. That kind of stuff. So, we want to go into more detail about the hazards and accidents associated with this particular system. And that will help us to define some accident sequences. We can start with something that creates a hazard and then the hazard may lead to an accident. And that’s what we’re talking about. We will show that using graphics later, which will be helpful.

But again, more on terminology. In different industries, we call it different things. We tend to say ‘accident’ in the UK and Australia. In the U.S., they might call it a ‘mishap’, which is trying to get away from the idea that something was accidental. Nobody meant it to happen. Mishap is a more generic term that avoids that implication. We also talk about ‘losses’ or we talk about ‘breaches’ in the security world. We have some issue where somebody has been able to get in somewhere that they should not. And we can talk about accident sequences. Or, in a more common language, we call it a sequence of events. That’s all it is.

Risk Estimation

Now we’re talking about the risk estimation. We’ve thought about our hazards and accidents and how they might progress from one to another. Let’s think about, ‘How big is the risk of this actually happening?’. Again, we’ll unpack this further later at the next level. But for now, we’re going to talk about the systematic use of available information. Systematic- so, ordered. We’re following a process. This isn’t somebody on their own taking a subjective view ‘Look, I think it’s not that’. It’s a process that is repeatable. We want to do something systematic. It’s thorough, it’s repeatable, and so it’s defendable. We can justify the conclusions that we’ve come to because we’ve done it with some rigour. We’ve done it in a systematic way. That’s important. Particularly if we’re talking about harm coming to people or big losses.

Risk and ALARP Evaluation

Now, risk evaluation is just taking the risk that we estimated just now, comparing it to something, and saying, “How serious is this risk?”. Is it something that is very low? If it’s insignificant then we’re not bothered about it. We can live with it. We can accept it. Or is it bigger than that? Do we need to do something more about it? Again, we want to be systematic. We want to determine whether risk reduction is necessary. Is this acceptable as it is or is it too high and we need to reduce it? That’s the core of risk evaluation.

This is a UK-based standard, but the terminology we’re using is found in different forms around the world. In the UK, they talk about ‘tolerability’. We’re talking about the absolute level of risk. There probably is an upper limit that’s allowed in the law or in our industry. And there’s a lower limit that we’re aiming for. In an ideal world, we’d like all our risks to be low-level risks. That would be terrific.

So, that’s ‘tolerability’. And you might hear it called different things. And then within the UK system, there are three classes of ‘tolerability’ of risk. We could say it’s ‘broadly acceptable’: it’s very low. It’s down in the target region where we’d like to get all our risks. Or it’s ‘tolerable’: we can expose people to this risk or we can live with this risk, but only if we’ve met certain other criteria. And then there’s the risk that is so big, so far up there, that we can’t have it under any circumstances. It’s ‘unacceptable’. You can imagine a traffic light system where we have categorised our risk.

And then there’s the test of whether our risk can be accepted in the UK. It’s called ALARP. We reduce the risk As Low As Reasonably Practicable. And in other places, you’ll see SFARP. We’ve eliminated or minimised the risk So Far As Is Reasonably Practicable. In the nuclear industry, they talk about ALARA: As Low As Reasonably Achievable. And then different laws use different tests. Whichever one you use, there’s a test that we have got to use to say, “Can we accept the risk?” “Have we done enough risk reduction?”. And whatever you’ve put in those square brackets, that’s the test that you’re using. And that will vary from jurisdiction to jurisdiction. The basic concept of risk evaluation is estimating the level of risk. Then compare it to some standard or some regulation. Whatever one it might be, that’s what we do. That’s risk evaluation.
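To make the traffic-light idea concrete, here is a minimal Python sketch of the three tolerability classes and the acceptance test. The two limits are hypothetical numbers on a made-up risk scale; real limits come from the law, the regulator, or the industry concerned. The flag `test_met` stands for whichever test applies in your jurisdiction (ALARP, SFARP, ALARA), and the function names are invented for this illustration.

```python
from enum import Enum


class Tolerability(Enum):
    BROADLY_ACCEPTABLE = "green"
    TOLERABLE = "amber"
    UNACCEPTABLE = "red"


# Hypothetical limits on a made-up risk scale (not taken from any regulation).
LOWER_LIMIT = 1e-6
UPPER_LIMIT = 1e-3


def classify(risk: float) -> Tolerability:
    """Place an estimated risk into one of the three tolerability classes."""
    if risk > UPPER_LIMIT:
        return Tolerability.UNACCEPTABLE
    if risk > LOWER_LIMIT:
        return Tolerability.TOLERABLE
    return Tolerability.BROADLY_ACCEPTABLE


def can_accept(risk: float, test_met: bool) -> bool:
    """A tolerable risk can be accepted only if the applicable test is met."""
    band = classify(risk)
    if band is Tolerability.UNACCEPTABLE:
        return False
    if band is Tolerability.TOLERABLE:
        return test_met
    return True


print(classify(1e-4).name, can_accept(1e-4, test_met=False))
```

This follows the description in the talk, where only the ‘tolerable’ class is conditional on meeting further criteria; some jurisdictions may apply the test more widely, so treat the logic as a sketch rather than a legal rule.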

Risk Reduction

We’ve asked, “Do we need to reduce risk further?”. And if we do, we need to do some risk reduction. Again, we’re being systematic. This is not some subjective thing where we go “I have done some stuff, it’ll be alright. That’s enough.”. We’re being a bit more rigorous than that. We’ve got a systematic process for reducing risk. And in many parts of the world, we’re directed to do things in a certain way.

This is an illustration from an Australian regulation. In this regulation, we’re aiming to eliminate risk. We want to start with the most effective risk reduction measures. Elimination is “We’ve reduced the risk to zero”. That would be lovely if we could do that but we can’t always do that.

What’s the next level? We could get rid of this risk by substituting something less risky. Imagine we’ve got a combustion engine powering something. The combustion engine needs flammable fuel and it produces toxic fumes. It could release carbon monoxide and CO2 and other things that we don’t want. We ask, “Can we get rid of that?”. Could we have an electric motor instead and have a battery instead? That might be a lot safer than the combustion engine. That is a substitution. There are still risks with electricity. But by doing this we’ve substituted something risky for something less risky.

Or we could isolate the hazard. Let’s use the combustion engine as an example again. We can say, “I’ll put the engine, the fuel and the exhaust somewhere a long way from people. Then it’ll be a long way from where it can do harm or cause a loss.” And that’s another way of dealing with it.

Or we could say, “I’m going to reduce the risks through engineering controls”. We could put in something engineered. For example, we can put in a smoke detector. A very simple, therefore highly reliable, device. It’s certainly more reliable than a human. You can install one that can detect some noxious gases. It’s also good if it’s a carbon monoxide detector. Humans cannot detect carbon monoxide at all. (Except if you’ve got carbon monoxide poisoning, you’ll know about it. Carbon monoxide poisoning gives you terrible headaches and other symptoms.) But of course, that’s not a good way to detect that you’re breathing in poisonous gas. We do not want to do it that way.

So, we can have an engineering control to protect people. Or we can have an interlock. We can isolate things in a building or behind a wall or whatever. And if somebody opens the door, then that forces the thing to cut out so it’s no longer dangerous. There are different engineering controls that we can introduce. They do not rely on people. They work regardless of what any person does.

Next on the list, we could reduce exposure to the hazard by using administrative controls. That’s giving somebody some rules or a procedure to follow. “Do this. Don’t do that.” Now, that’s all good. We can give people warning signs and warn people not to approach something. But, of course, sometimes people break the rules for good reasons. Maybe they don’t understand. Maybe they don’t know the danger. Maybe they’ve got to do something or maybe the procedure that we’ve given them doesn’t work very well. It’s too difficult to get the job done, so people cut corners. So, procedural protection can be weak. And a bit hit and miss sometimes.

And then finally, we can give people personal protective equipment. We can give them some eye protection. I’m wearing glasses because I’m short-sighted. But you can get some goggles to protect your eyes from damage. Damage like splashes, flying fragments, sparks, etc. We can have a hard hat so that if we’re on a building site and something drops from above on us that protects the old brain box. It won’t stop the accident from happening, but it will help reduce the severity of the accident. That’s the least effective. We’re doing nothing to prevent the accident from happening. We’re reducing the severity in certain circumstances. For example, if you drop a ton of bricks on me, it doesn’t matter whether I’m wearing a hard hat or not. I’m still going to get crushed. But with one brick, I should be able to survive that if I’m wearing a hard hat.
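The order in which these control types were presented can be captured in a small Python sketch that ranks proposed controls from most to least effective. The enum values simply follow the sequence used in this talk; the regulation itself may group some of these types at the same level. The names `Control`, `rank_controls` and the example proposals are hypothetical.

```python
from enum import IntEnum


class Control(IntEnum):
    """Control types, numbered in the order presented (1 = most effective)."""
    ELIMINATION = 1
    SUBSTITUTION = 2
    ISOLATION = 3
    ENGINEERING = 4
    ADMINISTRATIVE = 5
    PPE = 6


def rank_controls(proposals):
    """Sort (description, Control) pairs so the most effective type is first."""
    return sorted(proposals, key=lambda proposal: proposal[1])


# Hypothetical proposals, based on the examples in the talk.
proposals = [
    ("hard hat", Control.PPE),
    ("warning sign", Control.ADMINISTRATIVE),
    ("carbon monoxide detector", Control.ENGINEERING),
    ("replace engine with electric motor", Control.SUBSTITUTION),
]

for description, control in rank_controls(proposals):
    print(control.name, "-", description)
```

Ranking proposals like this is only a prompt for discussion: the point is to consider the more effective types first, not to discard the lower ones, which are often used in combination.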

Risk Acceptance

Let’s move on to risk acceptance. At some stage, we have reduced the risk to a point where we can accept it. We can live with it and we’ve decided that we’re going to need to do whatever it is that is exposing us to the risk. We need to use the system. We want to get in our car to enable us to go from A to B quickly and independently. So, we’re going to accept the risk of driving in our car. We’ve decided we’re going to do that. We make risk acceptance decisions every day, often without thinking about it. We get in a car every day on average and we don’t worry about the risk, but it’s always there. We’ve just decided to accept it.

But in this example we’ve got, it’s not an individual deciding to do something on the spur of the moment. Nor is it based on personal experience. We’ve got a systematic process where a bunch of people come together. The relevant stakeholders agree that a risk has been assessed or has been estimated and has been evaluated. They agree that the risk reduction is good enough and that we will accept that risk. There’s a bit more to it than you and I saying, “That’ll be alright.”

Part 2

Let’s summarise where we’ve got to. We’ve talked about these six components of risk management. That’s terrific. And as you can see, they all go together. Risk evaluation and risk reduction are more tightly coupled. That’s because when we do some risk reduction, we then re-evaluate the risk. We ask ‘Can we accept it?’. If the answer is ‘No.’ we need to do some more work. Then we do some more risk reduction. So those tend to be a bit more coupled together at the end. That’s the level we’ve got to. We’re now going to go to the next level.
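The coupling between risk reduction and risk evaluation is essentially a loop: reduce, re-evaluate, and repeat until the risk can be accepted or the options run out. Here is a minimal Python sketch of that loop. The measures and their reduction factors are invented numbers for illustration only, as is the function name `reduce_until_acceptable`.

```python
def reduce_until_acceptable(risk, measures, acceptable_level):
    """Apply measures one at a time, re-evaluating the risk before each one.

    Returns the final risk, the names of the measures applied, and whether
    the final risk is at or below the acceptable level.
    """
    applied = []
    for name, factor in measures:
        if risk <= acceptable_level:   # re-evaluate: can we accept it yet?
            break
        risk *= factor                 # do some more risk reduction
        applied.append(name)
    return risk, applied, risk <= acceptable_level


# Hypothetical measures, each with a made-up risk reduction factor.
measures = [("interlock", 0.1), ("gas detector", 0.5), ("warning sign", 0.9)]

final_risk, applied, accepted = reduce_until_acceptable(1.0, measures, 0.06)
print(applied, accepted)
```

If the loop ends with the risk still above the acceptable level, the answer to ‘Can we accept it?’ remains ‘No’, and more work is needed.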

So, we’re going to explain these things. We’ve talked about hazard identification and hazard analysis, but what is a hazard? And what is an accident? And what is an accident sequence? We’re going to unpack that a bit more. We’re going to take it to the next level. And throughout this, we’re talking about risk over and over again. Well, what is ‘risk’? We’re going to unpack that to the next level as well. It all comes down to this anyway. This is a safety standard. We’re talking about harm to people. How likely is that harm and how severe might it be? But it might be something else. It might be a loss or a security breach. It might be a financial loss. It might be a negative result for our project. We might find ourselves running late. Or we’re running over budget. Or we’re failing to meet quality requirements. Or we’re failing to deliver the full functionality that we said we would. Whatever it might be.

Hazard

So, let’s unpack this at the next level. A hazard is a term that we use, particularly in safety. As I say, we call it other things in different realms. But in the safety world, it’s a physical situation or it’s a state of a system. And as it says, it often follows from some initiating event which we may call a ‘cause’. And the hazard may lead to an accident. And the key thing to remember is once a hazard exists, an accident is possible, but it’s not certain. You can imagine the sort of cartoon banana skin on the pavement gag. Well, the banana skin is the hazard. In the cartoon, the cartoon character always steps on the banana skin. They always fall over for comic effect. But in the real world, nobody may tread on the banana skin and slip over. There could be nobody there to slip on the banana skin. Or even if somebody does, they could catch themselves. Or they fall, but it’s on a soft surface and they don’t hurt themselves so there’s no harm.

So, the accident isn’t certain. And in fact, we can have what we call ‘non-accident’ outcomes. We can have harmless consequences. A hazard is an important midway step. I heard it called an accident waiting to happen, which is a helpful definition. An accident waiting to happen, but it doesn’t mean that the accident is inevitable.

Accident

But the accident can happen. Again, the ‘accident’, ‘mishap’, or ‘unintended event’. Something we did not want or a sequence of events that causes harm. And in this case, we’re talking about harm to people. And as I say, it might be a security breach. It might be a financial loss. It might be reputational damage. Something might happen that is very embarrassing for an organisation or an individual. Or again, we could have a hiccup with our project.

Harm

But in this case, we’re talking about harm. And in this kind of standard, we’re using what you might call a body count approach to the harm. We’re talking about actual death, physical injury, or damage to the health of people. This standard also considers the damage to property and the environment. Now, very often we are legally required to protect people and the environment from harm. Property less so. But there will be financial implications of losses of property or damage to the systems. We don’t want that. But it’s not always criminally illegal to do that. Whereas usually, hurting people and damaging the environment is. So, this is ‘harm’. We do not want this thing to happen. We do not want this impact. Safety is a much tougher business in this instance. If we have a problem with our project, it’s embarrassing but we could recover it. It’s more difficult to do that when we hurt somebody.

Risk

And always in these terms, we’re talking about ‘risk’. What is ‘risk’? Risk is a combination of two things. It’s a combination of the likelihood of harm or loss and the severity of that harm or loss. It’s those two things together. And we’ve got a very simple illustration here, a little table. And they’re often known as a risk matrix, but don’t worry about that too much. Whatever you want to call it. We’ve got a little two by two table here and we’ve got likelihood in the white text and severity in the black. We can imagine where there’s a risk where we have a low likelihood of a ‘low harm’ or a ‘low impact’ accident or outcome. We say, ‘That’s unlikely to happen and even if it does not much is going to happen.’ It’s going to be a very small impact. So, we’d say that that’s a low risk.

Then at the other end of the spectrum, we can imagine something that has a high likelihood of happening. And that outcome also has a high impact. These are things that we definitely do not want to happen. And we say, ‘That’s a high risk and that’s something that we are very, very concerned about.’

And then in the middle, we could have a combination of an outcome that is quite likely, but it’s of low severity. Or it’s of high severity, but it’s unlikely to happen. And we say, ‘That’s a medium risk’.

Now, this is a very simplified matrix for teaching purposes only. In the real world, you will see matrices that are four by four, or five by five, or even six by six, or combinations thereof. And in security, where they talk about threat and vulnerability and the outcomes, you might see multiple matrices used. They use multiple matrices to progressively build up a picture of the risk. They use matrices as building blocks. So, there may be more than one matrix used when you’ve got something more complex to model. But here we’ve got a nice, simple example. This illustrates what risk is. It’s a combination of severity and likelihood of harm or loss. And that’s what risk is, fundamentally. And if we have a firm grasp of these fundamentals, it’ll help us to reason and deal with almost anything. With enough application.
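The simple two by two matrix described above can be written as a lookup table. This Python sketch uses only the four combinations from the talk; the names `RISK_MATRIX` and `risk_level` are hypothetical, and a real matrix would normally have more rows, more columns, and defined criteria for each category.

```python
# Keys are (likelihood, severity); values are the resulting risk level.
RISK_MATRIX = {
    ("low", "low"): "low",
    ("low", "high"): "medium",
    ("high", "low"): "medium",
    ("high", "high"): "high",
}


def risk_level(likelihood: str, severity: str) -> str:
    """Look up the risk level for a likelihood and severity combination."""
    return RISK_MATRIX[(likelihood, severity)]


print(risk_level("low", "high"))
```

Because risk is a combination of the two inputs, neither likelihood nor severity alone is enough to look up a result: both are needed to form the key.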

Accident Sequence

Now, let’s move on and talk about accident sequences. We’re talking about a progression in this case. We’re imagining a left-to-right path. A progression of events that results in an accident. This diagram, which looks like a bow tie, is meant to represent the idea that we can have one hazard. There might be many causes that lead to this hazard. There might be many different things that could create the hazard or initiate the hazard. And the hazard may have many different consequences.

As I’ve said before, nothing at all may happen. That might be the consequence of the hazard. Most of the time that’s what’s going to happen. But there may be a variety of consequences. Somebody might get a minor injury or there might be a more serious accident where one or more people are killed. A good example of this is fire. So, the hazard is the fire. The causes might be various. We could be dealing with flammable chemicals, or a lightning strike, or an electrical arc flash. Or we could be dealing with very high temperatures where things spontaneously burst into flames. Or we could have a chemical in the presence of pure oxygen. Some things will spontaneously burst into flames in the presence of pure oxygen. So there’re a variety of causes that lead to the fire.

And the fire might be very small and burn itself out. It causes very little damage and nobody gets hurt. Or it might lead to a much bigger fire that, in theory, could kill lots of people. So, there’s a huge range of consequences potentially from one hazard. But the accident sequence is how we would describe and capture this progression. From initiating events to the hazard to the possible consequences. And by modelling the accident sequence, of course, we can think about how we could interrupt it.
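The bow-tie idea, many causes leading to one hazard and one hazard leading to many consequences, maps naturally onto a small data structure. The Python sketch below uses the fire example from the talk; the class name `BowTie` and its `sequences` method are hypothetical, and the lists are deliberately shortened.

```python
from dataclasses import dataclass, field


@dataclass
class BowTie:
    hazard: str
    causes: list = field(default_factory=list)
    consequences: list = field(default_factory=list)

    def sequences(self):
        """List every cause -> hazard -> consequence path through the bow tie."""
        return [
            (cause, self.hazard, consequence)
            for cause in self.causes
            for consequence in self.consequences
        ]


fire = BowTie(
    hazard="fire",
    causes=["flammable chemicals", "lightning strike", "electrical arc flash"],
    consequences=["burns itself out, no harm", "minor injury", "multiple fatalities"],
)

for cause, hazard, consequence in fire.sequences():
    print(f"{cause} -> {hazard} -> {consequence}")
```

Listing the sequences this way shows where they could be interrupted: a control on the left-hand side removes a cause, while a control on the right-hand side limits a consequence.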

Part 3

We’ve broken risk management down into those six constituent parts. We’ve gone to the next level, in that we’ve gone down to the concepts that underpin these things: the hazards, the accidents, and the accident sequence. We’ve talked about risk itself and what we don’t want to happen. The harm, the loss, the financial loss, the embarrassment, the failed or late or over-budget project, a security breach, the undesired event, etc. We had an objective which was to do something safely or to complete a project and the risk is that that won’t happen. That there’ll be an impact on what we were trying to do that is negative. That is undesirable.

There are just a few more concepts that we need to look at to complete the pattern, as you can see. We’ve been talking about the system. And we’ve been talking about doing things systematically. And then a system works in an operating environment. So, let’s unpack that.

System

First of all, we have a system. The system is going to be a combination of things. I wouldn’t call a pen or a pencil a system. It’s only got a couple of components. You could pull it apart. But it’s too simple to be worth calling it a system. We wouldn’t call it a pen system, would we? So, a system is something more complex. It’s a combination of things and we need to define the boundary. I’ll come back to that.

But within this boundary, we’ve got some different elements in the system that work together. Or they’re used together within a defined operating environment. So, we’re going to expose this system to a range of conditions which it is designed to work in. The intention is the system is going to do whatever it does to perform a given task. It can do one defined task or achieve a specific purpose. I talked before about getting in our car. A car is complex enough to be called a system. We get in our car and we drive it on the roads. Or if we’ve got a four-wheel drive, we can drive off-road. Or we can use it in a more demanding operating environment to achieve a specific purpose. We want to transport ourselves, and sometimes some stuff, from A to B. That’s what we’re trying to do with the system.

And within that system, we may have personnel/people, we may have procedures. A bunch of rules about how you drive a car legally in different countries. We’ve got materials and physical things – what the car is made of. We could have tools to repair it, change wheels. We’ve got some other equipment, like a satnav. We’ve got facilities. We need to take a car somewhere to fill up with fuel or to recharge it. We’ve got services like garages, repairs, servicing, etc. And there could be some software in there as well. Of course, these days in the car, there’s software everywhere in most complex devices.

So, our system is a combination of lots of different things. These things are working together to achieve some kind of goal or some kind of result. There’s somewhere we want to get to. And it’s designed to work in a particular operating environment. Cars work on roads really well. Off-road cars can work on tracks. Put them in deep water, they tend not to work so well. So, let’s talk about that operating environment.

Operating Environment

What we’ve got here is the total set of all external conditions, natural and induced, which a system is exposed to at any given moment. (That’s external to the system, so outside the boundary.) So, these conditions might be natural or they might be generated by something else. And we need to get a good understanding of the system, the operating environment, and what we want it to do.

If we have a good understanding of those three things, then we will be well on the way to being able to understand the risks associated with that system. That’s one of the key things with risk management. If you’ve got those three things, that’s crucial. You will not be able to do effective risk management if you don’t have a grasp of those things. And if you do have a thorough grasp of those things, it’s going to help you do effective risk management.

Conclusion

So, we’ve talked about risk management. We’ve broken it down into some big sections. Those six sections; the hazard identification; analysis; risk estimation; evaluation; reduction; and acceptance. We’ve seen how those things depend on only a few concepts. We’ve got the concepts of ‘hazards’, ‘risks’, and ‘accidents’. As well as the undesirable consequences that the risk might result in. And the risk is measured based on the likelihood and severity of that harm or that loss occurring.

And when we’re dealing with a more complex system, we need to understand that system and the environment in which it operates. And of course, we’ve put it in that environment for a purpose. And that unpacking has allowed us to break down quite a big concept, risk management. A lot of people, like myself, spend years and years learning how to do this. It takes time to gain experience because it’s a complex thing. But if we break it down, we can understand what we’re doing. We can work our way down the fundamentals. And then if we’ve got a good grasp of the fundamentals, that supports getting the more complex stuff right. So, that’s what risk management is all about. That’s your risk management 101 and I hope that you find that helpful.

Copyright Statement

I just need to say briefly that I can use those quotations from the standard under a Creative Commons licence. The CC4.0. That allows me to do that within limits that I am careful to observe. But this video presentation is copyright the Safety Artisan.

For More…

And you can see more like these at the Safety Artisan website. That’s www.safetyartisan.com. And as you can see, it’s a secure site so you can visit without fear of a security breach. So, do head over there. Subscribe to the monthly newsletter to get discounts on paid videos and regular updates of what’s coming up, both paid and free.

So, it just remains for me to say thanks very much for watching and I look forward to catching up with you again very soon.

End of Risk Management 101

This session can also be found at Udemy.com along with more advanced courses like this one. For more introductory sessions on this site start here.

SSRAP Module 1 – Risk Basics

Learn the risk basics with The Safety Artisan.

So what is this risk analysis stuff all about? What is ‘risk’? How do you define or describe it? How do you measure it?

In this free session, I explain the basic terms and show how they link together, and how we can break them down to perform risk analysis. I understand risk and that allows me to explain it in simple terms. I’ve used all my 20+ years in the business to help me unpack the jargon and focus on what’s really important.  

You Will Learn to:

  • Describe fundamental risk concepts.
Recap: Risk Basics

Topics: Risk Basics

  • Risk & Mishap;
  • Probability & Severity;
  • Hazard & Causal Factor;
  • Mishap (accident) sequence; and
  • Hazards: Tests & Example

Transcript: Risk Basics

Let’s get started with Module One. We’re going to recap on some Risk basics to make sure that we have a common understanding of risk. And that’s important because risk analysis is something that we do every day. Every time you cross the road. Every time you buy something expensive. Every time you decide whether you’re going to travel to something, or look it up online instead. You’re making risk analysis decisions all the time without even realizing it. But we need something a little bit more formal than the instinctive thinking about risk that we do all the time. And to help us do that, we need a couple of definitions to get us started.

What is Risk?

First of all, what is Risk? It’s a combination of two things. First, the severity of a mishap or accident. Second, the probability that that mishap will occur. So it’s a combination of severity and probability. We will see that illustrated in the next slide.

We’ll begin by talking about ‘mishap’. Well, what is a mishap? A mishap is an event – or a series of events – resulting in unintentional harm. This harm could be death, injury, occupational illness, damage to or loss of equipment or property, or damage to the environment.

The particular standard we’re looking at today is covering a range of different harms. That’s why we’re focused on safety. And the term ‘mishap’ will also include negative environmental impacts from planned events. So, even if the cause is a deliberate event, we will include that as a mishap.

Probability and Severity

I said that the definition of risk was a combination of probability and severity. Here we’ve got a little illustration of that.

Probability is: how likely is this thing to go wrong? How likely is this thing to happen?

And severity is: how significant is this event? This can vary in seriousness. From death to injury, illness, property damage or equipment loss, damage to the environment, or monetary loss.

And to be honest, we can apply or define risk any way we want. It doesn’t have to be a Safety Risk. We could be thinking about Financial Risk, Reputational Risk, whatever it might be. But what you see there with the little matrix is we measure risk. And whether we say the risk is high, medium or low, or whatever scheme we use. A combination of high severity and high likelihood is going to result in high risk. At the opposite end of the scale, the low probability that a low impact event is going to happen, we would call a low risk.

That’s what we mean by this combination of probability and severity. We put them together and we can measure risk in, to be honest, whatever way we choose to do so. This is a very simple example.
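As a rough sketch of what such a little matrix does, here is one way to combine the two in Python. The three-point scale, the scoring by position, and the cut-offs are all assumptions chosen for this example. As the text says, you can measure risk in whatever way you choose, so treat the name `risk_level` and its banding as hypothetical.

```python
# Hypothetical ordered scale used for both probability and severity.
SCALE = ("low", "medium", "high")


def risk_level(probability, severity):
    """Combine probability and severity into a single risk level.

    Each input is scored by its position on SCALE (0, 1 or 2) and the two
    scores are added. The banding of the total is an illustrative choice.
    """
    score = SCALE.index(probability) + SCALE.index(severity)
    if score >= 3:
        return "high"
    if score == 2:
        return "medium"
    return "low"


if __name__ == "__main__":
    for probability in SCALE:
        for severity in SCALE:
            print(probability, severity, "->", risk_level(probability, severity))
```

High probability with high severity comes out as high risk and low with low comes out as low risk, which matches the two corners described above. Everything in between depends on the scheme you pick.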

Safety Risks: Hazards

In safety, we have another concept. One that gives us a much finer degree of control over how we’re thinking about risk. We have this concept of hazards. As it says, a hazard is a real or potential condition that could lead to a mishap. It’s not the mishap, it’s a sort of an intermediate stage, as we will see. And the mishap can result in death, injury, property damage, damage to the environment.

Then there’s also this thing called causal factors, or causes. It might be one or several mechanisms that could trigger the hazard. Or they could lead to the hazard, which in turn can lead to a mishap. So the causal factor or the cause can trigger the hazard and then the hazard can lead to the mishap.

(Mishap) Accident Sequence

Here we have an illustration of an accident sequence or a mishap sequence if you prefer. Let’s not get hung up on terminology. So, we may have many causal factors on the left-hand side of this bow tie diagram. Any one of these factors may lead to a particular hazard. A single hazard we’re looking at here. And then that hazard may lead to a range of different consequences.

Not all these consequences are going to be bad. Not all the consequences are going to result in a mishap. There may be lots of consequences where there is no mishap, no accident, no harm whatsoever. There’s going to be a range of possible consequences. What I would like you to take away from this diagram is one thought. That thought is ‘Yes, we can have causes leading to a hazard’ – this sort of pinch point in the middle. And from that hazard a number of consequences can arise.

Now that thought is important. It’s a very powerful concept because it helps us to reason about accident sequences. Also, it helps us to do some much more sophisticated work than would otherwise be possible.

Tests for a Hazard

There are three tests, that I know of, for a hazard. The first two are saying the same thing in different ways. We can think of a hazard as being both necessary and enough for harm to occur. We need the hazard to be present before harm can occur, but the hazard is enough for harm to occur. In other words, once the hazard is present, nothing else unusual needs to happen for harm to occur. Once the hazard is there, nothing else needs to go wrong for somebody to get hurt. Normal events can lead to a mishap once the hazard is present. Another helpful way of thinking about it is ‘hazard is an accident waiting to happen’.

Then the third on this list, we can think of a hazard as the point at which we lose control of something. It might be an energy source that we lose control of. It might be something toxic. It might be a physical piece of equipment that we’ve lost control of or a vehicle. It might be a substance. Whatever it might be, we’ve lost control and now somebody could get hurt.

So, those are some tests for a hazard and some different ways of thinking about hazards.

Example of a Hazard

But I always think it’s helpful to have an example. Let’s imagine we’ve got a causal factor. We’ve got some oil that is leaking from its container.

And we can imagine the hazard. The oil has got onto a walkway. Or pavement or a sidewalk or whatever you want to call it. It gets on to an area that human beings would walk on, as the name implies. That’s normal. So once the oil is on the walkway, nothing else unusual needs to happen for there to be an accident. But it doesn’t make the accident inevitable. Because if nobody comes along, there can be no accident. Or somebody might come along, but they see the oil and they step over it and avoid it. Or even better, they warn other people about it and clear it up – but that’s another story. So the accident, the mishap, is not inevitable.

One of the combinations that is possible is that we get a mishap. A person comes along, doesn’t see the oil, steps on it, slips, and hurts themselves. All these things have to happen in a sequence in this accident sequence for the mishap to occur. For people to get hurt. So there we have a little summary of those risk concepts that we need to get a hold of.
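The oil example can be played out as a small enumeration. In this Python sketch the step names in `STEPS` and the function `outcomes` are my own labels, not anything from the standard. It also makes a simplifying assumption: each step is treated as an independent yes/no, even though in reality the later steps are moot if nobody walks by.

```python
from itertools import product

# Things that all have to happen, in sequence, for the slip to occur.
# The wording of each step is made up for this example.
STEPS = (
    "person walks by",
    "does not see the oil",
    "steps on it",
    "slips and is hurt",
)


def outcomes():
    """Enumerate every yes/no combination of the steps and flag the mishap."""
    results = []
    for combo in product((True, False), repeat=len(STEPS)):
        mishap = all(combo)  # the mishap needs every step in the sequence
        results.append((combo, mishap))
    return results


if __name__ == "__main__":
    all_outcomes = outcomes()
    mishaps = [combo for combo, mishap in all_outcomes if mishap]
    print(f"{len(mishaps)} of {len(all_outcomes)} combinations end in a mishap")
```

Only the combination where every step happens ends in a mishap, which is the point of the example: the hazard is present in all of them, but the accident is not inevitable.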

Summary of Module

We’ve covered risk and mishap, probability and severity, hazards, and causal factors. We’ve looked at the mishap or accident sequence, looked at hazards, and at some tests for what makes up a hazard. That includes how we tell where the hazard sits in the sequence of cause, hazard, and consequence. We looked at an example of this in the module.

From this module, we have a common understanding of risk. This will form the foundation for everything that we’re going to do with risk from now on.

This is Module 1 of SSRAP

This is Module 1 from the System Safety Risk Assessment Program (SSRAP) Course. Risk Analysis Programs – Design a System Safety Program for any system in any application. You can access the full course here.

You can find more introductory lessons at Start Here.

Introduction to WHS Codes of Practice

In this 30-minute session, we introduce Australian WHS Codes of Practice (CoP). We cover: What they are and how to use them; their Limitations; we List (Federal) codes; provide Further commentary; and Where to get more information. This session is a useful prerequisite to all the other sessions on CoP.

Codes of Practice: Topics

  • What they are and how to use them;
  • Limitations;
  • List of CoP (Federal);
  • Further commentary; and
  • Where to get more information.

Codes of Practice: Transcript

Hello and welcome to the Safety Artisan, where you will find professional, pragmatic, and impartial teaching and resources on all things safety. I’m Simon and today is the 16th of August 2020. Welcome to the show.

Introduction

So, today we’re going to be talking about Codes of Practice. In fact, we’re going to be introducing Codes of Practice and the whole concept of what they are and what they do.

Topics for this Session

What we’re going to cover is what Codes of Practice are and how to use them – several slides on that; a brief word on their limitations; a list of federal codes of practice – and I’ll explain why I’m emphasizing it’s the list of federal ones; some further commentary and where to get more information. So, all useful stuff I hope.

CoP are Guidance

So, Codes of Practice come in the work health and safety hierarchy below the Act and regulations. So, at the top you’ve got the WHS Act, then you’ve got the WHS regulations, which the Act calls up. And then you’ve got the Codes of Practice, which the Act also calls up. We’ll see that in a moment. And what Codes of Practice do is provide practical guidance on how to achieve the standards of work health and safety required under the WHS Act and regulations, and some effective ways to identify and manage risks. So, they’re guidance but, as we’ll see in a moment, they’re much more than guidance. So, as I said, the Codes of Practice are called up by the Act and they’re approved and signed off by the relevant minister. So, they are a legislative instrument.

Now, a quick footnote. These words, by the way, are in the introduction to every Code of Practice. There’s a little note here that says we’re required to consider all risks associated with work, not just for those risks that have associated codes of practice. So, we can’t hide behind that. We’ve got to think about everything. There are codes of practice for several things, but not everything. Not by a long way.

…Guidance We Should Follow

Now, there are three reasons why Codes of Practice are a bit more than just guidance. So, first of all, they are admissible in court proceedings. Secondly, they are evidence of what is known about a hazard, risk, risk assessment, risk control. And thirdly, courts may rely, or regulators may rely, on Codes of Practice to determine what is reasonably practicable in the circumstances to which the code applies. So, what’s the significance of that?

So first of all, the issue about being admissible. If you’re unfortunate enough to go to court and be accused of failing under WHS law, then you will be able to appeal to a Code of Practice in your defence and say, “I complied with the Code of Practice”. They are admissible in court proceedings. However, beyond that, all bets are off. It’s the court that decides what is an admissible defence, and that means lawyers decide, not engineers. Now, given that you’re in court and the incident has already happened, a lot of the engineering stuff that we do about predicting the probability of things is no longer relevant. The accident has happened. Somebody has got hurt. All these probability arguments are dust in the wake of the accident. So, Codes of Practice are a reliable defence.

Secondly, the bit about evidence of what is known is significant, because when we’re talking about what is reasonably practicable, the definition of reasonably practicable in Section 18 of the WHS Act talks about what was known, or what reasonably should have been known, when people were anticipating the risk and managing it. Now, given that Codes of Practice were published back in 2012, there’s no excuse for not having read them. So, they’re pre-existing, they’re clearly relevant, the law has said that they’re admissible in court. We should have read them, and we should have acted upon them. And there’ll be no wriggling out of that. So, if we haven’t done something that a CoP guided us to do, we’re going to look very vulnerable in court. Or in whatever court of judgment we’re up against, whether it be public opinion or trial by media or whatever it is.

And thirdly, some CoP can be used to help determine what is SFARP. So in some circumstances, if you’re dealing with a risk that’s described in a CoP and that CoP is applicable, then if you followed everything in the CoP, you might be able to claim that just doing that means that you’ve managed the risk SFARP. Why is that important? Because the only way we are legally allowed to expose people to risk is if we have eliminated or minimized that risk so far as is reasonably practicable, SFARP. That is the key test, the acid test, of “Have we met our risk management obligations?” And CoP are useful, maybe crucial, in two different ways for determining what is SFARP. So yes, they’re guidance but it’s guidance that we ignore at our peril.

Standards & Good Practice

So, moving on. Codes of Practice recognize, and I reemphasize this is in the introduction to every code of practice, they’re not the only way of doing things. There isn’t a CoP for everything under the sun. So, codes recognize that you can achieve compliance with WHS obligations by using another method as long as it provides an equivalent or higher standard of work, health and safety than the code. It’s important to recognize that Codes of Practice are basic. They apply to every business and undertaking in Australia potentially. So, if you’re doing something more sophisticated, then probably CoP on their own are not enough. They’re not good enough.

And in my day job as a consultant, that’s the kind of stuff we do. We do planes, trains and automobiles. We do ships and submarines. We do nuclear. We do infrastructure. We do all kinds of complex stuff for which there are standards and recognized good practice which go way beyond the requirements of basic Codes of Practice. And many I would say, probably most, technical and industry safety standards and practices are more demanding than Codes of Practice. So, if you’re following an industry or technical standard that says “Here’s a risk management process”, then it’s likely that that will be far more detailed than the requirements that are in Codes of Practice.

And just a little note for those of us who love numbers and quantitative safety analysis. What this statement about equivalent or higher standards of health and safety is talking about is requirements: we want requirements that are more demanding and more rigorous or more detailed than CoP. It’s not saying that the end result, in terms of the predicted probability of something happening, is better than what you would get with CoP, because nobody knows what you would get with CoP. That calculation hasn’t been done. So, don’t go down the rabbit hole of thinking “I’ve got to quantitatively demonstrate that what we’re doing is better than CoP.” You haven’t. It’s all about demonstrating that the input requirements are more demanding, rather than the output, because that’s never been done for CoP. So, you’ve got no benchmark to measure against in output terms.

The primacy of WHS & Regulations

A quick point to note: Codes of Practice are only guidance. They do refer to the relevant WHS Act and regulations, the hard obligations, and we should not be relying solely on codes in place of what it says in the WHS Act or the regulations. So, we need to remember that codes are not a substitute for the Act or the regs. Rather they are a useful introduction. The WHS Act and regulations are actually surprisingly clear and easy to read. But even so, there are 600 regulations. There are hundreds of sections of the WHS Act. It’s a big read and not all of it is going to be relevant to every business, by a long way. So, if you see a CoP that clearly applies to something that you’re doing, start with the CoP. It will lead you into the relevant parts of the WHS Act and regulations. If you don’t know them, have a read around in there. You’ve been given the pointer in the CoP, so follow it up.

But also, CoP do represent a minimum level of knowledge that you should have. Again, start with CoP, don’t stop with them. So, go on a bit. Look at the authoritative information in the act and the regs and then see if there’s anything else that you need to do or need to consider. The CoP will get you started.

And then finally, it’s a reference for determining SFARP. You won’t see anything other than the definition of reasonably practicable in the Act. You won’t see any practical guidance in the Act or the regulations on how to achieve SFARP. Whereas CoP does give you a narrative that you can follow and understand and maybe even paraphrase if you need to in some safety documentation. So, they are useful for that. There’s also guidance on reasonably practicable, but we’ll come to that at the end.

Detailed Requirements

It’s worth mentioning that there are some detailed requirements in codes. Now, when I did this, I think I was looking at the risk management Code of Practice, which we’ll go through later in another session. But in this example, there are this many requirements. So, every CoP has the statement “The words ‘must’, ‘requires’, or ‘mandatory’ indicate a legal requirement exists that must be complied with.” So, if you see ‘must’, ‘requires’, or ‘mandatory’, you’ve got to do it. And in this example CoP that I was looking at, there are 35 ‘must’s, 39 ‘required’ or ‘requirement’ – that kind of wording – and three instances of ‘mandatory’. Now, bear in mind that the sentence that introduces those things contains two instances of ‘must’ and one each of ‘requires’ and ‘mandatory’. So, straight away you can ignore those four instances. But clearly, there are lots of instances here of ‘must’ and ‘require’ and a couple of ‘mandatory’.

Then we’ve got the word ‘should’ is used in this code to indicate a recommended course of action, while ‘may’ is used to indicate an optional course of action. So, the way I would suggest interpreting that and this is just my personal opinion – I have never seen any good guidance on this. If it says ‘recommended’, then personally I would do it unless I can justify there’s a good reason for not doing it. And if it said ‘optional’, then I would consider it. But I might discard it if I felt it wasn’t helpful or I felt there was a better way to do it. So, that would be my personal interpretation of how to approach those words. So, ‘recommended’ – do it unless you can justify not doing it. ‘Optional’ – Consider it, but you don’t have to do it.

And in this particular one, we’ve got 43 instances of ‘should’ and 82 of ‘may’. So, there’s a lot of detailed information in each CoP to consider. So, read them carefully and comply with them where you have to, and that will repay you. So, a positive way to look at it: CoP are there to help you. They’re there to make life easy for you. Read them, follow them. The negative way to look at them is, “I don’t need to do all that it says in the CoP because it’s only guidance”. You can have that attitude if you want. If you’re in the dock or in the witness box in court, that’s not going to be a good look. Let’s move on.
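If you want to run the same kind of word count on a document yourself, a few lines of Python will do it. This is a minimal sketch, not the method used for the figures quoted above. The names `KEYWORDS`, `SAMPLE` and `count_keywords` are hypothetical, the sample sentence is invented, and the labels simply restate the personal interpretation given in the talk. It counts exact whole words only, so variants such as ‘required’ or ‘requirement’ would need adding, and it will also count ‘May’ the month.

```python
import re
from collections import Counter

# Keyword -> how the talk suggests treating it (labels are shorthand).
KEYWORDS = {
    "must": "legal requirement",
    "requires": "legal requirement",
    "mandatory": "legal requirement",
    "should": "recommended: do it unless you can justify not doing it",
    "may": "optional: consider it",
}

# Invented sample text, standing in for the body of a Code of Practice.
SAMPLE = (
    "You must assess the risk. You should consult workers. "
    "You may use a checklist. A duty holder must keep records."
)


def count_keywords(text):
    """Count whole-word, case-insensitive occurrences of each keyword."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word in KEYWORDS)
    return {keyword: counts.get(keyword, 0) for keyword in KEYWORDS}


if __name__ == "__main__":
    for keyword, total in count_keywords(SAMPLE).items():
        print(f"{keyword}: {total} ({KEYWORDS[keyword]})")
```

A count like this only tells you how much there is to read. It does not tell you which obligations apply to you, so it is a prompt to read the code carefully, not a substitute for doing so.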

Limitations of CoP

So, I’ve talked CoP up quite a lot; as you can tell, I’m a fan because I like anything that helps us do the job, but they do have limitations. I’ve said before that there’s a limited number of them and they’re pretty basic. First of all, it’s worth noting that there are two really generic Codes of Practice. First of all, there’s the one on risk management. And then secondly, there’s the one on consultation, cooperation and coordination. And I’ll be doing sessions on both of those. Now, those apply to pretty much everything we do in the safety world. So, it’s essential that you read them no matter what you’re doing and comply with them where you have to.

Then there are other codes of practice that apply to specific activities or hazards, and some of them are very, very specific, like getting rid of asbestos, or welding, or spray painting, or shot blasting, or whatever it might be. Those have clearly got a very narrow focus. So, you will know if you’re doing that stuff. So, if you are doing welding, then clearly you need to read the welding CoP. If welding isn’t part of your business or undertaking, you can forget it.

However, overall, there are fewer than 25 Codes of Practice. I can’t be more precise for reasons that we will come to in a moment. So, there’s a relatively small number of CoP and they don’t cover complex things. They’re not going to help you design a super-duper widget or some software or anything like that. They’re not going to help you do anything complicated. Also, Codes of Practice tend to focus on the workplace, which is understandable. They’re not much help when it comes to design trade-offs. They’re great for the sort of foundational stuff. Yes, we have to do all of this stuff regardless. But then you get to questions of “How much is enough?” Sometimes in safety, we say, “How much margin do I need?” “How many layers of protection do I need?” “Have I done enough?” CoP aren’t going to be a lot of use helping you with that kind of determination, but you do need to make sure you’ve done everything in the CoP first and then start thinking about those trade-offs, would be my advice. You’re less likely to go wrong that way. So, start with your firm basis of what you have to do to comply and then think “What else could I do?”

List of CoP (Federal) #1

Now for information, you’ve got three slides here where we’ve got a list of the Codes of Practice that apply at the federal or Commonwealth level of government in Australia. So, at the top, highlighted, are the ones I’ve already mentioned: the ‘how to manage WHS risks’ code and the consultation, cooperation, and coordination code. Then we get into stuff like abrasive blasting, confined spaces, construction and demolition and excavation, first aid. So, quite a range of stuff covered.

List of CoP (Federal) #2

Hazardous manual tasks – so basically human beings carrying and moving stuff. Managing and controlling asbestos, and removing it. Then we’ve got a couple on hazardous chemicals on this page, electrical risks, managing noise, preventing hearing loss, and stevedoring. There you go. So, if you’re into stevedoring, then this CoP is for you. The highlighted ones we’re going to cover in later sessions.

List of CoP (Federal) #3

Then we’ve got managing risks of Plant in the workplace. There was going to be a Code of Practice for the design of Plant, but that never saw the light of day so we’ve only got guidance on that. We’ve got falls, and the work environment and facilities. We’ve got another one on hazardous chemicals, this time on safety data sheets, preventing falls in housing – I guess because that’s a very common accident – safe design of structures, spray painting and powder coating, and welding processes. So, that is the list of – I think it’s 24 – Codes of Practice that are applied by Comcare, the federal regulator.

Commentary #1

Now, I’m being explicit about which regulator and which set of CoP, because they vary around Australia. Basically, the background was the model Codes of Practice were developed by Safe Work Australia, which is a national body. But those model Codes of Practice do not apply. Safe Work Australia is not a regulator. Codes of Practice are implemented or enforced by the federal government and by most states and territories. And it says with variations for a reason. Not all states and territories impose all codes of practice. For example, I live in South Australia and if you go and look at the WorkSafe South Australia website or Safe Work – whatever it’s called – you will see that there’s a couple of CoP that for some reason we don’t enforce in South Australia. Why? I do not know. But you do need to think about these things depending on where you’re operating.

It’s also worth saying that WHS is not implemented in every state in Australia. Western Australia currently have plans to implement WHS, but as of 2020 I don’t believe they’ve done so yet. Hopefully, it’s coming soon. And Victoria, for some unknown reason, have decided they’re just not going to play ball with everybody else. They’ve got no plans to implement WHS that I can find online. They’re still using their old OHS legislation. It’s not a universal picture in Australia, thanks to our rather silly version of government that we have here in Australia – forget I said that. So, if it’s a Commonwealth workplace, we apply the federal version of WHS and Codes of Practice. Otherwise, we use state or territory versions and you need to see the local regulator’s web page to find out what is applied where. And the definition of a Commonwealth workplace is in the WHS Act, but also go and have a look at the Comcare website to see who Comcare police. Because there are some nationalised industries that count as a Commonwealth workplace and it can get a bit messy.

So, sometimes you may have to ask for advice from the regulator but go and see what they say. Don’t rely on what consultants say or what you’ve heard on the grapevine. Go and see what the regulator actually says and make sure it’s the right regulator for where you’re operating.

Commentary #2

What’s to come? I’m going to do a session on the Risk Management Code of Practice, and, associated with that, I’m also going to do a session on the guidance on what is reasonably practicable. Now that’s guidance, it’s not a Code of Practice. But again, it’s been published so we need to be aware of it and it’s also very simple and very helpful. I would strongly recommend looking at that guidance if you’re struggling with SFARP and what it means. It’s very good. I’ll be talking about that soon. Also, I’m going to do a session on tolerability of risk, because you remember when I said “CoP aren’t much good for helping you do trade–offs in design” and that kind of thing? They’re really only good for simple stuff and compliance. Well, what you need to understand to deal with the more sophisticated problems is the concept of tolerability of risk. That’ll help us do those things. So, I’m going to do a session on that.

I’m also going to do a session on consultation, cooperation, and coordination, because, as I said before, that’s universally applicable. If we’re doing anything at a workplace, or with stuff that’s going to a workplace, we need to be aware of what’s in that code. And then I’m also going to do sessions on plant, structures and substances (or hazardous chemicals) because those are the absolute bread and butter of the WHS Act. If you look at the duties of designers, manufacturers, importers, suppliers, and installers, et cetera, you will find requirements on plant, substances and structures all the way through those clauses in the WHS Act. Those three things are key so we’re going to be talking about them.

Now, I mentioned before that there was going to be a Code of Practice on plant design, but it never made it. It’s just guidance. So, we’ll have a look at that if we can as well – Copyright permitting. And then I want to look at electrical risks because I think the electrical risks code is very useful. It’s useful for electrical risks, but it’s also a useful teaching vehicle for designers and manufacturers to understand their obligations, especially if you operate abroad or you’re importing stuff and you want to know, “Well, how do I know that my kit can be safely used in Australia?” So, if you can’t do the things that the electrical risk CoP requires in the workplace because your piece of kit won’t support that, then it’s going to be difficult for your customers to comply. So, probably there’s a hint there that if you want to sell your stuff successfully, here’s what you need to be aware of. And that applies not just to electrical. I think it’s a good vehicle for understanding how CoP can help us with our upstream obligations, even though CoP apply to a workplace. That session will really be about the imaginative use of Codes of Practice in order to help designers and manufacturers, etc.

And then I want to also talk about noise Code of Practice, because noise brings in the concept of exposure standards. Now, generally, Codes of Practice don’t quote many standards. They’re certainly not mandatory, but noise is one of those areas where you have to have standards to say, “this is how we’re going to measure the noise”. This is the exposure standard. So, you’re not allowed to expose people to more than this. That brings in some very important concepts about health monitoring and exposure to certain things. Again, it’ll be useful if you’re managing noise but I think that session will be useful to anybody who wants to understand how exposure standards work and the requirements for monitoring exposure of workers to certain things. Not just noise, but chemicals as well. We will be covering a lot of that in the session(s) on HAZCHEM.

Copyright & Attribution

I just want to mention that everything in quotes/in italics is downloaded from the Federal Register of Legislation, and I’ve gone to the federal legislation because I’m allowed to reproduce it under the license under which it’s published. So, the middle paragraph there – I’m required to point out that I sourced it from the Federal Register of Legislation, the website, on that date. And for the latest information, you should always go to the website to double–check that the version that you’re looking at is still in force and is still relevant. And then for more information on the terms of the license, you can go and see my page at www.SafetyArtisan.com because I go through everything that’s required and you can check for yourself in detail.

For More…

Also, on the website, there’s a lot more lessons and resources, some of them free, some of them you have to pay to access, but they’re all there at www.safetyartisan.com. Also, there’s the Safety Artisan page at www.patreon.com/SafetyArtisan where you will see the paid videos. And also, I’ve got a channel on YouTube where the free videos are all there. So, please go to the Safety Artisan channel on YouTube and subscribe and you will automatically get a notification when a new free video pops up.

End

And that brings me to the end of the presentation, so thanks very much for listening. I’m just going to stop sharing that now. It just remains for me to say thank you very much for tuning in and I look forward to sharing some more useful information on Codes of Practice with you in the next session in about a month’s time. Cheers now, everybody. Goodbye.

There’s more!

You can find the Model WHS Codes of Practice here. Back to the Topics Page.


Lessons Learned from a Fatal Accident

Lessons Learned: in this 30-minute video, we learn lessons from an accident in 2016 that killed four people on the Thunder River Rapids Ride in Queensland. The coroner’s report was issued this year, and we go through the summary of that report. In it we find failings in WHS Duties, Due Diligence, risk management, and failures to eliminate or minimize risks So Far As is Reasonably Practicable (SFARP). We do not ‘name and shame’, rather we focus on where we can find guidance to do better.

In 2016, four people died on the Thunder River Rapids Ride.

Lessons Learned: Key Points

We examine multiple failings in:

  • WHS Duties;
  • WHS Due Diligence;
  • Risk management; and
  • Eliminating or minimizing risks So Far As is Reasonably Practicable (SFARP).

Transcript: Lessons Learned from a Theme Park Tragedy


Introduction

Hello, everyone, and welcome to the Safety Artisan: purveyors of fine safety engineering training videos and other resources. I’m Simon and I’m your host and today we’re going to be doing something slightly different. So, there’re no PowerPoint slides. Instead, I’m going to be reading from a coroner’s report from a well-known accident here in Australia and we’re going to be learning some lessons in the context of WHS (work health and safety) law.

Disclaimer

Now, I’d just like to reassure you before we start that I won’t be mentioning the names of the deceased. I won’t be sharing any images of them. And I’m not even going to mention the firm that owned the theme park because this is not about bashing people when they’re down. It’s about us as a community learning lessons when things go wrong in order to fix the problem, not the blame. So that’s what I’d like to emphasize here.

The Coroner’s Report

So, I’m just turning to the summary of the coroner’s report. Basically, the coroner was examining the deaths of four people back in 2016 on what was called the Thunder River Rapids Ride. Or TRRR or TR3 for short because it’s a bit of a mouthful. This was a water ride, as the name implies, and what went wrong was the water level dropped. Rafts, these circular rafts that went down the rapids, went down the chute, got stuck. Another raft came up behind the stuck raft and went into it. One of the rafts tipped over.

These rafts seat six people in a circular configuration. You may have seen them. They’re in – different versions of this ride are in lots of theme parks.

But out of the six, unfortunately, only two escaped and four people were killed, tragically. So that’s the background. That happened in October 2016, I think it was. The coroner’s report came out a few months ago, and I’ve been wanting to talk about it for some time because it really does illustrate very well a number of issues where WHS can help us do the right thing.

WHS duties

So, first of all, I’m looking at the first paragraph in the summary. The coroner starts off: the design and construction of the TRRR at the conveyor and unload area posed a significant risk to the health and safety of patrons. Notice that the coroner says the design and construction. Most people think that WHS only applies to workplaces and people managing workplaces, but it does a lot more than that. Sections 22 through 26 of the Act talk about the duties of designers, manufacturers, importers, suppliers and then people who commission, install, et cetera.

So, WHS applies duties to a wide range of businesses and undertakings, and designers and constructors are key. Now, it’s worth noting that there was no importer here. Although the TRRR ride was similar to a ride available commercially elsewhere, for some reason the theme park chose to design and build their own version in Queensland. Don’t know why. Anyway, that doesn’t really matter now. So, there was no importer, but otherwise, even if you didn’t design and construct the thing, if you imported it, the same duties still apply to you.

No effective risk assessment

So, the coroner then goes on to talk about risks and hazards and says each of these obvious hazards posed a risk to the safety of patrons on the ride and would have been easily identifiable to a competent person had one ever been commissioned to conduct a risk and hazard assessment of the ride. So, what the coroner is saying there is, “No effective risk assessment has been done”. Now, that is clearly contrary to the risk management code of practice under WHS. Also, of course, the definition of SFARP, so far as is reasonably practicable, basically is a risk assessment or risk management process. So, if you’ve not done effective risk management, you can’t say that you’ve eliminated or minimized risks SFARP, which is another legal requirement. So, a double whammy there.

Then moving on. “Had notice been taken of lessons learned from the preceding incidents, which were all of a very similar nature …” and then he goes on. Basically, that’s the back end of a sentence where he says, you didn’t do this, you had incidents on the ride in the past which were very similar, and you didn’t learn from them. And again, with respect to reducing risks SFARP, Section 18 in the WHS Act, which talks about the definition of reasonably practicable, which is the core of SFARP, talks about what ought to have been known at the time. So, when you’re doing a risk assessment, or maybe you’re reassessing risk after a modification (and this ride was heavily modified several times) or after an incident, you need to take account of the available information. And the owners of TRRR, the operators, clearly didn’t do that. So, another big failing.

The coroner goes on to note that records available with respect to the modifications to the ride are scant and ad hoc. And again, there’s a section in the WHS risk management code of practice about keeping records. It’s not that onerous. I mean, the CoP is pretty simple but they didn’t meet the requirement of the code of practice. So, bad news again.

Due diligence

And then finally, I’ve got to the bottom of page one. So, the coroner then notes the maintenance tasks undertaken on the ride whilst done so regularly and diligently by the staff, seemed to have been based upon historical checklists which were rarely reviewed despite the age of the device or changes to the applicable Australian standards.

Now, this is interesting. So, this is contravening a different section of the WHS Act. Section 27 talks about the duties of officers, and effectively that’s company directors and senior managers. Officers are supposed to exercise due diligence. In the Act, due diligence is fairly simple. It’s six bullet points, but one of them is that the officers have to keep up to date on what’s going on in their operation. They have to provide up to date and effective safety information for their staff. They’re also supposed to keep up with what’s going on in safety regulation that’s applicable to their operation. So, I reckon in that one statement from the coroner there’s probably three breaches of due diligence to start with.

Risk controls lacking

We’ve reached the bottom of page one. Let’s carry on. The coroner then goes on to talk about risk controls that were or were not present and says, “in accordance with the hierarchy of controls, plant and engineering measures should have been considered as solutions to identified hazards”. So in WHS regulations, and it’s repeated in the risk code of practice, there’s a thing called the hierarchy of controls. Basically, it says that some types of risk controls are more effective than others and therefore they come at the top of the list, whereas others are less effective and should be considered last.

So, top of the list is, “Can you eliminate the hazard?” If not, can you substitute the hazardous thing with something else that is less hazardous? Can you put in engineering solutions or controls to control the hazard? And then finally, at the bottom of my list are admin procedures for people to follow and then personal protective equipment for workers, for example. We’ll talk about this more later, but the top end of the hierarchy had just not been considered, or not effectively anyway.
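
The ordering just described can be sketched in a few lines of Python. This is only an illustration, not anything from the coroner’s report or the code of practice: the `ControlType` names, the `rank_controls` helper and the example candidate controls are assumptions made for the sketch, and the levels listed are just the ones mentioned in this talk.

```python
from enum import IntEnum


class ControlType(IntEnum):
    # Lower value = higher in the hierarchy = consider first.
    ELIMINATION = 1
    SUBSTITUTION = 2
    ENGINEERING = 3
    ADMINISTRATIVE = 4
    PPE = 5


def rank_controls(candidates):
    """Order (description, ControlType) pairs from the top of the hierarchy down."""
    return sorted(candidates, key=lambda c: c[1])


# Hypothetical candidate controls, loosely based on those discussed in the talk.
candidates = [
    ("Operator signals and checks", ControlType.ADMINISTRATIVE),
    ("Automated water-level detection", ControlType.ENGINEERING),
    ("Single emergency stop", ControlType.ENGINEERING),
]

ranked = rank_controls(candidates)
for description, control_type in ranked:
    print(f"{control_type.name:15} {description}")
```

The point of the sketch is simply that procedures for operators should be what is left over after the higher-order options have been considered, not the starting point.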

A predictable risk

So, the coroner then goes on to say, “rafts coming together on the ride was a well-known risk, highlighted by the incident in 2001 and again in 2004”. Now actually it says 2004, but I think that might be a typo. Elsewhere, it says 2014. Certainly, there were two significant incidents that were similar to the accident that actually killed four people. And it was acknowledged that various corrective measures could be undertaken to, quote, “adequately control the risk of raft collision”. However, a number of these suggestions were not implemented on the ride.

Now, given that they’ve demonstrated the ability to kill multiple people on the ride with a raft collision, it’s going to be a very, very difficult thing to justify not implementing controls. So, given the seriousness of the potential risk, to say that a control is feasible, is practicable, but then to say “We’re not going to do it. It’s not reasonable”, that’s going to be very, very difficult to argue. I would suggest it’s almost a certainty that not all reasonably practicable controls were implemented, which means the risk is not SFARP, which is a legal requirement.

Further on, we come back to document management, which was poor with no formal risk register in place. So, no evidence of a proper risk assessment. Members of the department did not conduct any holistic risk assessments of rides with the general view that another department was responsible. So, the fact that risk assessment wasn’t done, that’s a failing. The fact that senior management didn’t knock heads together and say “This has to be done. Make it happen”, that’s another failing. That’s a failing of due diligence, I suspect. So, we’ve got a couple more problems there.

High-risk plant

Then, later on, the coroner talks about necessary engineering oversight of high-risk plant not being done. Now, under WHS Act definitions, amusement rides are counted as high-risk plant, presumably because of the number of serious accidents that have happened with them over the years. The managers of the TRRR didn’t meet their obligations with respect to high-risk plant. So, some things that are optional for common-or-garden stuff are mandatory for high-risk plant, and those obligations were not met, it seems.

And then in just the next paragraph, we reinforce this due diligence issue. Only a scant amount of knowledge was held by those in management positions, including the general manager of engineering, as to the design modifications and past notable incidents on the ride. One of the requirements of due diligence is that senior management must have a knowledge of their operations, a knowledge of the hazards and risks associated with the operations. So for the engineering manager to be ignorant about modifications and risks associated with the ride, I think is a clear failure of due diligence.

Still talking about engineering, the coroner notes “it is significant that the general manager had no knowledge of past incidents involving rafts coming together on the ride”. Again, due diligence. If things have happened, those need to be investigated and learned from, and then you need to apply fresh controls if that’s required. So, this shows a lack of due diligence. It’s also a requirement in the risk management code of practice to look at things when new knowledge is gained. So, a couple more failures there.

No water-level detection, alarm or emergency stop

Now, it’s said that the operators of the ride were well aware that when one pump failed, and there were two, the ride was no longer able to operate, with the water level dropping dramatically, stranding the rafts on the steel support railings. And of course, that’s how the accident happened.

Regardless, there was no formal means by which to monitor the water level of the ride or audible alarm to advise that one of the pumps had ceased to operate. So, a water level monitor? Well, we’re talking potentially about a float, which is a pretty simple thing. There’s one in every cistern, in every toilet in Australia. Maybe the one for the ride would have to be a bit more sophisticated than that, a bit more industrial grade, but basically the same principle.

And no alarm to advise the operators that this pump had failed, even though it was known that this would have a serious effect on the operation of the ride. So, there’re multiple problems here. I suspect you’ll be able to find regulations that require these things. Certainly, if you looked at the code of practice on plant design, because this counts as industrial plant, it’s high-risk plant, you would expect very high standards of engineering controls on high-risk plant, and these were missing. More on that later.

In a similar vein, the coroner says “a basic automated detection system for the water level would have been inexpensive and may have prevented the incident from occurring”. So basically, the coroner is saying this control mechanism would have been cheap so it’s certainly reasonably practicable. If you’ve got a cheap control that will prevent a serious injury or a death, then how on earth are you going to argue that it’s not reasonable to implement it? The onus is on us to implement all reasonably practicable controls.

And then similarly, the lack of a single emergency stop on the ride, which was capable of initiating a complete shutdown of all the mechanisms, was also inadequate. And that’s another requirement from the code of practice on plant design, which refers back to WHS regulations. So, another breach there.
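
To show how little logic the missing controls described above would need, here is a minimal Python sketch. It is not the ride’s actual design and not taken from the coroner’s report: the `check_ride` function, the `RideStatus` type, the one-metre threshold and the sensor inputs are all hypothetical, and a real safety function would need proper engineering, not a script.

```python
from dataclasses import dataclass


@dataclass
class RideStatus:
    alarm: bool           # sound an audible alarm for the operators
    emergency_stop: bool  # shut down all ride mechanisms together
    reason: str


def check_ride(water_level_m: float, pumps_running: int,
               min_level_m: float = 1.0, pumps_required: int = 2) -> RideStatus:
    """Hypothetical monitoring logic; thresholds are illustrative only."""
    # A failed pump is treated as a stop condition, because the talk notes
    # that the ride could not operate once one of the two pumps had failed.
    if pumps_running < pumps_required:
        return RideStatus(alarm=True, emergency_stop=True, reason="pump failure")
    # The float-switch idea: stop if the water drops below a set level.
    if water_level_m < min_level_m:
        return RideStatus(alarm=True, emergency_stop=True, reason="low water level")
    return RideStatus(alarm=False, emergency_stop=False, reason="normal")


print(check_ride(water_level_m=1.2, pumps_running=1))
```

Note that a single `emergency_stop` output drives everything, which mirrors the coroner’s point about one stop that initiates a complete shutdown rather than relying on operators to work through separate controls.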

Human factors

We then move on to a section where it talks about operators, operators’ account of the incident, and other human factors. I’m probably going to ask my friend Peter Bender, who is a Human Factors specialist, to come and do a session on this and look at this in some more detail, because there are rich pickings in this section and I’m just going to skim the surface here because we haven’t got time to do more. And the coroner says “it’s clear that these 38 signals and checks to be undertaken by the ride operators were excessive, particularly given that the failure to carry out any one could potentially be a factor which would contribute to a serious incident”. So clearly, 38 signals and checks distributed between two ride operators, because there was no one operator in control of the whole ride, that’s a human factors nightmare for a start. Clearly, the work design for the ride was poor. There is good guidance available from Safe Work Australia on good work design so there’s really no excuse for this kind of lapse.
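
A rough back-of-envelope calculation shows why 38 manual checks is a problem. This is my illustration, not the coroner’s: the 1% chance of missing any single check is an assumed figure, and the checks are assumed to be independent, which real human performance rarely is.

```python
def prob_at_least_one_missed(n_checks: int, p_miss: float) -> float:
    """Chance that at least one of n independent checks is missed."""
    return 1.0 - (1.0 - p_miss) ** n_checks


# 38 signals and checks (from the report); 1% miss rate per check is an assumption.
p = prob_at_least_one_missed(38, 0.01)
print(f"{p:.2f}")  # about 0.32
```

So even with operators who get each individual check right 99 times out of 100, on these assumptions roughly one cycle in three would have something missed. That is the arithmetic behind preferring engineering controls to long lists of procedural checks.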

And then the coroner goes on to say, reinforcing the point that the ride couldn’t be safely controlled by a human operator, that the lack of engineering controls on a ride of this nature is unjustifiable. Again, that reinforces the point that risk was not SFARP because not all reasonably practicable controls had been implemented, particularly controls at the higher end of the hierarchy of controls. So, a serious failing there.

(Now, I’ve got something that I’m going to skip, actually, but – It’s a heck of a comment, but it’s not actually relevant to WHS.)

Training and competence

We’re moving on to training and competence. Those responsible for managing the ride were following the process and procedure in place, and I’m glad to see, from a human factors point of view, that the coroner is not just trying to blame the last person that touched it. He’s making a point of saying the operators did all the right stuff. Nevertheless, they were largely not qualified to perform the work with which they were charged.

The process and procedures that they were following seemed to have been created by unknown persons (unknown, presumably, because of the poor record-keeping) who, it is safe to assume, lacked the necessary expertise. And I think the coroner is making a reasonable assumption there, given the multiple failings that we’ve seen in risk management, in due diligence, in record-keeping, in the knowledge of key people, et cetera, et cetera.

It seems that the practice at the park was simply to accept what had always been done in terms of policy and procedure. And despite changes to safety standards and practices happening over time, because this is an old ride, only limited and largely reactionary consideration was ever given to making changes, including to the training provided to staff. So, reactionary: bad word. We’re supposed to predict risk and prevent harm happening. So, multiple failures on due diligence here and on providing adequate staff training, providing adequate procedures, et cetera.

The coroner goes on to say, “regardless of the training provided at the park, it would never have been sufficient to overcome the poor design of the ride. The lack of automation and engineering controls”. So, again, the hierarchy of controls was not applied and relatively cheap engineering controls were not used, placing an undue burden on the operator. Sadly, this is all too common in many applications. This is one of the reasons I am not naming the ride operators or trying to shame them, because I’ve seen this happen in so many different places. It wouldn’t be fair to single these people out.

‘Incident free’ operations?

Now we have a curious little statement in paragraph 1040. The coroner says “submissions are made that there was a 30-year history of incident-free operation of the ride”. So, what it looks like is that the ride operators, management, were trying to tell the coroner that they never had an incident on the ride in 30 years, which sounds pretty impressive, doesn’t it, at face value. But of course, the coroner already knew or discovered later on that there had been incidents on the ride. In fact, there had been two incidents that were very similar to the fatal accident.

Now, on the surface, this looks bad, doesn’t it? It looks like the ride management were trying to mislead the coroner. I don’t actually think that’s the case because I’ve seen so many organizations do poor incident reporting, poor incident recording, and poor learning from experience from incidents that it doesn’t surprise me that the senior management were not aware of incidents on their ride. Unfortunately, it’s partly human nature. Nobody likes to dwell on their failures or think about nasty things happening, and nobody likes to go to the boss saying we need to shut down a moneymaking ride. Don’t forget, this was a very popular ride. We need to shut down a moneymaking ride in order to spend more money making modifications to make it safer. And then management turns around and says, “Well, nobody’s been hurt. So, what’s the problem?”

And again, I’ve seen this attitude again and again, even among people operating much more sophisticated and much more dangerous equipment than this. So, whilst this really does look bad (the optics are not good, as they like to say), I don’t think there’s actually a conspiracy going on here. I think it’s just stupid mistakes because it’s so common. Moving on.

Standards

Now the coroner goes on to talk about standards not being followed, particularly when standards get updated over time. Bearing in mind this ride was 30 years old. The coroner states “it is essential that any difference in these standards are recognized and steps taken to ensure any shortfalls with a device manufactured internationally is managed”. Now, this is a little bit of an aside, because as I’ve mentioned before, the TRRR was actually designed and manufactured in Australia. Albeit not to any standards that we would recognize these days. But most rides were not and this highlights duties of importers. So, if you import something from abroad, you need to make sure that it complies with Australian requirements. That’s a requirement, that’s a duty under WHS law. We’ll come back to this in just a moment.

(We’ll skip that [comment] because we’ve done training and competency to death.)

The role of the regulator

So, following on about the international standards, the coroner also has a crack at the Queensland regulator, who I won’t name, and says “the regulator draws my attention to the difficulties arising when we’re requiring all amusement devices to comply with Australian standards. This difficulty is brought about by the fact that most amusement devices are designed and manufactured overseas, predominantly based on European standards”. Now, in the rest of the report, the coroner has a good old crack at the regulator. (If you’re Irish, a crack means a bit of fun. I’m not talking about a bit of fun.)

The coroner sticks the boot into the regulator for being pretty useless. And sadly, that’s no surprise in Australia. So basically, the regulator said, “Oh, it’s all too difficult!” And you think, “Well, it’s your job, actually, so why haven’t you done it properly?”

But being a little bit more practical, if you work in an industry where a lot of stuff is imported, and let’s face it, that’s pretty common in Australia, you’ve got two choices. You can either try and change Australian standards so that they align better to the standards of the places where you’re getting the kit from in your industry, or maybe the regulator could say, “Okay, this is a common problem across the industry. We will provide some guidance that tells you how to make that transition from the international standards to Australian standards and what we as the regulator consider acceptable and not acceptable”. And then that really helps the industry to do the right thing and to be consistent in terms of operation and enforcement.

So, the regulator is letting the people who they regulate know this is the standard that is required of you, this is what you have to do. And that’s really the job of a good regulator. So, the fact that the regulator in this particular case just hadn’t bothered to do so over a period of some decades, it would seem, doesn’t really say a lot for the professionalism of the regulator. And I’m not surprised that the coroner decided to have a go at them.

Summary

So, we’ve been through just over 20 comments, I think. I mean, I actually had 24 or 25 in total, but I skipped a few because they were a bit repetitive. It’s interesting to note that there were two major comments on failure to carry out designer duties and that kind of thing, seven on risk management, four on SFARP (although of course, all the risk management ones also affect SFARP), and five on due diligence. So, there’re almost 20 significant breaches there and I wasn’t even really trying to pick up everything the coroner said. And bear in mind, I was only reading from the summary. I didn’t bother reading the whole report because it’s pages and pages and pages.

And the lesson that we can draw from all of this, friends, is not to bash the people who make mistakes, but to learn lessons for ourselves. How could we do better? And I think the lesson is everything that we need to do has been clearly set out in the WHS Act and in the WHS regulations. Then there’re codes of practice that give us guidance in particular areas and on our general responsibilities, and these codes of practice also guide us on what should be considered SFARP for certain hazards and risks. Then there’s also some fantastic guidance, documentation and information available from Safe Work Australia on, for example, human factors and good work design and so on and so forth.

So, there’s lots of really good, really readable information out there and it’s all free. It’s all available on that wonderful thing we call the Internet. So, there really is no excuse for making basic mistakes like this and killing people. It’s not that difficult. And a lot of the safety requirements are not that onerous. You don’t have to be a rocket scientist to read them and understand them. A lot of the requirements are basic, structured, common sense. So, the lesson from this awful accident is it doesn’t have to be this way. We can do much better than that quite easily and if we don’t and something goes wrong, then the law will be after us.

Looking ahead

It will be interesting to see – I believe that WorkSafe Queensland are now investigating to see whether they’re going to bring any prosecutions. It should be said that the police investigated and didn’t bring any prosecutions against individuals. I don’t know if Queensland has a corporate manslaughter act. I wouldn’t think so based on the fact that they’ve not prosecuted anybody, but you don’t need to find an individual guilty of gross negligence manslaughter for WHS to take effect. So, I suspect that in due course, we will see the operators of the theme park probably cop a significant fine and maybe some of their directors and senior managers will be going to jail. That’s how serious and how numerous these breaches are. You really don’t need to dig very deep to see what’s gone wrong and to see that the legal obligations have not been met.

Since this video was recorded the TRRR owners have been charged with three offences under WHS law. They pleaded guilty and were fined $4.5M.

End of Lessons Learned

Back to the ‘Work Health & Safety’ and ‘Start Here’ Topics Pages.


Safety Concepts Part 2

In this 33-minute session, Safety Concepts Part 2, The Safety Artisan equips you with more Safety Concepts. We look at the basic concepts of safety, risk, and hazard in order to understand how to assess and manage them. Exploring these fundamental topics provides the foundations for all other safety topics, but it doesn’t have to be complex. The basics are simple, but they need to be thoroughly understood and practiced consistently to achieve success. This video explains the issues and discusses how to achieve that success.

This is the three-minute demo of the full (33 minute) Safety Concepts, Part 2 video.

Safety Concepts Part 2: Topics

  • Risk & Harm;
  • Accident & Accident Sequence;
  • (Cause), Hazard, Consequence & Mitigation;
  • Requirements / Essence of System Safety;
  • Hazard Identification & Analysis;
  • Risk Reduction / Estimation;
  • Risk Evaluation & Acceptance;
  • Risk Management & Safety Management; and
  • Safety Case & Report.

Safety Concepts Part 2: Transcript


Hi everyone, and welcome to The Safety Artisan, where you will find professional, pragmatic, and impartial advice on safety. I’m Simon, and welcome to the show today, which is recorded on the 23rd of September 2019. Today we’re going to talk about system safety concepts. A couple of days ago I recorded a short presentation (Part 1) on this, which is also on YouTube.  Today we are going to talk about the same concepts but in much more depth.

In the short session, we took some time picking apart the definition of ‘safe’. I’m not going to duplicate that here, so please feel free to go have a look. We said that to demonstrate that something was safe, we had to show that risk had been reduced to a level that is acceptable in whatever jurisdiction we’re working in.

And in this definition, there are a couple of tests that are appropriate to the U.K., but perhaps not elsewhere. We also must meet safety requirements. And we must define the scope and bound the system that we’re talking about, whether it is a physical system or an intangible system like a computer program. We must define what we’re doing with it, what it’s being used for, and the operating environment, the context, within which it is being used.  And if we can do all those things, then we can objectively say – or claim – that the system is safe.

Topics

We’re going to talk about a lot more topics. We’re going to talk about risk, accidents, and the cause, hazard and consequence sequence. We’ll talk about requirements and, spoiler alert, what I consider to be the essence of system safety. And then we’ll get into talking about the process of demonstrating safety: hazard identification and analysis.

Risk reduction and estimation. Risk evaluation and acceptance. And then pulling it all together: risk management and safety management. And finally, reporting: making an argument that the system is safe, supporting it with evidence, and summarizing all of that in a written report. This is what we do, albeit in different ways and calling it different things.

Risk

Onto the first topic: risk and harm. Our concept of risk is a combination of the likelihood and severity of harm. Generally, we’re talking about harm to people: death, injury, damage to health. Now we might also choose to consider any damage to property and the environment. That’s all good. But I’m going to concentrate on harm to people, because usually that’s what we’re required to do by the law. And there are other laws covering the environment, and sometimes property, that we’re not going to talk about. Just to illustrate this point: risk is a combination of severity and likelihood.

We’ve got a very crude risk table here, with likelihood along the top and severity down the side. And we might see, by looking at the table, that if we have a high likelihood and high severity, well, that’s a high risk. Whereas if we have low likelihood and low severity, we might say that’s a low risk. And then in between, for a combination of high and low, we might say that’s medium. Now, this is a very crude and simple example, deliberately.
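The crude risk table described here can be sketched in a few lines of Python. This is only an illustration: the two-level scale and the function name `risk_level` are assumptions made for the sketch, and real standards define more levels and their own labels.

```python
# Hypothetical two-level scale, matching the crude table in the talk.
LEVELS = ("low", "high")


def risk_level(likelihood: str, severity: str) -> str:
    """Combine likelihood and severity into a crude risk rating."""
    if likelihood not in LEVELS or severity not in LEVELS:
        raise ValueError("likelihood and severity must be 'low' or 'high'")
    if likelihood == "high" and severity == "high":
        return "high"
    if likelihood == "low" and severity == "low":
        return "low"
    # Any mix of high and low lands in the middle.
    return "medium"


if __name__ == "__main__":
    for likelihood in LEVELS:
        for severity in LEVELS:
            print(likelihood, severity, "->", risk_level(likelihood, severity))
```

A lookup table keyed on the two inputs would do the same job and scales better once there are more than two levels.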

You will see risk matrices like this in loads of different standards. And you may be required to define your own for a specific system. There are lots of variations on this, but they’re all basically doing the same thing, and we’re illustrating how we determine the level of risk by that combination of severity and likelihood. I think a picture is worth a thousand words. Moving on to the accident: we’re talking about (in this standard) an unintended event that causes harm.

Accidents, Sequences and Consequences

Not all jurisdictions consider just accidental events; some consider deliberate harm as well. We’ll leave that out. A good example of that is work health and safety in Australia, but no doubt we’ll get to that in another video sometime. And the accident sequence is the progression of events that results in an accident. Now we’re going to illustrate the accident sequence in a moment, but before we get there we need to think about causes. Here we’ve got a hazard: a physical situation or state of a system, often following some initiating event, that may lead to an accident. A thing that may cause harm.

And then allied with that we have the idea of consequences: an outcome or outcomes resulting from an event. Now that all sounds a bit woolly, doesn’t it? Let’s illustrate that. Hopefully, this will make it a lot clearer. Now, I’ve got a sequence here. We have causes that might lead to a hazard. And the hazard might lead to different consequences. And that’s the accident sequence. Now in this standard, they didn’t explicitly define causes.

Cause, Hazard and Consequence

They’re just called events. But mostly we will deal with causes and consequences in system safety, and it’s probably just easier to implement it that way. Whether or not you choose to explicitly address every cause is often an optional step. But this is the accident sequence that we’re looking at. And these funnels are meant to illustrate the fact that there may be many causes for one hazard, and one hazard may lead to many consequences. Some of those consequences may be no harm at all.

We may not actually have an accident. We may get away with it. We may have a hazard and no harm may befall a human. And if we take all of this together, that’s the accident sequence. Now it’s worth reiterating that just because a hazard exists, it need not necessarily lead to harm. But to get to harm, we must have a hazard; a hazard is both necessary and sufficient to lead to harmful consequences. OK.

Hazards: an Example

And you can think of a hazard as an accident waiting to happen. You can think of it in lots of different ways. Let’s think about an example: the hazard might be that somebody slips while walking. That slip might be caused by many things. It might be a wet surface: let’s say it’s been raining, and the pavement is slippery, or it might be icy. It might be a spillage of oil on a surface, or you could imagine something slippery like ball bearings on a surface.

So, there’s something that’s caused the surface to become slippery. A person slips – that’s the hazard. Now the person may catch themselves; they may not fall over. They may suffer no injury at all. Or they might fall and suffer a slight injury; and, very occasionally, they might suffer a severe injury. It depends on many different factors. You can imagine if you slipped while going downstairs, you’re much more likely to be injured.

And younger, healthy, fit people are more likely to get over a fall without being injured, whereas if they’re very elderly and frail, a fall can quite often result in a broken bone. If an elderly person breaks a bone in a fall the chances of them dying within the next 12 months are quite high. They’re about one in three.

So, the level of risk is sensitive to a lot of different factors. To get an accurate picture, an accurate estimate of risk, we’re going to need to factor in all those things. But before we get to that, we’ve already said that a hazard need not lead to harm. In this standard, where a hazard has occurred and could have progressed to an accident but didn’t, we call this an incident. A near miss.

We got away with it. We were lucky. Whatever you want to call it. We’ve had an incident but no one has been hurt. Hopefully, that incident is being reported, which will help us to prevent an actual accident in future.  That’s another very useful concept that reminds us that not all hazards result in harm. Sometimes there will be no accident. There will be no harm simply because we were lucky, or because someone present took some action to prevent harm to themselves or others.
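The slip example can be written down as a small data structure: many causes funnel into one hazard, which fans out into several consequences, some harmful (accidents) and some not (incidents, or near misses). This is a minimal sketch; the class and field names are hypothetical and are not taken from the standard being discussed.

```python
from dataclasses import dataclass, field


@dataclass
class Consequence:
    description: str
    harm: bool  # True if a person is hurt


@dataclass
class AccidentSequence:
    hazard: str
    causes: list[str] = field(default_factory=list)
    consequences: list[Consequence] = field(default_factory=list)

    def accidents(self) -> list[str]:
        """Outcomes where the hazard progressed to harm."""
        return [c.description for c in self.consequences if c.harm]

    def incidents(self) -> list[str]:
        """Near misses: the hazard occurred but nobody was hurt."""
        return [c.description for c in self.consequences if not c.harm]


# The slip example from the talk.
slip = AccidentSequence(
    hazard="person slips",
    causes=["wet surface", "ice", "oil spill", "ball bearings"],
    consequences=[
        Consequence("catches themselves, no injury", harm=False),
        Consequence("falls, slight injury", harm=True),
        Consequence("falls, severe injury", harm=True),
    ],
)

if __name__ == "__main__":
    print("Accidents:", slip.accidents())
    print("Incidents:", slip.incidents())
```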

Mitigation Strategies (Controls)

But we would really like to deliberately design out or avoid hazards if we can. What we need is a mitigation strategy: we need a measure or measures that, when we put them into practice, reduce that risk. Normally, we call these things controls. Again, we’ve illustrated this; we’ve added to the funnels. We’ve added some mitigation strategies, and they are the dark blue dashed lines.

And they are meant to represent barriers that prevent the accident sequence progressing towards harm. And they have dashed lines because very few controls are perfect; you know, everything’s got holes in it. And we might have several of them. But usually, no control will cover all possible causes; and very few controls will deal with all possible consequences.  That’s what those barriers are meant to illustrate.
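One way to picture the gaps between barriers is to record which causes each control addresses and then list the causes that no control covers. The sketch below makes some loud assumptions: the `Control` class, the `uncovered` helper and the coverage sets are all hypothetical. It also treats coverage as all-or-nothing, which is a simplification, since very few real controls are perfect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    name: str
    covers: frozenset[str]  # causes (or consequences) this control addresses


def uncovered(items: list[str], controls: list[Control]) -> list[str]:
    """Return the items that no control addresses, in their original order."""
    covered: set[str] = set()
    for control in controls:
        covered |= control.covers
    return [item for item in items if item not in covered]


causes = ["wet surface", "ice", "oil spill", "ball bearings"]

# Illustrative controls only; the coverage sets are assumptions.
controls = [
    Control("mop up spills", frozenset({"wet surface", "oil spill"})),
    Control("grit paths in winter", frozenset({"ice"})),
]

if __name__ == "__main__":
    print("Causes with no control:", uncovered(causes, controls))
```

The same helper works on the consequence side of the funnel: pass in a list of consequences and the controls that act on them.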

That idea, that picture, will be very useful to us later, when we are thinking about how we’re going to estimate and evaluate risk overall and what risk reduction we have achieved, and how we justify that what we’ve done is good. That’s a very powerful illustration. Well, let’s move on to safety requirements.

Safety Requirements

Now. I guess it’s no great surprise to say that requirements, once met, can contribute directly to the safety of the system. Maybe we’ve got a safety requirement that says all cars will be fitted with seatbelts. Let’s say we’ll be required to wear a seatbelt.  That makes the system safer.

Or the requirement might be saying we need to provide evidence of the safety of the system. And, the requirement might refer to a process that we’ve got to go through or a set kind of evidence that we’ve got to provide. Safety requirements can cover either or both of these.

The Essence of System Safety

Requirements covering the safety of the system, or demonstrating that the system is safe, should give us assurance, which is adequate confidence or justified confidence, supported with evidence by following a process. And we’ll talk more about process. We meet safety requirements. We get assurance that we’ve done the right thing. And this really brings us to the essence of what system safety is: we’ve got all these requirements – everything is a requirement really – including the requirement to demonstrate risk reduction.

And those requirements may apply to the system itself, the product. Or they may apply to the process that generates the evidence, or to the evidence itself. Putting all those things together in an organized and orderly way really is the essence of system safety. This is where we are addressing safety in a systematic way, in an orderly way, in an organized way. (Those words will keep coming back.) That’s the essence of system safety, as opposed to the day-to-day task of keeping a workplace safe.

Maybe by mopping up spills and providing handrails, so people don’t slip over. Things like that. We’re talking about a more sophisticated level of safety, because we have a more complex problem, a more challenging problem, to deal with. That’s system safety. We will start on the process now, and we begin with hazard identification and analysis. First, we need to identify and list the hazards and the accidents associated with the system.

We’ve got a system, physical or not. What could go wrong? We need to think about all the possibilities. And then, having identified some hazards, we need to start doing some analysis. We follow a process that helps us to delve into the detail of those hazards and accidents, and to define and understand the accident sequences that could result. In fact, in doing the analysis we will very often identify some more hazards that we hadn’t thought of before. It’s not a straight-through process; it tends to be an iterative process.

Risk Reduction

And ultimately what we’re trying to do is reduce risk. We want a systematic process, which is what we’re describing now: a systematic process of reducing risk. And at some point, we must estimate the risk that we’re left with, before and after all these controls, these mitigations, are applied. That’s risk estimation.  Again, there’s that systematic word: we’re going to use all the available information to estimate the level of risk that we’ve got left. Recall that risk is a combination of severity and likelihood.

Now as we get towards the end of the process, we need to evaluate risk against set criteria. And those criteria vary depending on which country you’re operating in or which industry we’re in: what regulations apply and what good practice is relevant. All those things can be a factor. Now, in this case, this is a U.K. standard, so we’ve got two tests for evaluating risk. It’s a systematic determination using all the available evidence. And it should be an objective evaluation as far as we can make it.

Risk Evaluation

We should use certain criteria on whether a risk can be accepted or not. And in the U.K. there are two tests for this. As we’ve said before, there is ALARP, the ‘As Low As Reasonably Practicable’ test, which says: have we put into practice all reasonably practicable controls? (This is the risk reduction target.) And then there’s an absolute level of risk to consider as well. Because even if we’ve taken all practicable measures, the risk remaining might still be so high as to be unacceptable to the law.

Now that test is specific to the U.K., so we don’t have to worry too much about it. The point is there are objective criteria, which we must test ourselves or measure ourselves against. It is an evaluation that will pop out the decision as to whether further risk reduction is necessary if the risk level is still too high. We might conclude that there are still reasonably practicable measures that we could take. Then we’ve got to take them.

We have an objective decision-making process to say: have we done enough to reduce risk? And if not, we need to do some more until we get to the point where we can apply the test again and say yes, we’ve done enough. Right, that’s rather a long-winded way of explaining that. I apologize, but it is a key issue and it does trip up a lot of people.
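The two U.K. tests described above can be sketched as a single decision function. This is a sketch under stated assumptions, not a statement of the law or of any standard: the function name `evaluate_risk`, the three result strings and the choice of which risk levels count as intolerable are all hypothetical.

```python
def evaluate_risk(
    residual_risk: str,
    outstanding_practicable_controls: list[str],
    intolerable_levels: tuple[str, ...] = ("high",),  # assumption for illustration
) -> str:
    """Apply the two tests from the talk to one risk.

    Test 1 (ALARP): have all reasonably practicable controls been applied?
    Test 2 (absolute level): is the remaining risk still too high to accept?
    """
    if outstanding_practicable_controls:
        # There is still something reasonably practicable we could do.
        return "reduce further"
    if residual_risk in intolerable_levels:
        # Everything practicable has been done, but the risk is still too high.
        return "unacceptable"
    return "acceptable"


if __name__ == "__main__":
    print(evaluate_risk("medium", ["fit handrail"]))
    print(evaluate_risk("high", []))
    print(evaluate_risk("medium", []))
```

In practice the function would be called again after each round of risk reduction, until it returns "acceptable" or the activity is judged not to be justifiable.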

Risk Acceptance

Now, once we’ve concluded that we’ve done enough to reduce risk and no further risk reduction is necessary, somebody should be in a position to accept that risk.  Again, it’s a systematic process, by which relevant stakeholders agree that risks may be accepted. In other words, somebody with the right authority has said yes, we’re going to go ahead with the system and put it into practice, implement it. The resulting risks to people are acceptable, providing we apply the controls.

And we accept that responsibility.  Those people who are signing off on those risks are exposing themselves and/or other people to risk. Usually, they are employees, but sometimes members of the public as well, or customers. If you’re going to put customers in an airliner you’re saying yes there is a level of risk to passengers, but that the regulator, or whoever, has deemed [the risk] to be acceptable. It’s a formal process to get those risks accepted and say yes, we can proceed. But again, that varies greatly between different countries, between different industries. Depending on what regulations and laws and practices apply. (We’ll talk about different applications in another section.)

Risk Management

Now putting all this together we call this risk management.  Again, that wonderful systematic word: a systematic application of policies, procedures and practices to these tasks. We have hazard identification, analysis, risk estimation, risk evaluation, risk reduction & risk acceptance. It’s helpful to demonstrate that we’ve got a process here, where we go through these things in order. Now, this is a simplified picture because it kind of implies that you just go through the process once.

With a complex system, you go through the process at least once. We may identify further hazards when we get into hazard analysis and estimating risk. In the process of trying to do those things, even as late as applying controls and getting to risk acceptance, we may discover that we need to do additional work. We may try and apply controls and discover the controls that we thought were going to be effective are not effective.

Our evaluation of the level of risk and its acceptability is then wrong, because it was based on the premise that controls would be effective, and we’ve discovered that they’re not, so we must go back and redo some work. Maybe as we go through, we even discover hazards that we hadn’t anticipated before. This can and does happen; it’s not necessarily a straight-through process. We can iterate through this process, perhaps several times, while we are moving forward.
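The iterative nature of the process can be illustrated with a simple worklist: analysing one hazard may reveal others, which go back on the list until nothing new turns up. The function `analyse_all` and the example data are hypothetical, and the sketch covers only the "analysis finds more hazards" loop, not the re-evaluation of controls.

```python
from collections import deque
from typing import Callable, Iterable


def analyse_all(
    initial_hazards: Iterable[str],
    analyse: Callable[[str], Iterable[str]],
) -> list[str]:
    """Analyse hazards until no new ones are discovered.

    `analyse(hazard)` returns any further hazards found while analysing it.
    The result lists each hazard once, in the order it was analysed.
    """
    seen: set[str] = set()
    queue: deque[str] = deque()
    for hazard in initial_hazards:
        if hazard not in seen:
            seen.add(hazard)
            queue.append(hazard)

    analysed: list[str] = []
    while queue:
        hazard = queue.popleft()
        analysed.append(hazard)
        for found in analyse(hazard):
            if found not in seen:  # a hazard we had not thought of before
                seen.add(found)
                queue.append(found)
    return analysed


# Illustrative data: analysing the first hazard reveals a second.
discovered = {
    "slip on wet floor": ["fall on stairs"],
    "fall on stairs": [],
}

if __name__ == "__main__":
    print(analyse_all(["slip on wet floor"], lambda h: discovered.get(h, [])))
```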

Safety Management

OK, Safety Management. We’ve gone to a higher level really than risk because we’re thinking about requirements as well as risk. We’re going to apply organization, we’re going to apply management principles to achieve safety with high confidence. For the first time we’ve introduced this idea of confidence in what we’re doing. Well, I say the first time; this is assurance, isn’t it? Assurance: having justified confidence or appropriate confidence, because we’ve got the evidence. And that might be product evidence too: we might have tested the product to show that it’s safe.

We might have analysed it. We might have shown that we followed the process, and that gives us confidence that our evidence is good, that we’ve done all the right things and identified all the risks.  That’s safety management. We need to put that in a safety management system: we’ve got a defined organization structure, and we have defined processes, procedures and methods. That gives us direction and control of all the activities that we need to put together in combination to effectively meet safety requirements and safety policy.

And our safety tests, whatever they might be. More and more now we’re thinking about top-level organization and planning to achieve the outcomes we need, with a complex system, with a complex operating environment and a complex application.

Safety Planning

Now I’ll just mention planning. Okay, we need a safety management plan that defines the strategy: how we’re going to get there, how we are going to address safety. We need to document that safety management system for a specific project. Planning is very important for effective safety. Safety is very vulnerable to poor planning. If a project is badly planned or not planned at all, it becomes very difficult to do safety effectively, because we are dependent on the process, on following a rigorous process to give us confidence that all results are correct.  If you’ve got a project that is a bit haphazard, that’s not going to help you achieve the objectives.

Planning is important. Now the bit of that safety plan that deals with timescales, milestones and other date-related information, we might refer to as a safety programme. Now, this being a UK definition, British English has two spellings of program. The double-m-e version, ‘programme’, applies to that time-based progression, or milestone-based progression.

Whereas in the US and in Australia, for example, we don’t have those two words; we just have the one word, ‘program’, which covers everything: computer programs, and a program of work that might or might not be determined by timescales or milestones. But the point is that certain things may have to happen at certain points in time or before certain milestones. We may need to demonstrate safety before we are allowed to proceed to tests and trials or before we are allowed to put our system into service.

Demonstrating Safety

We’ve got to demonstrate that safety has been achieved before we expose people to risk.  That’s very simple. Now, finally, we’re almost at the end. Now we need to provide a demonstration – maybe to a regulator, maybe to customers – that we have achieved safety.  This standard uses the concept of a safety case. For the safety case, basically, imagine a portfolio full of evidence.  We’ve got a structured argument to put it all together. We’ve got a body of evidence that supports the argument.

It provides a compelling, comprehensible (or understandable) and valid case that a system is safe, for a given application or use, in a given operating environment.  Really, that definition of what a safety case is harks back to that meaning of safety.  We’ve got something that really hits the nail on the head. And we might put all of that together and summarise it in a safety case report, which summarises those arguments and evidence, and documents progress against the safety programme.

Remember I said our planning was important. We started off saying that we need to do this, that and the other in order to achieve safety. Hopefully, in the end, in the safety report we’ll be able to state that we’ve done exactly that. We did do all those things. We did follow the process rigorously. We’ve got good results. We’ve got a robust safety argument, with evidence to support it. At the end, it’s all written up in a report.
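The idea of a structured argument supported by a body of evidence can be sketched as a tree of claims, with a check for claims that have nothing behind them. The `Claim` class, the `unsupported` helper and the example claims are hypothetical; this is not the notation of the standard being discussed, and real safety cases are far richer.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    statement: str
    evidence: list[str] = field(default_factory=list)
    subclaims: list["Claim"] = field(default_factory=list)


def unsupported(claim: Claim) -> list[str]:
    """Return the lowest-level claims that have no evidence behind them."""
    if claim.subclaims:
        gaps: list[str] = []
        for sub in claim.subclaims:
            gaps.extend(unsupported(sub))
        return gaps
    return [] if claim.evidence else [claim.statement]


# Illustrative argument only.
safety_case = Claim(
    "The system is safe for its defined use and operating environment",
    subclaims=[
        Claim("Hazards have been identified", evidence=["hazard log"]),
        Claim("Risks have been reduced to an acceptable level"),
        Claim("Safety requirements have been met", evidence=["test report"]),
    ],
)

if __name__ == "__main__":
    print("Claims still needing evidence:", unsupported(safety_case))
```

A check like this only shows where evidence is missing; judging whether the evidence is good enough remains a matter for people with the right authority.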

Documenting Safety

Now that isn’t always going to be called a safety case report; it might be called a safety assessment report or a design justification report. There are lots of names for these things. But they all tend to do the same kind of thing: they pull together the argument as to why the system is safe and the evidence to support the argument, and they document progress against a plan or some set of process requirements from a standard or a regulator or just good practice in an industry to say: Yes, we’ve done what we were expected to do.

The result is usually what justifies [the system] getting past that milestone, where the system is going into service and can be used. People can be exposed to those risks, but safely and under control.

Everyone’s a winner, as they say!

Copyright – Creative Commons Licence

Okay. I’ve used a lot of information from a UK government website. I’ve done that in accordance with the terms of its creative commons license, and you can see more about that here. We have complied with that, as we are required to, and we are required to tell you that the information we’ve supplied is under the terms of this license.

Safety Concepts Part 2: More Resources

And for more resources and for more lessons on system safety and other safety topics, I invite you to visit the safetyartisan.com website.  Thanks very much for watching. I hope you found that useful.

We’ve covered a lot of information there, but hopefully in a structured way. We’ve repeated the key concepts and you can see that in that standard the key concepts are consistently defined, and they reinforce each other. In order to get that systematic, disciplined approach to safety, that’s what we need.

Anyway, that’s enough from me. I hope you enjoyed watching and found that useful. I look forward to talking to you again soon. Please send me some feedback about what you thought about this video and also what you would like to see covered in the future.

Thank you for visiting The Safety Artisan. I look forward to talking to you again soon. Goodbye.

Safety Concepts Part 1 defines the meaning of ‘Safe’, and it is free. Return to the Start Here Page.


System Safety Principles

In this 45-minute video, I discuss System Safety Principles, as set out by the US Federal Aviation Administration in their System Safety Handbook. Although this was published in 2000, the principles still hold good (mostly) and are worth discussing. I comment on those topics where modern practice has moved on, and those jurisdictions where the US approach does not sit well.

This is the ten-minute preview of the full, 45-minute video.

System Safety Principles: Topics

  • Foundational statement
  • Planning
  • Management Authority
  • Safety Precedence
  • Safety Requirements
  • System Analyses Assumptions & Criteria
  • Emphasis & Results
  • MA Responsibilities
  • Software hazard analysis
  • An Effective System Safety Program

System Safety Principles: Transcript


Hello and welcome to The Safety Artisan where you will find professional pragmatic and impartial educational products. I’m Simon and it’s the 3rd of November 2019. Tonight I’m going to be looking at a short introduction to System Safety Principles.

Introduction

On to system safety principles; in the full video we look at all the principles from the U.S. Federal Aviation Administration’s System Safety Handbook but in this little four- or five-minute video – whatever it turns out to be – we’ll take a quick look just to let you know what it’s about.

Topics for this Session

These are the subjects in the full session. First, a foundational statement; we talk about planning; we talk about the management authority (which is the body that is responsible for bringing into existence, in this case, some kind of aircraft or air traffic control system, something like that, something that the FAA would be the regulator for in the US). We talk about safety precedence. In other words, what’s the most effective safety control to use. Safety requirements; system analyses – which are highlighted because that’s just the sample I’m going to talk about tonight; assumptions and safety criteria; emphasis and results – which is really about how much work you put in where and why; management authority responsibilities; a little aside on a specialist area – software hazard analysis; and finally, what you need for an effective System Safety Program.

Now, it’s worth mentioning that this is not an uncritical look at the FAA handbook. It is 19 years old now so the principles are still good, but some of it’s a bit long in the tooth. And there are some areas where, particularly on software, things have moved on. And there are some areas where the FAA approach to system safety is very much predicated on an American approach to how these things are done.  

Systems Analysis

So, without further ado, let’s talk about system analysis. There are two points that the Handbook makes. First of all, that these analyses are basic tools for systematically developing design specifications. Let’s unpack that statement. So, the analyses are tools- they’re just tools. You’ve still got to manage safety. You’ve still got to estimate risk and make decisions- that’s absolutely key. The system analyses are tools to help you do that. They won’t make decisions for you. They won’t exercise authority for you or manage things for you. They’re just tools.

Secondly, the whole point is to apply them systematically. So, coverage is important here- making sure that we’ve covered the entire system. And also doing things in a thorough and orderly fashion. That’s the systematic bit about it. And then finally, it’s about developing design specifications. Now, this is where the American emphasis comes in. But before we talk about that, it’s fundamental to note that really we need to work out what our safety requirements are. What are we trying to achieve here with safety? And why? And those are really important concepts because if you don’t know what you’re trying to achieve then it will be very difficult to get there and to demonstrate that you’ve got there- which is kind of the point of safety. And putting effort into getting the requirements right is very important because without doing that first step all your other work could be invalid. And in my experience of 20 plus years in the business, if you don’t have a really precise handle on what you’re trying to achieve then you’re going to waste a lot of time and money, probably.

So, onto the second bullet point. Now the handbook says that the ultimate measure of safety is not the scope of analysis but in satisfying requirements. So, the first part – very good. We’re not doing analysis for the sake of it. That’s not the measure of safety – that we’ve analyzed something to death or that we’ve expended vast amounts of dollars on doing this work but that we’ve worked out the requirements and the analysis has helped us to meet them. That is the key point.

This is where it can go slightly pear-shaped in that this emphasis on requirements (almost to the exclusion of anything else) is a very U.S.-centric way of doing things. So, very much in the US, the emphasis is you meet the spec, you certify that you’ve met spec and therefore we’re safe. But of course what if the spec is wrong? Or what if it’s just plain inappropriate for a new use of an existing system or whatever it might be?

In other jurisdictions, notably the U.K. (and as you can tell from my accent that’s where I’m from; I’ve got a lot of experience doing safety work in the U.K. but also Australia where I now live and work) it’s not about meeting requirements. Well, it is, but let me explain. In the UK and Australia, English law works on the idea of intent. So, we aim to make something safe: the question is not whether it has necessarily met requirements or not, that doesn’t really matter so much, but whether the risk is actually reduced to an acceptable level. There are tests for deciding what is acceptable. Have you complied with the law? The law outside the US can take a very different approach to “it’s all about the specification”.

Of course, those legal requirements, and that requirement to reduce risk to an acceptable level, are in themselves requirements. But in an Australian or British legal jurisdiction, you need to think about those legal requirements as well. They must be part of your requirements set. So, just having a specification for a technical piece of kit that ignores the requirements of the law is not enough. Those requirements include not only design requirements but also that the thing is actually safe in service and can be safely introduced, used, disposed of, etc. If you don’t take those things into account you may not meet all your obligations under that system of law. So, there’s an important point about understanding and using American standards and an American approach to system safety outside their assumed context. And that’s true of all standards and all approaches, but it’s a point I bring out in the main video quite forcefully because it’s very important to understand.

Copyright Statement

So, that’s the one subject I’m going to talk about in this short video. I’d just like to mention that all quotations are from the FAA system safety handbook which is copyright free but the content of this video presentation, including the added value from my 20 plus years of experience, is copyright of the Safety Artisan.

For More…

And wherever you’re seeing this video, be it on social media or whatever, you can see the full version of the video and all other videos at The Safety Artisan.

End

That’s the end of the show. It just remains for me to say thanks very much for giving me your time and I look forward to talking to you again soon. Bye-bye.


Good Work Design

The content of this post is taken from the ‘Principles of Good Work Design’ handbook from Safe Work Australia. The handbook is © Commonwealth of Australia, 2019; this document is covered by a Creative Commons licence (CC BY 4.0) – for full details see here.

Some changes have been made to the guidance in order to improve Search Engine Optimisation and correct minor problems with Figure numbering in the original document. All changes are indicated [thus].

Introduction

The Australian Work Health and Safety Strategy 2012-2022 is underpinned by the principle that well-designed healthy and safe work will allow workers to have more productive lives. This can be more efficiently achieved if hazards and risks are eliminated through good design.

The ten principles of good work design

This handbook contains ten principles which demonstrate how to achieve good design of work and work processes. Each is general in nature so they can be successfully applied to any workplace, business or industry.

The ten principles for good work design are structured into three sections:

  1. Why good work design is important
  2. What should be considered in good work design, and
  3. How good work is designed

These principles are shown in the diagram at Figure 1.

This handbook complements a range of existing resources available to businesses and work health and safety professionals, including guidance for the safe design of plant and structures; see the Safe Work Australia website.

Scope of the handbook

This handbook provides information on how to apply the good work design principles to work and work processes to protect workers and others who may be affected by the work. 

It describes how design can be used to set up the workplace, working environment and work tasks to protect the health and safety of workers, taking into account their range of abilities and vulnerabilities, so far as reasonably practicable.

The handbook does not aim to provide advice on managing situations where individual workers may have special requirements such as those with a disability or on a return to work program following an injury or illness. Contact your regulator for further information.

Who should use this handbook?

This handbook should be used by those with a role in designing work and work processes, including:

  • Persons conducting a business or undertaking (PCBUs) with a primary duty of care under the model Work Health and Safety (WHS) laws.
  • PCBUs who have specific design duties relating to the design of plant, substances and structures including the buildings in which people work.
  • People responsible for designing organisational structures, staffing rosters and systems of work.
  • Professionals who provide expert advice to organisations on work health and safety matters.

Good work design optimises work health and safety, human performance, job satisfaction, and business success.

Information: Experts who provide advice on the design of work may include: engineers, architects, ergonomists, information and computer technology professionals, occupational hygienists, organisational psychologists, human resource professionals, occupational therapists and physiotherapists.

Figure 1 – Good work design principles

An image of good work design principles

What is ‘good work’?

‘Good work’ is healthy and safe work where the hazards and risks are eliminated or minimised so far as is reasonably practicable. Good work is also where the work design optimises human performance, job satisfaction and productivity.

Good work contains positive work elements that can:

  • protect workers from harm to their health, safety and welfare
  • improve worker health and wellbeing, and
  • improve business success through higher worker productivity.

What is good work design?

The most effective design process begins at the earliest opportunity during the conceptual and planning phases. At this early stage there is the greatest chance of finding ways to design-out hazards, incorporate effective risk control measures and design-in efficiencies.

Effective design of good work considers:

The work:

  • how work is performed, including the physical, mental and emotional demands of the tasks and activities
  • the task duration, frequency, and complexity, and
  • the context and systems of work.

The physical working environment:

  • the plant, equipment, materials and substances used, and
  • the vehicles, buildings, structures that are workplaces.

The workers:

  • physical, emotional and mental capacities and needs.

Effective design of good work can radically transform the workplace in ways that benefit the business, workers, clients and others in the supply chain.

Failure to consider how work is designed can result in poor risk management and lost opportunities to innovate and improve the effectiveness and efficiency of work.

The principles for good work design support duty holders to meet their obligations under the WHS laws and also help them to achieve better business practice generally.

For the purposes of this handbook a work designer is anyone who makes decisions about the design or redesign of work. This may be driven by the desire to improve productivity as well as the health and safety of people who will be doing the work.

The WHY Principles

Why is good work design important?

Principle 1: Good work design gives the highest level of protection so far as is reasonably practicable

  • All workers have a right to the highest practicable level of protection against harm to their health, safety and welfare.
  • The primary purpose of the WHS laws is to protect persons from work-related harm so far as is reasonably practicable.
  • Harm relates to the possibility that death, injury, illness or disease may result from exposure to a hazard in the short or longer term.
  • Eliminating or minimising hazards at the source before risks are introduced in the workplace is a very effective way of providing the highest level of protection.

Principle 1 refers to the legal duties under the WHS laws. These laws provide the framework to protect the health, safety and welfare of workers and others who might be affected by the work. During the work design process workers and others should be given the highest level of protection against harm that is reasonably practicable.

Prevention of workplace injury and illness

Well-designed work can prevent work-related deaths, injuries and illnesses. The potential risk of harm from hazards in a workplace should be eliminated through good work design.

Only if that is not reasonably practicable should the design process minimise hazards and risks through the selection and use of appropriate control measures.

New hazards may inadvertently be created when changing work processes. If the good work design principles are systematically applied, potential hazards and risks arising from these changes can be eliminated or minimised.

Information: Reducing the speed of an inappropriately fast process line will not only reduce production errors, it can diminish the likelihood of a musculoskeletal injury and mental stress.

Principle 2: Good work design enhances health and wellbeing

  • Health is a “state of complete physical, mental, and social wellbeing, not merely the absence of disease or infirmity” (World Health Organisation).
  • Designing good work can help improve health over the longer term by improving workers’ musculoskeletal condition, cardiovascular functioning and their mental health.
  • Good work design optimises worker function and improves participation enabling workers to have more productive working lives.

Health benefits

Effective design aims to prevent harm, but it can also positively enhance the health and wellbeing of workers; for example, satisfying work and positive social interactions can help improve people’s physical and mental health.

As a general guide, the healthiest workers have been found to be three times more productive than the least healthy. It therefore makes good business sense for work design to support people’s health and wellbeing.

Information: Recent research has shown long periods of sitting (regardless of exercise regime) can lead to increased risk of preventable musculoskeletal disorders and chronic diseases such as diabetes. In an office environment, prolonged sitting can be reduced by allowing people to alternate between sitting or standing whilst working.

Principle 3: Good work design enhances business success and productivity

  • Good work design prevents deaths, injuries and illnesses and their associated costs, improves worker motivation and engagement and in the long-term improves business productivity.
  • Well-designed work fosters innovation, quality and efficiencies through effective and continuous improvement.
  • Well-designed work helps manage risks to business sustainability and profitability by making work processes more efficient and effective and by improving product and service quality.

Cost savings and productivity improvements

Designing-out problems before they arise is generally cheaper than making changes after the resulting event, for example by avoiding expensive retrofitting of workplace controls.

Good work design can have direct and tangible cost savings by decreasing disruption to work processes and the costs from workplace injuries and illnesses.

Good work design can also lead to productivity improvements and business sustainability by:

  • allowing organisations to adjust to changing business needs and to streamline work processes by reducing wastage, training and supervision costs
  • improving opportunities for creativity and innovation to solve production issues, reduce errors and improve service and product quality, and
  • making better use of workers’ skills resulting in more engaged and motivated staff willing to contribute greater additional effort.

A diagram of the why principles
[Figure 1.1, Good Work Design Health Benefits]

The WHAT Principles

What should be considered by those with design responsibilities?

Principle 4: Good work design addresses physical, biomechanical, cognitive and psychosocial characteristics of work, together with the needs and capabilities of the people involved

  • Good work design addresses the different hazards associated with work e.g. chemical, biological and plant hazards, hazardous manual tasks and aspects of work that can impact on mental health.
  • Work characteristics should be systematically considered when work is designed, redesigned or the hazards and risks are assessed.
  • These work characteristics should be considered in combination and one characteristic should not be considered in isolation.
  • Good work design creates jobs and tasks that accommodate the abilities and vulnerabilities of workers so far as reasonably practicable.

All tasks have key characteristics with associated hazards and risks, as shown in Figure 2 below:

Figure 2 – Key characteristics of work


Hazards and risks associated with tasks are identified and controlled during good work design processes and they should be considered in combination with all hazards and risks in the workplace. This highlights that it is the combination that is important for good work design.

Workers can also be exposed to a number of different hazards from a single task. For example, meat boning is a common task in a meat-processing workplace. This task has a range of potential hazards and risks that need to be managed, e.g. physical, chemical, biological, biomechanical and psychosocial. Good work design means the hazards and risks arising from this task are considered both individually and collectively to ensure the best control solutions are identified and applied.

Good work design can prevent unintended consequences which might arise if task control measures are implemented in isolation from other job considerations. For example, automation of a process may improve production speed and reduce musculoskeletal injuries but increase risk of hearing loss if effective noise control measures are not also considered.

Workers have different needs and capabilities; good work design takes these into account. This includes designing to accommodate them given the normal range of human cognitive, biomechanical and psychological characteristics of the work.

Information: The Australian workforce is changing. It is typically older with higher educational levels, more inclusive of people with disabilities, and more socially and ethnically diverse. Good work design accommodates and embraces worker diversity. It will also help a business become an employer of choice, able to attract and retain an experienced workforce.

Principle 5: Good work design considers the business needs, context and work environment.

  • Good work design is ‘fit for purpose’ and should reflect the needs of the organisation including owners, managers, workers and clients.
  • Every workplace is different so approaches need to be context specific. What is good for one situation cannot be assumed to be good for another, so off-the-shelf solutions may not always suit every situation.
  • The work environment is broad and includes: the physical structures, plant and technology, work layout, organisational design and culture, human resource systems, work health and safety processes and information/control systems.

The business organisational structure and culture, decision making processes, work environment and how resources and people are allocated to the work will directly and indirectly impact on work design and how well and safely the work is done.

The work environment includes the physical structures, plant, and technology. Planning for relocations, refurbishments or when introducing new engineering systems are ideal opportunities for businesses to improve their work designs and avoid foreseeable risks.

These are amongst the most common work changes a business undertakes yet good design during these processes is often quite poorly considered and implemented. An effective design following the processes described in this handbook can yield significant business benefits.

Information: Off the shelf solutions can be explored for some common tasks, however usually design solutions need to be tailored to suit a particular workplace.

Good work design is most effective when it addresses the specific business needs of the individual workplace or business. Typically work design solutions will differ between small and large businesses.

However, all businesses must eliminate or minimise their work health and safety risks so far as reasonably practicable. The specific strategies and controls will vary depending on the circumstances.

Table 1 below demonstrates how to step through the good work design process for small and large businesses.

Table 1 – steps in good work design for large and small businesses

| Good design steps | In a large business that is downsizing | In a small business that is undergoing a refit |
| --- | --- | --- |
| Management commitment | Senior management make their commitment to good work design explicit ahead of downsizing and may hire external expertise. | The owner tells workers about their commitment to designing-out hazards during the upcoming refit of the store layout to help improve safety and efficiency. |
| Consult | The consequences of downsizing and how these can be managed are discussed in senior management and WHS committee meetings with appropriate representation from affected work areas. | The owner holds meetings with their workers to identify possible issues ahead of the refit. |
| Identify | A comprehensive workload audit is undertaken to clarify opportunities for improvements. | The owner discusses the proposed refit with the architect and builder and gets ideas for dealing with issues raised by workers. |
| Assess | A cost benefit analysis is undertaken to assess the work design options to manage the downsizing. | The owner, architect and builder jointly discuss the proposed refit and any worker issues directly with workers. |
| Control | A change management plan is developed and implemented to appropriately structure teams and improve systems of work. Training is provided to support the new work arrangements. | The building refit occurs. Workers are given training and supervision to become familiar with new layout and safe equipment use. |
| Review | The work redesign process is reviewed against the project aims by senior managers. | The owner checks with the workers that the refit has improved working conditions and efficiency and there are no new issues. |
| Improve | Following consultation, refinement of the redesign is undertaken if required. | Minor adjustments to the fit out are made if required. |
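
[As an illustrative aside that is not part of the handbook: the seven steps in Table 1 can be tracked as a simple ordered checklist. The following is only a minimal sketch in Python. The step names come from Table 1, but the `WorkDesignProject` class, its methods and the example project name are hypothetical and assume that steps are completed strictly in the order shown.]

```python
from dataclasses import dataclass, field

# Step names are taken from Table 1; everything else is a hypothetical illustration.
STEPS = [
    "Management commitment",
    "Consult",
    "Identify",
    "Assess",
    "Control",
    "Review",
    "Improve",
]


@dataclass
class WorkDesignProject:
    """Tracks progress of one work design project through the Table 1 steps."""

    name: str
    completed: list = field(default_factory=list)

    def next_step(self):
        """Return the next outstanding step, or None when all steps are done."""
        for step in STEPS:
            if step not in self.completed:
                return step
        return None

    def complete(self, step):
        """Mark a step as done, refusing to skip ahead of the Table 1 order."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"Expected {expected!r} before {step!r}")
        self.completed.append(step)


# Usage: a small business preparing for a store refit (assumed example).
project = WorkDesignProject("Store refit")
project.complete("Management commitment")
project.complete("Consult")
print(project.next_step())  # Identify
```

[Enforcing the order is a design choice made for this sketch. In practice the handbook describes an iterative process, so a real tracker would probably allow a return to earlier steps after Review.]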

Principle 6: Good work design is applied along the supply chain and across the operational lifecycle.

  • Good work design should be applied along the supply chain in the design, manufacture, distribution, use and disposal of goods and the supply of services.
  • Work design is relevant at all stages of the operational life cycle, from start-up, routine operations, maintenance, downsizing and cessation of business operations.
  • New initiatives, technologies and change in organisations have implications for work design and should be considered.

Information: Supply chains are often made up of complex commercial or business relationships and contracts designed to provide goods or services. These are often designed to provide goods or services to a large, dominant business in a supply chain. The human and operational costs of poor design by a business can be passed up or down the supply chain.

Businesses in the supply chain can have significant influence over their supply chain partners’ work health and safety through the way they design the work.

Businesses may create risks and so they need to be active in working with their supply chains and networks to solve work health and safety problems and share practical solutions, for example for common design and manufacturing problems.

Health and safety risks can be created at any point along the supply chain, for example, loading and unloading causing time pressure for the transport business.

There can be a flow-on effect where the health and safety and business ‘costs’ of poor design may be passed down the supply chain. These can be prevented if businesses work with their supply chain partners to understand how contractual arrangements affect health and safety.

Procurement and contract officers can also positively influence their own organisation’s and others’ work health and safety throughout the supply chain by the good design of contracts.

When designing contractual arrangements businesses could consider ways to support good work design safety outcomes by:

  • setting clear health and safety expectations for their supply chain partners, for example through the use of codes of conduct or quality standards
  • conducting walk through inspections, monitoring and comprehensive auditing of supply chain partners to check adherence to these codes and standards
  • building the capability of their own procurement staff to understand the impacts of contractual arrangements on their suppliers, and
  • consulting with their supply chain partners on the design of good work practices.

Information: The road transport industry is an example of how applying this principle can help improve drivers’ health and safety and address issues arising from supply chain arrangements. For example, the National Heavy Vehicle Laws ‘chain of responsibility’ requires all participants in the road transport supply chain to take responsibility for driver work health and safety. Contracts must be designed to allow drivers to work reasonable hours, take sufficient breaks from driving and not have to speed to meet deadlines.

The design of products will strongly impact on both health and safety and business productivity throughout their lifecycles. At every stage there are opportunities to eliminate or minimise risks through good work design. The common product lifecycle stages are illustrated in Figure 3 below.

Figure 3 – common product lifecycle

A diagram of common product lifecycle

Information: For more information on the design of structures and of plant see ‘Safe design of structures’, ‘Managing the risks of plant in the workplace’ and other design guidance on the Safe Work Australia website.

The good work design principles are also relevant at all stages of the business life cycle. Some of these stages present particularly serious and complex work health and safety challenges such as during the rapid expansion or contraction of businesses. Systematic application of good work design principles during these times can achieve positive work health and safety outcomes.

View the Bureau of Meteorology case study on fatigue management.

New technology is often a key driver of change in work design. It has the potential to improve the quality of outputs, efficiency and safety of workers; however, introducing new technology could also introduce new hazards and unforeseen risks. Good work design considers the impact of the new initiatives and technologies before they are introduced into the workplace and monitors their impact over time.

Information: When designing a machine for safe use, how the maintenance will be undertaken in the future should be considered.

In most workplaces the information and communication technology (ICT) systems are an integral part of all business operations. In practice these are often the main drivers of work changes but are commonly overlooked as sources of workplace risks. Opportunities to improve health and safety should always be considered when new ICT systems are planned and introduced.

A diagram of the WHAT principles
[Figure 4, The ICT Triad]

The HOW Principles

Principle 7: Engage decision makers and leaders

  • Work design or redesign is most effective when there is a high level of visible commitment, practical support and engagement by decision makers.
  • Demonstrating the long-term benefits of investing in good work design helps engage decision makers and leaders.
  • Practical support for good work design includes allocation of appropriate time and resources to undertake effective work design or redesign processes.

Information: Leaders are the key decision makers or those who influence the key decision makers. Leaders can be the owners of a business, directors of boards and senior executives.

Leaders can support good work design by ensuring the principles are appropriately included or applied, for example in:

  • key organisational policies and procedures
  • proposals and contracts for workplace change or design
  • managers’ responsibilities and as key performance indicators
  • business management systems and audit reports
  • organisational communications such as a standing item on leadership meeting agendas, and
  • the provision of sufficient human and financial resources.

Good work design, especially for complex issues, will require adequate time and resources to consider and appropriately manage organisational and/or technological change. As with all business change, research shows leader commitment to upfront planning helps ensure better outcomes.

Managers and work health and safety advisors can help this process by providing their leaders with appropriate and timely information. This could include for example:

  • identifying design options which support both business outcomes and work health and safety objectives
  • assessing the risks and providing short and long term cost-benefit analysis of the recommended controls to manage these risks, and
  • identifying what decisions need to be taken, when and by whom to effectively design and implement the agreed changes.

Principle 8: Actively involve the people who do the work, including those in the supply chain and networks

  • Persons conducting a business or undertaking (PCBUs) must consult with their workers and others likely to be affected by work in accordance with the work health and safety laws.
  • Supply chain stakeholders should be consulted as they have local expertise about the work and can help improve work design for upstream and downstream participants.
  • Consultation should promote the sharing of relevant information and provide opportunities for workers to express their views, raise issues and contribute to decision making where possible.

Effective consultation and co-operation of all involved, with open lines of communication, will ultimately give the best outcomes. Consulting with those who do the work not only makes good sense, it is required under the WHS laws.

Information: Under the model WHS laws (s47), a business owner must, so far as is reasonably practicable, consult with ‘workers who carry out work for the business or undertaking who are, or are likely to be, directly affected by a matter relating to work health or safety.’ This can include a work design issue.

If more than one person has a duty in relation to the same matter, ‘each person with the duty must, so far as is reasonably practicable, consult, co-operate and co-ordinate activities with all other persons who have a duty in relation to the same matter’ (model WHS laws s46).

Workers have knowledge about their own job and often have suggestions on how to solve a specific problem. Discussing design options with them will help promote their ownership of the changes. See Code of practice on consultation.

Businesses that operate as part of a supply chain should consider whether the work design and changes to the work design might negatively impact on upstream or downstream businesses. The supply chain partners will often have solutions to logistics problems which can benefit all parties.

Principle 9: Identify hazards, assess and control risks, and seek continuous improvement

  • A systematic risk management approach should be applied in every workplace.
  • Designing good work is part of the business processes and not a one-off event.
  • Sustainability in the long-term requires that designs or redesigns are continually monitored and adjusted to adapt to changes in the workplace so as to ensure feedback is provided and that new information is used to improve design.

Good work design should systematically apply the risk management approach to the workplace hazards and risks. See Principle 4 for more details.

Typically good work design will involve ongoing discussions with all stakeholders to keep refining the design options.  Each stage in the good work design process should have decision points for review of options and to consult further if these are not acceptable. This allows for flexibility to quickly respond to unanticipated and adverse outcomes.

Figure 5 outlines how the risk management steps can be applied in the design process.

Continuous improvements in work health and safety can in part be achieved if the good work design principles are applied at business start up and whenever major organisational changes are contemplated. To be most effective, consideration of health and safety issues should be integrated into normal business risk management.

Figure 5 – Steps in the good work design process

A diagram of steps in the good work design process

Principle 10: Learn from experts, evidence, and experience

  • Continuous improvement in work design and hence work health and safety requires ongoing collaboration between the various experts involved in the work design process.
  • Various people with specific skills and expertise may need to be consulted in the design stage to fill any knowledge gaps. It is important to recognise the strengths and limitations of a single expert’s knowledge.
  • Near misses, injuries and illnesses are important sources of information about poor design.

Most work design processes will require collaboration and cooperation between internal and sometimes external experts. Internal advice can be sought from workers, line managers, technical support and maintenance staff, engineers, ICT systems designers, work health and safety advisors and human resource personnel.

Depending on the design issue, external experts may be required such as architects, engineers, ergonomists, occupational hygienists and psychologists.

Information: If you provide advice on work design options it is important to know and work within the limitations of your discipline’s knowledge and expertise. Where required make sure you seek advice and collaborate with other appropriate design experts.

For complex and high-risk projects, ideally a core group of the same people should remain involved during both the design and implementation phases with other experts brought in as necessary.

The type of expert will always depend on the circumstances. When assessing the suitability of an expert consider their qualifications, skills, relevant knowledge, technical expertise, industry experience, reputation, communication skills and membership of professional associations.

Information:  Is the consultant suitably qualified?
A suitably qualified person has the knowledge, skills and experience to provide advice on the specific design issue. You can usually check with the professional association to see if the consultant is certified or otherwise recognised by them to provide work design advice.

The decision to design or redesign work should be based on sound evidence. Typically this evidence will come from many sources such as both proactive and reactive indicators, information about a new technology or the business decisions to downsize, expand or restructure or to meet the requirements of supply chain partners.

Proactive and reactive indicators can also be used to monitor the effectiveness and efficiency of the design solution.

Information: Proactive indicators provide early information about the work system that can be used to prevent accidents or harm. These might include for example: key process variables such as temperature or workplace systems indicators such as the number of safety audits and inspections undertaken.

Reactive indicators are usually based on incidents that have already occurred. Examples include number and type of near misses and worker injury and illness rates.
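
[As an illustrative aside that is not part of the handbook: the distinction between proactive and reactive indicators can be made explicit when setting up a monitoring list. The following is only a minimal sketch in Python. The example indicators are those named in the text above, but the `IndicatorKind` enum, the `INDICATORS` lookup and the `group_by_kind` helper are hypothetical names introduced for illustration.]

```python
from enum import Enum


class IndicatorKind(Enum):
    PROACTIVE = "proactive"  # early information, used to prevent harm
    REACTIVE = "reactive"    # based on incidents that have already occurred


# Example indicators named in the text; holding them in a lookup table
# is an assumption made for this sketch.
INDICATORS = {
    "process temperature": IndicatorKind.PROACTIVE,
    "safety audits undertaken": IndicatorKind.PROACTIVE,
    "safety inspections undertaken": IndicatorKind.PROACTIVE,
    "near misses": IndicatorKind.REACTIVE,
    "worker injury and illness rate": IndicatorKind.REACTIVE,
}


def group_by_kind(indicators):
    """Group indicator names by kind so each set can be reviewed separately."""
    grouped = {kind: [] for kind in IndicatorKind}
    for name, kind in indicators.items():
        grouped[kind].append(name)
    return grouped


# Usage: list the indicators of each kind.
for kind, names in group_by_kind(INDICATORS).items():
    print(kind.value, "->", ", ".join(names))
```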

Useful information about common work design problems and solutions can also often be obtained from:

  • work health and safety regulators
  • industry associations and unions
  • trade magazines and suppliers, and
  • specific research papers.

A diagram of the HOW principles
[Figure 5.1, Sources of Work Design Information]

[Good Work Design] Summary

The ten principles of good work design can be applied to help support better work health and safety outcomes and business productivity. They are deliberately high level and should be broadly applicable across the range of Australian businesses and workplaces. Just as every workplace is unique, so is the way each principle can be applied in practice.

When considering these principles in any work design also ensure you take into account your local jurisdictional work health and safety requirements.

[END: Good Work Design]
