Categories: Snapshot

Welcome to the New Website!

Welcome to the New Website! It has been professionally redesigned by the awesome Sam Jusaitis to provide a much better user experience. Our thanks to him for doing such a great job.

The Main Pages

You can now browse through the main pages, which give you all the content that you might need, in the order that you choose:

  • Topics. This page showcases the main safety topics that we cover. So far, they are:
    • Start Here. Mostly free introductory videos for those new to safety;
    • Safety Analysis. A complete and in-depth suite of lessons on this subject; and
    • Work Health & Safety. All you need to know about Australian WHS legislation and practice.
  • About. Some information about The Safety Artisan – why you would choose safety tuition from me.
  • Connect. Here, you can sign up for free email newsletters, subscribe to our YouTube Channel, and follow us on social media.
  • Checkout. You’ll get there if you purchase any of the downloadable videos and content – but there’s plenty of free stuff too!

Coming Soon: FAQ

All the most Frequently Asked Questions on Google about safety, risk, and related topics. Answers too!

Welcome to the New Website Logo

Sam also designed the new logo, which reminds some people of the human eye. It was actually derived from the shapes of various warning signs, as shown below. Clever, eh?

Categories: Mil-Std-882E Safety Analysis

System of Systems Hazard Analysis

In this full-length (38-minute) session, The Safety Artisan looks at System of Systems Hazard Analysis, or SoSHA, which is Task 209 in Mil-Std-882E. SoSHA analyses collections of systems that are often put together to create a new capability, enabled by humans brokering between the different systems. We explore the aim, description, and contracting requirements of this Task, and an extended example to illustrate SoSHA. (We refer to other lessons for special techniques for Human Factors analysis.)

This is the seven-minute demo version of the full 38-minute video.

System of Systems Hazard Analysis: Topics

  • System of Systems (SoS) HA Purpose;
  • Task Description (2 slides);
  • Documentation (2 slides);
  • Contracting (2 slides);
  • Example (7 slides); and
  • Summary.

Transcript: System of Systems Hazard Analysis


Introduction

Hello everyone and welcome to the Safety Artisan. I’m Simon and today we’re going to be talking about System of Systems Hazard Analysis – a bit of a mouthful that. What does it actually mean? Well, we shall see.

System of Systems Hazard Analysis

So, for System of Systems Hazard Analysis, we’re using Task 209 as the description of what to do, taken from a military standard, 882E. But to be honest, it doesn’t really matter whether you’re doing a military system or a civil system, whatever it might be – if you’ve got a system of systems, then this will help you to do it.

Topics for this Session

Looking at what we’ve got coming up.

So, we look at the purpose of a system of systems – and by the way, if you’re wondering what that is, what I’m talking about is when we take different things that we’ve developed elsewhere, e.g. platforms, electronic systems, whatever it might be, and we put them together. Usually with humans gluing the system together somewhere, it must be said, to make it all tick and fit together. Then we want this collection of systems to do something new, to give us some new capability that we didn’t have before. So, that’s what I’m talking about when I say a system of systems. I’ll show you an example – it’s the best way. So, we’ve got a couple of slides on the task description, a couple of slides on documentation, and a couple of slides on contracting. Task 209 is a very short task, and therefore I’ve decided to go through an example.

So, we’ve got seven slides of an example of a system-of-systems safety case and safety case report that I wrote. And hopefully, that will illustrate far better than just reading out the description. That will also give us some issues that can emerge with systems of systems, and I’ll summarize those at the end.

SoSHA Purpose

So, let’s get on. I’m going to call it SoSHA for short: System of Systems Hazard Analysis. The purpose of SoSHA, Task 209, is to perform and document the analysis of the system of systems and identify unique system-of-systems hazards – things we don’t get from each system in isolation. This task is going to produce special requirements to deal with these hazards, which otherwise would not exist, because until we put the things together and start using them for something new, we’ve not done this before.

Task Description (T209) #1

Task description: As in all of these tasks, the contractor shall perform and document an analysis of the system of systems to identify hazards and mitigation requirements. A big part of this, as I said earlier, is that we tend to use people to glue these collections, these portfolios, of systems together, and humans are fantastic at doing that. Not always the ideal way of doing it, but sometimes it’s the only way of doing it within the constraints that we’ve got. The human is very important. The human will receive inputs from one or more systems and initiate outputs, within the analysis and in fact within the real world, to be honest, which is what we’re trying to analyse. That’s probably a better way of looking at it.

And we’ve got to provide traceability of all those hazards to – it says – architecture locations, interfaces, data and stakeholders associated with the hazard. This is particularly important because, with a system of systems, each system tends to come with its own set of stakeholders, its own physical location, its own interfaces, etc. The effort of managing all of those extraneous things and maintaining traceability is multiplied with every system you’ve got. In fact, I would say it scales with the square of the number of systems. In the example we’ll see, we had three systems being put together in a system of systems and, in effect, we had nine times the amount of work in that area, I would say. I think that’s a reasonable approximation.
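The scaling claim above can be sketched numerically. The model below (ordered system-to-system relationships, self-interactions included) is just one illustrative way to make “the square of” concrete; it is not from the standard:

```python
# A rough sketch of traceability workload scaling: if every system's
# interfaces, stakeholders, and locations must be traced against every
# system's (its own included), the relationship sets to manage grow with
# the square of the number of systems.

def traceability_sets(n_systems: int) -> int:
    """Ordered system-to-system relationships, self-interactions included."""
    return n_systems * n_systems

for n in (1, 2, 3, 5):
    print(f"{n} system(s) -> {traceability_sets(n)} relationship sets")
# 3 systems -> 9 sets, matching the 'nine times the work' estimate above
```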

Task Description (T209) #2

Part two of the task description: The contractor will assess the risk of each hazard and recommend mitigation measures to eliminate the hazards or – where, very often, we can’t eliminate the hazards – to reduce the associated risks. Then, as always with this standard, it says we’re going to use tables one, two and three, which are the severity and probability tables and the risk matrix that come with the standard. Unless, of course, we have created or tailored our own matrix, which we very often should do, but it isn’t often done – I’ll have to do a session on how to tailor a matrix.

Then the contractor has got to verify and validate the effectiveness of those recommended mitigation measures. Now, that’s a really good point and I often see that missed. People come up with control measures or mitigation measures but don’t always assess how effective they’re going to be. Sometimes you can’t so we just have to be conservative but it’s not always done as well as it could be.
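To make the tables concrete, here is a sketch of a risk matrix lookup in the style of Mil-Std-882E. The category names follow the standard, but the risk values in this matrix are illustrative only; check the standard itself, or your own tailored matrix, for the values you should actually use:

```python
# Simplified risk assessment matrix lookup, 882E-style. Severity runs from
# 1 (Catastrophic) to 4 (Negligible); probability from A (Frequent) to
# E (Improbable). Matrix values here are illustrative, not authoritative.

SEVERITY = {1: "Catastrophic", 2: "Critical", 3: "Marginal", 4: "Negligible"}
PROBABILITY = {"A": "Frequent", "B": "Probable", "C": "Occasional",
               "D": "Remote", "E": "Improbable"}

RISK_MATRIX = {                 # risk level per (probability, severity)
    "A": ["High", "High", "Serious", "Medium"],
    "B": ["High", "High", "Serious", "Medium"],
    "C": ["High", "Serious", "Medium", "Low"],
    "D": ["Serious", "Medium", "Medium", "Low"],
    "E": ["Medium", "Medium", "Medium", "Low"],
}

def risk_level(probability: str, severity: int) -> str:
    """Look up the risk assessment code for a hazard."""
    return RISK_MATRIX[probability][severity - 1]

print(risk_level("A", 1))  # a Frequent, Catastrophic hazard -> "High"
print(risk_level("E", 4))  # an Improbable, Negligible hazard -> "Low"
```

A tailored matrix would swap in your own categories and values, which is exactly why the definitions need to be agreed in the contract.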

Documentation (T209) #1

So, let’s move on. Documentation: whoever does the analysis – the standard assumes it’s a contractor – shall document the results, to include a description of the system of systems and the physical and functional characteristics of the system of systems, which is very important. Capturing these things is not a given. It’s not easy when you’ve got one system, but when you’ve got multiple systems, some of which are being misused to do something they’ve never done before, perhaps, then you’ve got to take extra care.

Then basically it says that when you get more detail of the individual systems, you need to supply it when it becomes available. Again, that’s important. And not only that: if the contractor supplies it, who’s going to check it? Who’s going to verify it? Etc.

Documentation (T209) #2

Slide two on documentation: We’ve got to describe the hazard analysis methods and techniques used, providing a description of each method and technique, and the assumptions and data used in support. This is important because I’ve seen lots of times where you get a hazard analysis and you only get the results. It’s impossible to verify those results or validate them, to say whether they’ve been done in the correct context. And it’s impossible to say whether the results are complete, or whether they’re up to date, or even whether they were analysing the correct system, because often systems come in different versions. So, how do you know that the version being analysed was the version you’re actually going to use? Without that description, you don’t know. So, it’s important to contract for these things.

And then hazard analysis results: what contents and formats do you want? It’s important to say. Also, we’re going to be looking to put the key items, the leading particulars from the results, into the hazard tracking system, which is more commonly known as a hazard log or a risk register, whatever it might be. It might be an Excel spreadsheet, it might be a very fancy database, but whatever it’s going to be, you’re going to have to standardize your fields and what things mean. Otherwise, the data is going to be a mess, of poor quality, and not very usable. So, again, you’ve got to contract for these things upfront, make clear definitions, and say what you want.
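A hazard tracking system record with standardized fields might be sketched like this. The field names and the sample hazard are hypothetical, purely to show the kind of definitions you would agree upfront:

```python
# A minimal, hypothetical hazard log record. In practice you would agree
# the field names, the allowed values, and their meanings in the contract.
from dataclasses import dataclass, field

@dataclass
class HazardRecord:
    hazard_id: str
    description: str
    severity: str             # from the agreed severity table
    probability: str          # from the agreed probability table
    risk_level: str           # from the agreed risk matrix
    mitigations: list[str] = field(default_factory=list)
    status: str = "Open"      # e.g. Open / Mitigated / Closed

log = [
    HazardRecord("SoS-001",
                 "Loss of voice link between radar operator and pilot",
                 severity="Critical", probability="Remote",
                 risk_level="Medium",
                 mitigations=["Backup voice channel", "Wave-off procedure"]),
]

# A standardized log lets you check data quality mechanically:
assert all(r.risk_level for r in log)   # every entry must be assessed
```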

Contracting #1

Contracting: implicitly, we’ve been talking about contracting already, but this is what the standard says. So, the request for proposal (RFP) or statement of work has got to include the following. Typically, we have an RFP before we’ve got a contract, so we need to have worked out what we need really early in the program or project, which isn’t always done very well. To work out what you need, the customer, the purchaser, has probably got to do some analysis of their own in order to work all this stuff out. And I know I say this every time with these tasks, but it is so important. You can’t just dump everything on the contractor and expect them to produce good results, because often the contractor is hamstrung. If you haven’t done your homework to help them do their work, then you’re going to get poor results and it’s not their fault.

So, we’ve got to impose the requirement for the task if we want it or need it. We’ve got to identify the functional disciplines – which specialists are going to do this work? Because very often the safety team are generalists. They do not have specialist technical knowledge in some of these areas, or maybe they are not human factors specialists. We need to bring in some human factors specialists and some user representatives – people who understand how the system will be used in real life and what the real-world constraints are. We need those stakeholders involved – that’s very important. We’ve got to identify those architectures and systems which make up the SoS – very important. And the concept of operations: SoS is very much about giving capability, so it’s all about what you are going to do with the whole thing when you put it together. How’s all that going to work?

Contracting #2

An interesting one, E, which is unique, I think, to Task 209: what are the locations of the different systems and how far apart are they? We might be dealing with systems where the distance between them is so great that transmission time becomes an issue for energy or communications. Let’s say you’re bouncing a signal from an aircraft or a drone around the world via a couple of satellites back to home base. There could be a significant lag in communications. So, we need to understand all of these things because they might give rise to hazards or reduce the effectiveness of controls.
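The transmission-lag concern lends itself to a back-of-envelope calculation. The sketch below assumes geostationary relay satellites and ideal straight up-and-down legs at light speed, which is a simplification for illustration:

```python
# Back-of-envelope one-way latency for relaying a signal via geostationary
# satellites, as in the drone example above. Real links add processing and
# routing delays on top of this ideal figure.
SPEED_OF_LIGHT_KM_S = 299_792
GEO_ALTITUDE_KM = 35_786          # geostationary orbit altitude

def relay_latency_s(satellite_hops: int) -> float:
    """Each hop is an up-leg plus a down-leg at light speed (ideal case)."""
    legs = 2 * satellite_hops
    return legs * GEO_ALTITUDE_KM / SPEED_OF_LIGHT_KM_S

print(f"one hop : {relay_latency_s(1):.2f} s")   # ~0.24 s
print(f"two hops: {relay_latency_s(2):.2f} s")   # ~0.48 s
```

Close to half a second of lag each way, before any processing delay, is easily enough to matter when a pilot is being talked down in bad weather.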

Part F: what analysis methods and techniques do you want to use, and any special data to be used? Again, with these collections of systems, that becomes more difficult to specify and more important. And then, do we have any specific hazard management requirements? For example, are we using standard definitions and the risk matrix from a standard, or have we got our own? That all needs to be communicated.

Example #1

So, that is the totality of the task. As you can see, there’s not much to Task 209, so I thought it would be much more helpful to use an example, an illustration. As they used to say on children’s TV, “Here’s one I made earlier”: a few years ago I had to produce a safety case report. I was the safety case report writer, and there was a small team of us generating the evidence and doing the analysis for the safety case itself.

What we were asked to do is to assure the safety of a system – in fact, it was two systems, but I’ll just treat it as one – for guiding aircraft onto ships in bad weather. So, all of these things existed beforehand. The aircraft were already in service. The ships were already in service. Some of the systems were already in service, but we were putting them together in a new combination. So, we had to take into account human factors. That was very important. We’ll see why in just a moment.

The operating environment was quite demanding. The whole point is to get the aircraft safely back to the ships in bad weather. In good weather, you could do it visually, but in bad weather, visual wasn’t going to cut it. So, we were being asked to operate in a much more difficult environment, and that changed everything and drove everything.

We’ve got to consider operating procedures because, as we’re about to see, people are gluing the systems together. So, how do they make it work? And we’ve also got to think about maintenance and management. Although, in actual fact, we didn’t really consider maintenance and management that much. As an ex-maintainer, this annoys me, but the truth is people are much more focused on getting their capability into service. Often, they think about support as an afterthought. We’ll talk about that one day.

Example #2

Here’s a little demonstration of our system of systems. Bottom right-hand corner, we’ve got the ship with lots of people on the ship. So, if the aircraft crashes into it that could be bad news, not just for the people in the aircraft, but for the people on the ship – big risks there!

We’ve got our radar mounted on the ship, so the ship is supplying the radar with power, control and data, telling it where to point, for example. Also, the ship might be inadvertently interfering with the radar. There are lots of other electronic systems on the ship. There are bits of the ship getting in the way of the radar, depending on where you’ve put it, and so on and so forth. So, the ship interacts with the radar, the radar interacts with the ship, and the radar is producing radiation. Could that be doing anything to the ship’s systems?

And then the radar is being operated. Now, I think that symbol is meant to indicate a DJ – we’ve got the DJ wearing headphones and a disc there – but it looks like a radar scope to me, so I’ve just hijacked it. That’s the radar operator, who is going to talk to the pilot and give the pilot verbal commands to guide them safely back to the ship. So, that’s how the system works.

In an ideal world, the ship would use the radar and then talk electronically direct to the aircraft and guide it – maybe automatically? That would be a much more sensible setup. In fact, that’s often the way it’s done. But in this particular case, we had to produce a bit of a – I hesitate to call it a lash-up because it was a multi-million-dollar project, but it was a bit of a lash-up.

So, there are the human factors. We’ve got a radar operator doing quite a difficult job and a pilot doing a very difficult job, trying to guide their aircraft back onto the ship in bad weather. How are they going to interact and perform? And then lastly, as I alluded to earlier, the aircraft and the ship do actually interact in a limited way. But of course, it’s a physical interaction, so you can actually hurt people, and if we get it wrong, the aircraft interacts with the surface of the ocean, which is very bad indeed for the aircraft. So, we’ve got to be careful there. So, there’s a little illustration of our system of systems.

Example #3

This is the top-level argument that we came up with – it’s in Goal Structuring Notation (GSN). But don’t worry too much about that – we’ll have a session on how to do GSN another time.

So, our top goal, or claim if you like, is that our system of systems is adequately safe for the aircraft to locate and approach the ship. That’s a very basic, very simple statement, but of course, the devil is in the detail, and all of that detail we call the context. So, surrounding that top goal or claim, we’ve got descriptions of the system, of the aircraft and of the ship. We’ve got a definition of what we mean by adequately safe, and we’ve got safety targets and reporting requirements.

So, what supports the top goal? We’ve got a strategy and after a lot of consultation and designing the safety argument, we came up with a strategy where we said, “We are going to show that all elements of the system of systems are safe and all the interactions are safe”. To do that, we had to come up with a scope and some assumptions to underpin that as well to simplify things. Again, they sit in the context, we just keep the essence of the argument down the middle.

And then underneath, we’ve got four subgoals. We aim to show that each system’s equipment is safe to operate, so it’s ready to be operated safely. Then, that each one is safe in operation, so it can be operated safely with real people, etc. Then, that all system safety requirements are satisfied for the whole collection, and finally, that all interactions are safe. So, if we can argue all four of those, we should have covered everything. Now, I suspect if I did this again today, I might do it slightly differently – maybe a little bit more elegantly – but that’s not the point. The point is, we came up with this and it worked.
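As a sketch, the shape of the argument just described might be captured as simple nested data. Real GSN has richer elements (goals, strategies, context, solutions, and so on), and the wording here just paraphrases the description above:

```python
# A minimal, hypothetical rendering of the GSN argument described above.
# This only captures the shape of the argument for illustration; it is not
# real GSN tooling or notation.
argument = {
    "top_goal": ("The system of systems is adequately safe for the "
                 "aircraft to locate and approach the ship"),
    "context": ["System, aircraft and ship descriptions",
                "Definition of 'adequately safe'",
                "Safety targets and reporting requirements"],
    "strategy": "Show all elements and all interactions are safe",
    "subgoals": ["Each component system is safe to operate",
                 "Each component system is safe in operation",
                 "All system-of-systems safety requirements are satisfied",
                 "All interactions are safe"],
}

# If we can argue all four subgoals, the strategy says we have covered everything.
print(len(argument["subgoals"]), "subgoals support the top goal")
```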

Example #4

So, I’m going to unpack each one very briefly, just to illustrate some points. First of all, each component system is safe to operate. Each of these systems, bar one, had all been purchased already, sometimes a long time ago. They all came with their own safety targets, their own risk matrices, etc, etc. So, we had to make sure that when an individual system said, “This is what we’ve got to achieve” that that was good enough for the overall system of systems. So, we had to make sure that each system met its own safety requirements and targets and that they were valid in context.

Now, you would think that double-checking existing systems would be a foregone conclusion. In reality, we discovered that the ship’s communication system and its combat data system were not as robust as assumed. Some practical issues were reported by stakeholders, and we also discovered some flaws in previous analysis that had been accepted a long time ago. Now, in the end, those problems didn’t change the price of fish, as we say. They didn’t make a difference to the overall system of systems.

The frailty of the ship’s comms got sorted out and we discovered it didn’t actually matter about the combat system. So, we just assumed that the data coming out of the combat system was garbage and it didn’t make any difference. However, we did upset a few stakeholders along the way. So beware, people don’t like discovering that a system that they thought was “tickety-boo” was not as good as they thought.

Example #5

The second goal was to show that the system of systems is safe in operation. So, we looked at the actual performance. We looked at test results of the radar and then also we were very fortunate that trials of the radar on the ship with aircraft were carried out and we were able to look at those trials reports. And once again, it emerged that the system in the real world wasn’t operating quite as intended, or quite as people had assumed that it would. It wasn’t performing as well. So, that was an issue. I can’t say any more about that but these things happen.

Also, a big part of the project was that we included the human element. As I’ve said before, we had pilots and we had radar air-traffic talk-down operators. So, we brought in some human factors specialists. They captured the procedures and tasks that the pilots and the radar operators had to perform using what’s called a Hierarchical Task Analysis, did some analysis of the tasks and what could go wrong, and then created a model of what the humans were doing and ran it through a simulation several thousand times. In that way, they did some performance modelling.

Now, they couldn’t give us an absolute figure on workload or anything like that, but fortunately, our new system was replacing an older system which was even more informally cobbled together than the one that we were bringing in. And so, the Human Factors specialists were able to compare human performance in the old system vs. human performance with the new system. Very fortunately, we were pleased to find that the predicted performance was far better with the new system. The new system was much easier to operate for both the pilots and the talk-down radar operators. So, that was terrific.
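The comparative modelling approach can be sketched as a small Monte Carlo simulation. The task steps and timing figures below are invented for illustration; they are not the project’s real data or the specialists’ actual model:

```python
import random

# A hedged sketch of comparative Monte Carlo performance modelling: simulate
# each procedure many times and compare old vs. new systems relatively,
# rather than claiming absolute workload figures. All numbers are invented.
OLD_SYSTEM_STEPS = [(5.0, 2.0), (8.0, 3.0), (6.0, 2.5)]  # (mean, sd) seconds
NEW_SYSTEM_STEPS = [(4.0, 1.0), (6.0, 1.5)]              # fewer, simpler steps

def mean_task_time(steps, runs=5000):
    """Simulate the whole procedure many times and average the total time."""
    totals = [sum(max(0.0, random.gauss(mu, sd)) for mu, sd in steps)
              for _ in range(runs)]
    return sum(totals) / len(totals)

random.seed(42)
old = mean_task_time(OLD_SYSTEM_STEPS)
new = mean_task_time(NEW_SYSTEM_STEPS)
print(f"old system: {old:.1f} s, new system: {new:.1f} s")
```

The point of the design is the relative comparison: the same model assumptions apply to both systems, so the difference between them is more trustworthy than either absolute figure.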

Example #6

So, the third one: all system-of-systems safety requirements are satisfied. Now, this goal is a bit more nebulous, but what it really came down to is that when you put things together, very often you get what’s called emergent behaviour. That is, things start to happen that you didn’t expect or didn’t predict based on the individual pieces. It’s like the saying, two plus two equals five. You get more out of a system – you get synergy, for good or ill – when you start putting different things together.

So, does the whole thing actually work? And broadly speaking, the answer was yes, it works very well. There were some issues. A good example: the old radar that they used to talk the planes down was a search radar, so the operator could see other traffic apart from the plane they were guiding in. Now, the operator being able to see other things is both good and bad. On the one hand, it gives them improved situational awareness, so they can warn off traffic if a collision situation develops. But it’s also bad, because it’s a distraction for the operator. So, it could have gone either way.

So, the new radar was specialized. It focused only on the aircraft being talked down, so the operator was blind to other traffic. That was great in terms of decreasing operator workload, and ultimately pilot workload as well. But would this increase the collision risk with other traffic? I’ll talk about that briefly in the summary.

Example #7

And then our final goal is to show that all interactions are safe between the guidance system, the aircraft and the ship. This was a non-trivial exercise because ships have large numbers of electronic systems and there’s a very involved process to go through to check that a new piece of kit doesn’t interfere with anything else or vice versa.

And also, of course, does the radiation from the new electronic system, the new radar, affect the ship? Because you’ve got weapons on the ship, and some of the explosive devices that the weapons use are electrically initiated. So, could the radiation set off an explosion? All of those things had to be checked, and that’s a very specialized area.

And then we’ve got: does the system interfere with the aircraft, and the aircraft with the system? What about the integration of the aircraft with the ship, and the ship with the aircraft? Yet another specialized area where there’s a particular way of doing things. And of course, the aircraft people want to protect the aircraft and the ship people want to protect the ship. So, getting those two to marry up is another one of those non-trivial exercises I keep referring to, but it all worked out in the end.

Summary

Points to note: when we’re doing a system of systems – I’ve got five points here; you can probably work out some more for yourself from what I’ve said – we’re putting together disparate systems. They’re different systems. They’ve been procured by different organizations, possibly to do different things. The stakeholders who bought them and care about them have different aims and objectives; they’ve got different agendas to each other. So, getting everyone to play nicely in the zoo can be challenging. And even with somebody pulling it all together at the top to say, “This has got to work. Get with the program, folks!” there’s still some friction.

In particular, you end up with large numbers of stakeholders. For example, we would have regular safety meetings, but I don’t think we ever had two meetings in a row with exactly the same attendees, because with a large group of people, people are always changing over and moving on. And that can be a challenge in itself. Secondly, we need to include the human in the loop in systems of systems, because typically that’s how we get them all to play together. We rely on human beings to do a lot of translation work, in effect. So, how do the humans cope?

A classic mistake with systems design is to design a difficult-to-operate system and then just expect the operator to cope. That can happen with things as seemingly trivial as amusement park rides – I did a lesson on learning lessons from an amusement park ride accident only a month or two ago, and even there it was a very complex system for two operators, neither of whom had total authority over the system or, to be honest, really had the full picture of what was going on. As a result, there were several dead bodies. So, how did the operators cope, and have we done enough to support them? That’s a big issue with a system of systems.

Thirdly – and this is always true with safety analysis, but especially so with a system of systems – real-world performance is important. You can do all the analysis in the world making certain assumptions, and the analysis can look fine, but in the real world, it’s not so simple. We have to do analysis that assumes the kit works as advertised, because you’ve got nothing else to go on until you get the test results, and you don’t get those until towards the end of the program. So, you’re going down a path assuming that things do what they say on the tin, and perhaps you then discover they don’t. Or they don’t do everything they say on the tin. Or they do what they say, and some other things that you weren’t expecting as well, and then you’ve got to deal with those issues.

And then fourthly, somewhat related to what I’ve just talked about: you put systems together in an informal way, perhaps, and then you discover how they actually get on – what really happens. In reality, once you get above a certain level of complexity, you’re not really going to discover all the emergent behaviours and consequences until you get things into service and they have clocked up a bit of time under different conditions in the real world. In fact, that was the case here, and I think with a system of systems, you’ve just got to assume that it’s sufficiently complex that that is the case.

Now, that’s not an unsolvable problem but, of course, how do you contract for that? Your contractors want you to accept their kit and pay them at a certain date or a certain point in the program, but you’re not going to find out whether it all truly works until it’s got into service and been in service for a while. So, how do you incentivize the contractor to do a good job, or indeed to correct defects in a timely manner? That’s quite a challenge for systems of systems, and it’s something that needs thinking about upfront.

And then finally, I’ve said, remember the bigger picture. When you’re doing analysis, you’ve made certain assumptions and set the scope, and it’s very easy to get fixated on that scope and those assumptions and forget that the real world is out there and is unpredictable. We had lots of examples of that on this program: the ship’s comms that didn’t always work properly, the combat system we couldn’t rely on, the radar that didn’t operate as well in the real world as it said in the spec, etc. There were lots of these things.

And, one example I mentioned was that with the new radar, the radar operator does not see any traffic other than the aircraft being guided in. So, there’s a loss of situational awareness there, and there’s a risk, maybe an increased risk, of collision with other traffic. That actually led to a disagreement in our team, because some people had got quite fixated on the analysis and didn’t like the suggestion that maybe they’d missed something. Although it was never put in those terms, that’s the way they took it. So, we need to be careful of egos. We might think we’ve done a fantastic analysis, and we’ve produced hundreds of pages of data and fault trees or whatever it might be, but that doesn’t mean that our analysis has captured everything, or that it’s completely captured what goes on in the real world, because that’s very difficult to do with such a complex system of systems.

So, we need to be aware of the bigger picture, even if it’s only qualitatively – somebody, a little voice, piping up somewhere saying, “What about this? Have we thought about that? I know we’re ignoring this because we’ve been told to, but is that the right thing to do?” Sometimes it’s good to be reminded of those things, and we need to remember the big picture.

Copyright Statement

Anyway, I’ve talked for long enough. It just remains for me to point out that all the text in quotations, in italics, is from the military standard, which is copyright free but this presentation is copyright of the Safety Artisan. As I’m recording this, it’s the 5th of September 2020.

For More …

And so, if you want more, please do subscribe to the Safety Artisan channel on YouTube – you can see the link there, or just search for Safety Artisan on YouTube and you’ll find us. Subscribe there to get free video lessons and free previews of paid content. And for all lessons, both paid and free, and other resources on safety topics, please visit the Safety Artisan at www.safetyartisan.com/ where I hope you’ll find much more good stuff that is helpful and enjoyable.

End: System of Systems Hazard Analysis

So, that is the end of the presentation and it just remains for me to say thanks very much for watching and listening. It’s been good to spend some time with you and I look forward to talking to you next time about environmental analysis, which is Task 210 in the military standard. That’ll be next month, but until then, goodbye.

Categories: Start Here, Work Health and Safety

Introduction to WHS Codes of Practice

In this 30-minute session, we introduce Australian WHS Codes of Practice (CoP). We cover: what they are and how to use them; their limitations; a list of (Federal) codes; further commentary; and where to get more information. This session is a useful prerequisite to all the other sessions on CoP.

Codes of Practice: Topics

  • What they are and how to use them;
  • Limitations;
  • List of CoP (Federal);
  • Further commentary; and
  • Where to get more information.

Codes of Practice: Transcript


Hello and welcome to the Safety Artisan, where you will find professional, pragmatic, and impartial teaching and resources on all things safety. I’m Simon, and today is the 16th of August 2020. Welcome to the show.

Introduction

So, today we’re going to be talking about Codes of Practice. In fact, we’re going to be introducing Codes of Practice and the whole concept of what they are and what they do.

Topics for this Session

What we’re going to cover is what Codes of Practice are and how to use them – several slides on that; a brief word on their limitations; a list of federal codes of practice – and I’ll explain why I’m emphasizing it’s the list of federal ones; some further commentary and where to get more information. So, all useful stuff I hope.

CoP are Guidance

So, Codes of Practice come in the work health and safety hierarchy below the Act and regulations. So, at the top you've got the WHS Act, then you've got the WHS regulations, which the Act calls up. And then you've got the Codes of Practice, which the Act also calls up. We'll see that in a moment. And what Codes of Practice do is provide practical guidance on how to achieve the standards of work health and safety required under the WHS Act and regulations, and some effective ways to identify and manage risks. So, they're guidance but, as we'll see in a moment, they're much more than guidance. So, as I said, the Codes of Practice are called up by the Act and they're approved and signed off by the relevant minister. So, they are a legislative instrument.

Now, a quick footnote. These words, by the way, are in the introduction to every Code of Practice. There's a little note here that says we're required to consider all risks associated with work, not just those risks that have associated Codes of Practice. So, we can't hide behind that. We've got to think about everything. There are Codes of Practice for several things, but not everything. Not by a long way.

…Guidance We Should Follow

Now, there are three reasons why Codes of Practice are a bit more than just guidance. So, first of all, they are admissible in court proceedings. Secondly, they are evidence of what is known about a hazard, risk, risk assessment, risk control. And thirdly, courts may rely, or regulators may rely, on Codes of Practice to determine what is reasonably practicable in the circumstances to which the code applies. So, what’s the significance of that?

So first of all, the issue about being admissible. If you're unfortunate enough to go to court and be accused of failing under WHS law, then you will be able to appeal to a Code of Practice in your defence and say, "I complied with the Code of Practice". They are admissible in court proceedings. However, beyond that, all bets are off. It's the court that decides what is an admissible defence, and that means lawyers decide, not engineers. Now, given that you're in court and the incident has already happened, a lot of the engineering stuff that we do about predicting the probability of things is no longer relevant. The accident has happened. Somebody has got hurt. All those probability arguments are dust in the wake of the accident. So, Codes of Practice are a reliable defence.

Secondly, the bit about evidence of what is known is significant, because when we're talking about what is reasonably practicable, the definition of reasonably practicable in Section 18 of the WHS Act talks about what is reasonable, or what should have been known, when people were anticipating the risk and managing it. Now, given that Codes of Practice were published back in 2012, there's no excuse for not having read them. So, they're pre-existing, they're clearly relevant, and the law has said that they're admissible in court. We should have read them, and we should have acted upon them. And there'll be no wriggling out of that. So, if we haven't done something that a CoP guided us to do, we're going to look very vulnerable in court – or in whatever court of judgement we're up against, whether it be public opinion, or trial by media, or whatever it is.

And thirdly, some CoP can be used to help determine what is SFARP. So, in some circumstances, if you're dealing with a risk that's described in a CoP, and the CoP is applicable, then if you followed everything in the CoP, you might be able to claim that just doing that means you've managed the risk SFARP. Why is that important? Because the only way we are legally allowed to expose people to risk is if we have eliminated or minimized that risk so far as is reasonably practicable, SFARP. That is the key test, the acid test, of "Have we met our risk management obligations?" And CoP are useful, maybe crucial, in two different ways for determining what is SFARP. So yes, they're guidance, but it's guidance that we ignore at our peril.

Standards & Good Practice

So, moving on. Codes of Practice recognize, and I reemphasize this is in the introduction to every code of practice, they’re not the only way of doing things. There isn’t a CoP for everything under the sun. So, codes recognize that you can achieve compliance with WHS obligations by using another method as long as it provides an equivalent or higher standard of work, health and safety than the code. It’s important to recognize that Codes of Practice are basic. They apply to every business and undertaking in Australia potentially. So, if you’re doing something more sophisticated, then probably CoP on their own are not enough. They’re not good enough.

And in my day job as a consultant, that’s the kind of stuff we do. We do planes, trains and automobiles. We do ships and submarines. We do nuclear. We do infrastructure. We do all kinds of complex stuff for which there are standards and recognized good practice which go way beyond the requirements of basic Codes of Practice. And many I would say, probably most, technical and industry safety standards and practices are more demanding than Codes of Practice. So, if you’re following an industry or technical standard that says “Here’s a risk management process”, then it’s likely that that will be far more detailed than the requirements that are in Codes of Practice.

And just a little note to say that, for those of us who love numbers and quantitative safety analysis, this statement about equivalent or higher standards of health and safety means we want requirements that are more demanding, more rigorous, or more detailed than the CoP. Not that the end result – the predicted probability of something happening – is better than what you would get with the CoP, because nobody knows what you would get with the CoP. That calculation hasn't been done. So, don't go down the rabbit hole of thinking "I've got to quantitatively demonstrate that what we're doing is better than the CoP." You haven't. It's all about demonstrating that the input requirements are more demanding, rather than the output, because that's never been done for CoP. So, you've got no benchmark to measure against in output terms.

The primacy of WHS & Regulations

A quick point to note that Codes of Practice are only guidance. They do refer to the relevant WHS Act and regulations, the hard obligations, and we should not be relying solely on codes in place of what it says in the WHS Act or the regulations. So, we need to remember that codes are not a substitute for the Act or the regs; rather, they are a useful introduction. The WHS Act and regulations are actually surprisingly clear and easy to read. But even so, there are 600 regulations. There are hundreds of sections of the WHS Act. It's a big read, and not all of it is going to be relevant to every business, by a long way. So, if you see a CoP that clearly applies to something that you're doing, start with the CoP. It will lead you into the relevant parts of the WHS Act and regulations. If you don't know them, have a read around the stuff that the CoP has pointed you to, and follow it up.

But also, CoP do represent a minimum level of knowledge that you should have. Again, start with CoP, don’t stop with them. So, go on a bit. Look at the authoritative information in the act and the regs and then see if there’s anything else that you need to do or need to consider. The CoP will get you started.

And then finally, it's a reference for determining SFARP. You won't see anything other than the definition of reasonably practicable in the Act. You won't see any practical guidance in the Act or the regulations on how to achieve SFARP. Whereas a CoP does give you a narrative that you can follow and understand, and maybe even paraphrase if you need to in some safety documentation. So, they are useful for that. There's also guidance on reasonably practicable, but we'll come to that at the end.

Detailed Requirements

It's worth mentioning that there are some detailed requirements in codes. Now, when I did this, I think I was looking at the risk management Code of Practice, which we'll go through later in another session. But in this example, there are a lot of requirements. So, every CoP has the statement "The words 'must', 'requires', or 'mandatory' indicate a legal requirement exists that must be complied with." So, if you see 'must', 'requires', or 'mandatory', you've got to do it. And in this example CoP that I was looking at, there are 35 'must's, 39 instances of 'required' or 'requirement' – that kind of wording – and three instances of 'mandatory'. Now, bear in mind that the sentence that introduces those things contains two instances of 'must', one of 'requires', and one of 'mandatory'. So, straight away you can ignore those four instances. But clearly, there are lots of instances here of 'must' and 'require' and a couple of 'mandatory'.

Then we've got 'should', which is used in this code to indicate a recommended course of action, while 'may' is used to indicate an optional course of action. So, the way I would suggest interpreting that – and this is just my personal opinion; I have never seen any good guidance on this – is: if it says 'recommended', then personally I would do it unless I could justify a good reason for not doing it. And if it says 'optional', then I would consider it, but I might discard it if I felt it wasn't helpful or there was a better way to do it. So, that would be my personal interpretation of how to approach those words. 'Recommended' – do it unless you can justify not doing it. 'Optional' – consider it, but you don't have to do it.

And in this particular one, we've got 43 instances of 'should' and 82 of 'may'. So, there's a lot of detailed information to consider in each CoP. So, read them carefully and comply with them where you have to, and that will repay you. The positive way to look at it is that CoP are there to help you. They're there to make life easy for you. Read them, follow them. The negative way to look at them is, "I don't need to do all it says in the CoP because it's only guidance". You can have that attitude if you want, but if you're in the dock or in the witness box in court, that's not going to be a good look. Let's move on.
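As an aside, if you want to reproduce this kind of tally for a CoP you are reading, a few lines of script will do it. This is just an illustrative sketch – the sample text and the word stems are invented for demonstration, not taken from any actual Code of Practice:

```python
import re
from collections import Counter

# Hypothetical sample text standing in for a Code of Practice
# (invented for illustration, not quoted from any real CoP).
cop_text = (
    "A duty holder must manage risks. Workers are required to follow "
    "reasonable instructions. It is mandatory to report incidents. "
    "You should review control measures. You may use a checklist."
)

# Signal words and what each indicates, per the wording convention
# that every CoP states up front. The stems are crude: 'require'
# catches requires/required/requirement, but 'may' would also
# catch words like 'maybe'.
signals = {
    "must": "legal requirement",
    "require": "legal requirement",
    "mandatory": "legal requirement",
    "should": "recommended",
    "may": "optional",
}

counts = Counter(
    {word: len(re.findall(rf"\b{word}\w*\b", cop_text, re.IGNORECASE))
     for word in signals}
)

for word, meaning in signals.items():
    print(f"{word!r}: {counts[word]} ({meaning})")
```

A real count would load the full CoP text from a file and, as noted above, subtract the instances in the sentence that defines the convention itself.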

Limitations of CoP

So, I've talked CoP up quite a lot; as you can tell, I'm a fan, because I like anything that helps us do the job. But they do have limitations. I've said before that there's a limited number of them and they're pretty basic. First of all, it's worth noting that there are two really generic Codes of Practice. First, there's the one on risk management. And then, secondly, there's the one on consultation, cooperation, and coordination. And I'll be doing sessions on both of those. Now, those apply to pretty much everything we do in the safety world. So, it's essential that you read them, no matter what you're doing, and comply with them where you have to.

Then there are other Codes of Practice that apply to specific activities or hazards, and some of them are very, very specific, like getting rid of asbestos, or welding, or spray painting – or whatever it might be – shot blasting. Those have clearly got a very narrow focus. So, you will know if you're doing that stuff. If you are doing welding, then clearly you need to read the welding CoP. If welding isn't part of your business or undertaking, you can forget it.

However, overall, there are fewer than 25 Codes of Practice. I can't be more precise, for reasons that we will come to in a moment. So, there's a relatively small number of CoP, and they don't cover complex things. They're not going to help you design a super-duper widget, or some software, or anything like that. They're not going to help you do anything complicated. Also, Codes of Practice tend to focus on the workplace, which is understandable. They're not much help when it comes to design trade-offs. They're great for the sort of foundational stuff – yes, we have to do all of this regardless. But when you get to questions of "How much is enough?" – in safety we sometimes ask, "How much margin do I need?", "How many layers of protection do I need?", "Have I done enough?" – CoP aren't going to be a lot of use in helping you with that kind of determination. But you do need to have made sure you've done everything in the CoP first, and then start thinking about those trade-offs, would be my advice. You're less likely to go wrong that way. So, start with your firm basis of what you have to do to comply, and then think "What else could I do?"

List of CoP (Federal) #1

Now, for information, you've got three slides here listing the Codes of Practice that apply at the federal or Commonwealth level of government in Australia. At the top, highlighted, I've already mentioned the 'How to manage WHS risks' code and the consultation, cooperation, and coordination code. Then we get into stuff like abrasive blasting, confined spaces, construction, demolition and excavation, and first aid. So, quite a range of stuff covered.

List of CoP (Federal) #2

Hazardous manual tasks – so basically human beings carrying and moving stuff. Managing and controlling asbestos, and removing it. Then we've got a couple on hazardous chemicals on this page, electrical risks, managing noise and preventing hearing loss, and stevedoring. There you go. So, if you're into stevedoring, then this CoP is for you. The highlighted ones we're going to cover in later sessions.

List of CoP (Federal) #3

Then we've got managing risks of plant in the workplace. There was going to be a Code of Practice for the design of plant, but that never saw the light of day, so we've only got guidance on that. We've got falls, the work environment and facilities, another one on safety data sheets for hazardous chemicals, preventing falls in housing – I guess because that's a very common accident – safe design of structures, spray painting and powder coating, and welding processes. So, that's the list of – I think it's 24 – Codes of Practice that are applied by Comcare, the federal regulator.

Commentary #1

Now, I'm being explicit about which regulator and which set of CoP, because they vary around Australia. Basically, the background is that the model Codes of Practice were developed by Safe Work Australia, which is a national body. But those model Codes of Practice do not apply of themselves – Safe Work Australia is not a regulator. Codes of Practice are implemented or enforced by the federal government and by most states and territories, with variations. And I say 'with variations' for a reason: not all states and territories impose all Codes of Practice. For example, I live in South Australia and, if you go and look at the SafeWork South Australia website, you will see that there are a couple of CoP that, for some reason, we don't enforce in South Australia. Why? I do not know. But you do need to think about these things depending on where you're operating.

It's also worth saying that WHS is not implemented in every state in Australia. Western Australia currently has plans to implement WHS but, as of 2020, I don't believe they've done so yet. Hopefully, it's coming soon. And Victoria, for some unknown reason, have decided they're just not going to play ball with everybody else. They've got no plans to implement WHS that I can find online; they're still using their old OHS legislation. So, it's not a universal picture in Australia, thanks to our rather silly version of government that we have here – forget I said that. So, if it's a Commonwealth workplace, then we apply the federal version of WHS and Codes of Practice. Otherwise, we use state or territory versions, and you need to see the local regulator's web page to find out what is applied where. The definition of a Commonwealth workplace is in the WHS Act, but also go and have a look at the Comcare website to see whom Comcare polices, because there are some nationalised industries that count as a Commonwealth workplace, and it can get a bit messy.

So, sometimes you may have to ask for advice from the regulator but go and see what they say. Don’t rely on what consultants say or what you’ve heard on the grapevine. Go and see what the regulator actually says and make sure it’s the right regulator for where you’re operating.

Commentary #2

What's to come? I'm going to do a session on the Risk Management Code of Practice and, associated with that, a session on the guidance on what is reasonably practicable. Now, that's guidance; it's not a Code of Practice. But again, it's been published, so we need to be aware of it, and it's also very simple and very helpful. I would strongly recommend looking at that guidance if you're struggling with SFARP or what it means – it's very good. I'll be talking about that soon. Also, I'm going to do a session on tolerability of risk, because you remember when I said "CoP aren't much good for helping you do trade-offs in design" and that kind of thing – they're really only good for simple stuff and compliance. Well, what you need to understand to deal with the more sophisticated problems is the concept of tolerability of risk. That'll help us do those things. So, I'm going to do a session on that.

I'm also going to do a session on consultation, cooperation, and coordination because, as I said before, that's universally applicable. If we're doing anything at a workplace, or with stuff that's going to a workplace, then we need to be aware of what's in that code. And then I'm also going to do sessions on plant, structures, and substances (or hazardous chemicals), because those are the absolute bread and butter of the WHS Act. If you look at the duties of designers, manufacturers, importers, suppliers, installers, et cetera, you will find requirements on plant, substances, and structures all the way through those clauses in the WHS Act. Those three things are key, so we're going to be talking about them.

Now, I mentioned before that there was going to be a Code of Practice on plant design, but it never made it; it's just guidance. So, we'll have a look at that if we can as well, copyright permitting. And then I want to look at electrical risks, because I think the electrical risks code is very useful. It covers electrical risks, of course, but it's also a useful teaching vehicle for designers and manufacturers to understand their obligations, especially if you operate abroad, or if you're importing stuff and want to know, "How do I know that my kit can be safely used in Australia?" If your piece of kit won't support the things that the electrical risks CoP requires in the workplace, then it's going to be difficult for your customers to comply. So, there's a hint there: if you want to sell your stuff successfully, here's what you need to be aware of. And that applies not just to electrical; I think it's a good vehicle for understanding how CoP can help us with our upstream obligations, even though CoP apply to a workplace. That session will really be about the imaginative use of Codes of Practice to help designers and manufacturers, etc.

And then I also want to talk about the noise Code of Practice, because noise brings in the concept of exposure standards. Now, generally, Codes of Practice don't quote many standards, and standards are certainly not mandatory, but noise is one of those areas where you have to have standards to say, "This is how we're going to measure the noise. This is the exposure standard, so you're not allowed to expose people to more than this." That brings in some very important concepts about health monitoring and exposure to certain things. It'll be useful if you're managing noise, but I think that session will be useful to anybody who wants to understand how exposure standards work and the requirements for monitoring workers' exposure to certain things – not just noise, but chemicals as well. We will be covering a lot of that in the session(s) on HAZCHEM.

Copyright & Attribution

I just want to mention that everything in quotes/in italics is downloaded from the Federal Register of Legislation, and I've gone to the federal legislation because I'm allowed to reproduce it under the licence under which it's published. So, the middle paragraph there – I'm required to point out that I sourced it from the Federal Register of Legislation website on that date. And for the latest information, you should always go to the website to double-check that the version that you're looking at is still in force and still relevant. And then, for more information on the terms of the licence, you can go and see my page at www.SafetyArtisan.com, because I go through everything that's required and you can check for yourself in detail.

For More…

Also, on the website, there are a lot more lessons and resources – some of them free, some of them you have to pay to access – but they're all there at www.safetyartisan.com. There's also the Safety Artisan page at www.patreon.com/SafetyArtisan, where you will see the paid videos. And I've got a channel on YouTube where all the free videos are. So, please go to the Safety Artisan channel on YouTube and subscribe, and you will automatically get a notification when a new free video pops up.

End

And that brings me to the end of the presentation, so thanks very much for listening. I’m just going to stop sharing that now. It just remains for me to say thank you very much for tuning in and I look forward to sharing some more useful information on Codes of Practice with you in the next session in about a month’s time. Cheers now, everybody. Goodbye.

There’s more!

You can find the Model WHS Codes of Practice here. Back to the Topics Page.

Categories
Human Factors System Safety

Introduction to Human Factors

In this 40-minute video, ‘Introduction to Human Factors’, I am very pleased to welcome Peter Benda to The Safety Artisan.

Peter is a colleague and Human Factors specialist, who has 23 years’ experience in applying Human Factors to large projects in all kinds of domains. In this session we look at some fundamentals: what does Human Factors engineering aim to achieve? Why do it? And what sort of tools and techniques are useful? As this is The Safety Artisan, we also discuss some real-world examples of how erroneous human actions can contribute to accidents, and how Human Factors discipline can help to prevent them.

Topics: Introduction to Human Factors

  • Introducing Peter;
  • The Joint Optimization Of Human-Machine Systems;
  • So why do it (HF)?
  • Introduction to Human Factors;
  • Definitions of Human Factors;
  • The Long Arm of Human Factors;
  • What is Human Factors Integration? and
  • More HF sessions to come…

Transcript: Introduction to Human Factors

Click Here for the Transcript

Transcript: Intro to Human Factors

Introduction

Simon:  Hello, everyone, and welcome to the Safety Artisan: Home of Safety Engineering Training. I'm Simon and I'm your host, as always. But today we are going to be joined by a guest: a Human Factors specialist, a colleague, and a friend of mine called Peter Benda. Now, Peter started as one of us, an ordinary engineer but, unusually perhaps for an engineer, he decided he didn't like engineering without people in it. He liked the social aspects and the human aspects, and so he began to specialize in that area. And today, after twenty-three years in the business, with a first degree and a master's degree in engineering with a Human Factors speciality, he's going to join us and share his expertise with us.

So that’s how you got into it then, Peter. For those of us who aren’t really familiar with Human Factors, how would you describe it to a beginner?

Peter:   Well, I would say it's The Joint Optimization Of Human-Machine Systems. So it's really focusing on designing systems – perhaps holistically would be a term that could be used – where we're looking at optimizing the human element as well as the machine element, and the interaction between the two. So that's really the key to Human Factors. And, of course, there are many dimensions from there: environmental, organizational, job factors, human and individual characteristics. All of these influence behaviour at work and health and safety. Another way to think about it is the application of scientific information concerning humans to the design of systems for human use – which I think most systems are.

Simon:  Indeed. Otherwise, why would humans build them?

Peter:   That’s right. Generally speaking, sure.

Simon:  So, given that this is a thing that people do then. Perhaps we’re not so good at including the human unless we think about it specifically?

Peter:   I think that's fairly accurate. I would say that, if you look across industries, the industries that are perhaps better at integrating Human Factors considerations into the design lifecycle have had to do so because of the accidents that have occurred in the past. You could probably say this about safety engineering as well, right?

Simon:  And this is true, yes.

Peter:   In a sense, you do it because you have to, because the implications of not doing it are quite significant. However, I would say the upshot, if you look at some of the evidence – and you see this also across software design and non-safety-critical industries or systems – is that taking into account human considerations early in the design process typically ends up in better system performance. You might have more usable systems, for example. Apple would be an example of a company that puts a lot of focus into human-computer interaction and optimizing the interface between humans and their technologies, ensuring that you can walk up and use them fairly easily. Now, as time goes on, one can argue about how well Apple is doing something like that, but they were certainly very well known for taking that approach.

Simon:  And reaped the benefits accordingly and became, I think, they were the world’s number one company for a while.

Peter:   That’s right. That’s right.

Simon:  So, thinking about the, “So why do it?” What is one of the benefits of doing Human Factors well?

Peter:   Multiple benefits, I would say. Clearly, safety in safety-critical systems, like health and safety; performance, so system performance; efficiency and so forth. Job satisfaction – and that has repercussions that go back into society, broadly speaking. If you have meaningful work, that has other repercussions, and that's sort of the angle I originally came into all of this from. But, you know, you could be looking at just the safety and efficiency aspects.

Simon:  You mentioned meaningful work: is that what attracted you to it?

Peter:   Absolutely. Absolutely. Yes. Yes, like I said I had a keen interest in the sociology of work and looking at work organization. Then, for my master’s degree, I looked at lean production, which is the Toyota approach to producing vehicles. I looked at multiskilled teams and multiskilling and job satisfaction. Then looking at stress indicators and so forth versus mass production systems. So that’s really the angle I came into this. If you look at it, mass production lines where a person is doing the same job over and over, it’s quite repetitive and very narrow, versus the more Japanese style lean production. There are certainly repercussions, both socially and individually, from a psychological health perspective.

Simon:  So, you get happy workers and more contented workers-

Peter:   –And better quality, yeah.

Simon:  And again, you mentioned Toyota. Another giant company that’s presumably grown partly through applying these principles.

Peter:   Well, they're famous for quality, aren't they? Famous for reliable, high-quality cars that go on forever. I mean, when I moved from Canada to Australia, Toyota has a very, very strong history here with the Land Cruiser, and the HiLux, and so forth.

Simon:  All very well-known brands here. Household names.

Peter:   Are known to be bombproof and can outlast any other vehicle. And the lean production system certainly has, I would say, quite a bit of responsibility for the production of these high-quality cars.

Simon:  So, we've spoken about how you got into it, and "What is it?" and "Why do it?" – or at least what it is in very general terms. But I suspect a lot of people listening will want to know what Human Factors is based on doing it – on how you do it. It's a long, long time since I did my Human Factors training – just one module in my master's – so could you take me through what Human Factors involves these days, in broad terms?

Peter:   Sure, I actually have a few slides that might be useful –  

Simon:  – Oh terrific! –

Peter:   –maybe I should present that. So, let me see how well I can share this. And of course, sometimes the problem is I’ll make sure that – maybe screen two is the best way to share it. Can you see that OK?

Simon:  Yeah, that’s great.

Introduction to Human Factors

Peter:   Intro to Human Factors. So, Stewart Dickinson, whom I work with at Human Risk Solutions, and I have prepared some material for some courses we taught to industry. I've some other material too, and I'll just flip to some of the key slides going through "What is Human Factors". So, let me try to get this working and I'll just flip through quickly.

Definitions of Human Factors

Peter:   So, as I've mentioned already, broadly speaking: environmental, organizational, and job factors, and human and individual characteristics, which influence behaviour at work in a way that can affect health and safety. That's a focus of Human Factors. Or: the application of scientific information concerning humans to the design of objects, systems, and environments for human use. You see a pattern here – fitting work to the worker. The term ergonomics is used interchangeably with Human Factors; it also depends on the country you learn this in or apply it in.

Simon:  Yes. In the U.K., I would be used to using the term ergonomics to describe something much narrower than Human Factors but in Australia, we seem to use the two terms as though they are the same.

Peter:   It does vary. You can say physical ergonomics, and I think that represents what people typically think of when they think of ergonomics: workstation design. So, sitting at their desk, heights of tables or desks, and reach, and so on. And, particularly given the COVID situation, so many people sitting at their desks are probably getting some repetitive strain –

Simon:  –As we are now in our COVID 19 [wo]man caves.

Peter:   That’s right! So that’s certainly an aspect of Human Factors work because that’s looking at the interaction between the human and the desk/workstation system, so to speak, on a very physical level.        

But of course, you have cognitive ergonomics as well, which looks at perceptual and cognitive aspects of the work. So Human Factors or ergonomics, broadly speaking, would be looking at these multi-dimensional facets of human interaction with systems.

Definitions of Human Factors (2)

Peter:   Some other examples might be the application of knowledge of human capabilities and limitations to the design, operation, and maintenance of technological systems – and I've got a little distilled, or summarized, bit on the right here: Human Factors applies scientific knowledge to the development and management of the interfaces between humans and rail systems. So, this is obviously in the rail context; broadly speaking, you're talking in terms of technological systems. That covers all of the people issues we need to consider to assure safe and effective systems or organizations.

Again, this is very broad. Engineers often don’t like these broad topics or broad approaches. I’m an engineer; I learned this through engineering, which is a bit different from how some people get into Human Factors.

Simon:  Yeah, I’ve met a lot of human factor specialists who come in from a first degree in psychology.

Peter:   That’s right. I’d say that’s fairly common, particularly in Australia and the UK. Although I know that you could take it here in Australia in some of the engineering schools, it’s fairly rare. There’s an aviation Human Factors program, I think, at Swinburne University. They used to teach it through mechanical engineering there as well – I did a bit of teaching on that – and I’m not across all of the universities in Australia, but there are a few. I think the University of the Sunshine Coast has quite a significant group at the moment that came from, or had some connection to, Monash before that.

            When I’m doing this work, I think about “What existing evidence do we have?” – the existing knowledge base with respect to the human interactions with the system. For example, working with a rail transport operator, they will already have a history of incidents or a history of issues, and we’d be looking to improve performance or reduce the risk associated with the use of certain systems. So we’re really focusing on the evidence that exists either already in the organization or out there in the public domain, through research papers and studies and accident analyses and so forth. I think, much like safety engineering, there would be quite a few similarities in terms of the evidence base –

Simon:  – Indeed.

Peter:   – Or creating that evidence through analysis. So, using some analytical techniques, various Human Factors methods and that’s where Human Factors sort of comes into its own. It’s a suite of methods that are very different from what you would find in other disciplines.

Simon:  Sure, sure. So, can you give us an overview of these methods, Peter?

Peter:   I’m trying to think if I have a slide for this. Hopefully, I do.

Simon:  Oh, sorry. Have I taken you out of sequence?

Peter:   No, no. Not out of sequence. Let me just flip through, and take a look at –

The Long Arm of Human Factors

Peter:   This is probably a good sort of overview of the span of Human Factors, and then we can talk about the sorts of methods that are used for each of these – let’s call them – dimensions. So, we have what’s called the long arm of Human Factors. It’s a large range of activities: from the very physical ergonomics, as we were talking about – e.g. sitting at a desk and so on, manual handling, workplace design – moving to interface design with respect to human-machine interfaces (HMIs, as they’re called) or user interfaces. There are manual handling techniques and analysis techniques – you might be using something like a task analysis combined with a NIOSH lifting equation and so on. For workplace design, you’d be looking at anthropometric data. So, you would have a dataset that’s hopefully representative of the population you’re designing for, and you may have quite specific populations. So Human Factors engineering is fairly extensively used, I would say, in military projects – in the military context –
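[Editor’s note: Peter mentions combining task analysis with the NIOSH lifting equation. As a rough illustration only, the revised (1991) equation in its metric form multiplies a 23 kg load constant by a set of penalty multipliers. The frequency and coupling multipliers come from published tables and are treated here as given inputs; this sketch is no substitute for the NIOSH applications manual.]

```python
def niosh_rwl(h_cm, v_cm, d_cm, a_deg, fm=1.0, cm=1.0):
    """Recommended Weight Limit (kg), revised NIOSH lifting equation,
    metric form. h: horizontal hand distance; v: initial hand height;
    d: vertical travel distance; a: trunk twist in degrees. fm (frequency)
    and cm (coupling) multipliers come from published tables and are
    passed in as given values."""
    lc = 23.0                               # load constant, kg
    hm = min(1.0, 25.0 / h_cm)              # horizontal multiplier
    vm = 1.0 - 0.003 * abs(v_cm - 75.0)     # vertical multiplier
    dm = min(1.0, 0.82 + 4.5 / d_cm)        # distance multiplier
    am = 1.0 - 0.0032 * a_deg               # asymmetric multiplier
    return lc * hm * vm * dm * am * fm * cm

def lifting_index(load_kg, rwl_kg):
    """LI = load / RWL; values above 1.0 suggest an elevated-risk lift."""
    return load_kg / rwl_kg

# Example: hands 30 cm out, at 75 cm height, lifting through 40 cm, no twist.
rwl = niosh_rwl(h_cm=30, v_cm=75, d_cm=40, a_deg=0)
```

With these example values the RWL comes out just under 18 kg, so a 20 kg lift would give a lifting index above 1.0.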

Simon:  – Yes.

Peter:   – And there’s a set of standards – the US Mil-Std-1472G, for example – that gives not only manual handling standards or guidelines but also workplace design guidelines. And the workplace, in a military sense, can be a vehicle, or on a ship, or on a base, and so forth.

Interface design – So, if you’re looking at it from a methods perspective, you might have usability evaluations, for example. You might do workload studies and so forth, looking at how well the interface supports particular tasks or achieving certain goals.

            Human error – There are human error methods that typically leverage off of task models. So, you’d have a task model and you would look at, for that particular task, what sorts of errors could occur – and there are structured methods for that.
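[Editor’s note: a toy sketch of what Peter describes – pairing a task model with candidate error modes, in the spirit of task-based error identification methods such as SHERPA. The task steps and error taxonomy below are invented for illustration.]

```python
# Hypothetical task model: each step is paired with candidate error modes
# drawn from a simple taxonomy (omission, timing, order, selection).
TASK_MODEL = {
    "1. Reduce throttle":           ["omitted", "too late"],
    "2. Confirm speed below limit": ["check omitted", "misread"],
    "3. Deploy control surface":    ["too early (out of order)", "wrong control"],
}

def error_table(task_model):
    """Flatten the model into (step, error mode) pairs for review, so each
    credible error can be assessed for consequence and recovery."""
    return [(step, mode) for step, modes in task_model.items() for mode in modes]

rows = error_table(TASK_MODEL)
```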

Simon:  Yes, I remember human task analysis – I saw colleagues use that on a project I was working on. It seemed quite powerful for capturing these things.

Peter:   It is, and you have to pragmatically choose the level of analysis, because you could go down to a very granular level of detail. But that may not be useful, depending on the sort of system design you’re doing, the amount of money you have, and how critical the task is. So, you might have a significantly safety-critical task, and that might need quite a detailed analysis. An example there – you can look up the accident analysis online; I believe it’s the Virgin Galactic test flight, one of these test flights in the U.S. that I have somewhere in my archive of accident analyses – where the FAA had approved the test flights to go ahead, and there was a task where – I hope I don’t get this completely wrong – the two pilots (a pilot and a co-pilot) in this test aeroplane had to go to high altitude in this near-space vehicle. They were moving at quite a high speed, and there was a particular task where, I think, they had to slow down the aeroplane by reducing the throttle, and then at a certain point – a certain speed – you could deploy, or control, the ailerons or some such wing-based device, and the task order was very important. What had happened was the pilot or the co-pilot had performed the task slightly out of order – doing one thing first before another – and that led to the plane breaking up. Fortunately, one of the pilots survived; unfortunately, one didn’t.

Simon:  So, very severe results from making a relatively small mistake.

Peter:   So that’s a task-order error, which is very easy to make. And if the system had been designed in a way that prevented the capability to execute that action at that point, that would have been a safer design. At that level, you might be going down to what gets called keystroke-level analysis and so on –
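[Editor’s note: the design fix Peter alludes to – making the system refuse an action until its preconditions hold – can be sketched as a simple interlock. Every name and threshold below is hypothetical, not taken from any accident report.]

```python
class ActionInterlock:
    """Hypothetical interlock: a speed-sensitive action is refused until
    the vehicle has slowed below a safe threshold, so the task cannot be
    performed out of order."""

    def __init__(self, safe_speed):
        self.safe_speed = safe_speed   # illustrative threshold only
        self.armed = False

    def request_action(self, current_speed):
        # The guard enforces the task order: decelerate first, then act.
        if current_speed > self.safe_speed:
            return False               # refuse: precondition not met
        self.armed = True
        return True

interlock = ActionInterlock(safe_speed=100.0)
```

The point of the sketch is that the out-of-order action is not merely discouraged by procedure; the design makes it impossible to execute.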

Simon:  – Where it’s justified, yes.

Peter:   Task analysis is, I think, probably one of the most common tools used. You also have workload analysis – looking at, for example, interface design. I know on some of the projects we were working on together, Simon, workload was a consideration. There are different ways to measure workload. There’s the NASA TLX, which is a subjective workload questionnaire, essentially, that’s done post-task, but it’s been shown to be quite reliable and valid as well. So, that instrument is used, and there are a few others; it depends on the sort of study you’re doing, the amount of time you have and so forth. Let me think – that’s workload analysis.
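[Editor’s note: for reference, the NASA TLX score Peter mentions is computed from six 0–100 subscale ratings; in the weighted form, each rating is weighted by how often that subscale was chosen across the 15 pairwise comparisons. A minimal sketch:]

```python
SCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")

def raw_tlx(ratings):
    """Raw TLX: unweighted mean of the six subscale ratings (each 0-100)."""
    return sum(ratings[s] for s in SCALES) / len(SCALES)

def weighted_tlx(ratings, tallies):
    """Weighted TLX: each rating is weighted by the number of times its
    subscale was chosen in the 15 pairwise comparisons (tallies sum to 15)."""
    assert sum(tallies[s] for s in SCALES) == 15
    return sum(ratings[s] * tallies[s] for s in SCALES) / 15.0
```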

Safety culture – I wouldn’t say that’s my forte. I’ve done a bit of work on safety culture, but that’s more organizational, and the methods there tend to be more around culpability models and implementing those into the organizational culture.

Simon:  So, more governance type issues? That type of thing?

Peter:   Yes. Governance and – whoops! Sorry, I didn’t mean to do that. I’m just looking at the systems and procedure design. The ‘e’ is white so it looks like it’s a misspelling there. So it’s annoying me …

Simon:  – No problem!

Peter:   Yes. So, there are models I’ve worked with at organizations such as rail organizations, where they look at governance, but also at appropriate interventions. So, if there’s an incident, what sort of intervention is appropriate? Essentially, you use a model of culpability and human error, and then overlay that – use it as a lens through which to analyse the incident – and then appropriately either train employees or management and so on. Or perhaps it was a form of violation – a wilful violation, as it may be –

Simon:  – Of procedure?

Peter:   Yeah, of procedure and so on versus a human error that was encouraged by the system’s design. So, you shouldn’t be punishing, let’s say, a train driver for a SPAD if the –

Simon:  – Sorry, that’s a Signal Passed At Danger, isn’t it?

Peter:   That’s right. Signal Passed At Danger. So, it’s certainly possible that the way the signalling is set up leads to a higher chance of human error. You might have multiple signals at a location and it’s confusing to figure out which one to attend to, and you may misread and then end up SPADing, and so on. So, there are, for example, clusters of SPADs that will be analysed, and then the appropriate analysis will be done. And you wouldn’t want to be punishing drivers if it seemed to be a systems design issue.

Simon:  Yes. I saw a vivid illustration of that on the news, I think, last night. There was a news article about an air crash that tragically killed three people a few months ago here in South Australia. The news report was saying it was human error, but when they actually got into reporting what had happened, it was pointed out that the pilot being tested was flying a twin-engine aeroplane, and they were doing an engine-failure-after-take-off drill. And the accident report said that the procedure they were using allowed them to do that engine failure drill at too low an altitude. So, if the pilot failed to take the correct action very quickly – bearing in mind this is a pilot being tested because they are undergoing training – there was no time to recover. So, therefore, the aircraft crashed. And I thought, “Well, it’s a little bit unfair just to say it’s human error when they were doing something that was intrinsically inappropriate for a person of that skill level.”

Peter:   That’s an excellent example, and you hear this in the news a lot: human error, human error, human error as the cause. You saw this, I think, with the recent Boeing problems with the flight control system for the new 737s. And of course, there will be reports – some of the interim reports already talk about some of the Human Factors issues inherent in that – and I would encourage people to look up the publicly available documentation on that –

Simon:  – This is the Boeing 737 Max accidents in Indonesia and in Ethiopia, I think.

Peter:   That’s correct. That’s correct. Yes, absolutely. And pilot error was used as the general explanation, but under further analysis, you start looking at that error. That, so to speak, “error” perhaps has other causes, which are systems design causes, perhaps. So these things are being investigated but have been written about quite extensively. And you can look at, of course, any number of aeroplane accidents and so on. There’s a famous Air France one, flying from Brazil to Paris, from what I recall – it might have been Rio de Janeiro to Paris – where the pitot –

Simon:  – Yeah, pitot probes got iced up.

Peter:   Probes, they iced up, and it was dark. So, the pilots didn’t have any ability to gauge what was happening by looking outside – I believe it was dark, or it might have been a storm; there was some difficulty in gauging what was going on outside of the aeroplane – and there, again, misreads. So, stall alarms going off and so on, I believe. There were some misreadings of the airspeed coming from the sensors, essentially, and then the pilots acted according to that information, but that information was incorrect. So, you could say there was probably a cascade of issues that occurred there, and there’s a fairly good analysis one can look up that looks at the design – I believe it was an Airbus; it was the design of the Airbus – where we had one pilot providing an input in one direction to the controls and the other pilot in the other direction. There were a number of things that broke down. And typically, you’ll see this in accidents: you’ll have a cascade as they’re trying to troubleshoot and can’t figure out what’s going on; they’ll start applying various approaches to try and remedy the situation, and people begin to panic and so on.

            And you have training techniques, like crew resource management, which certainly has a strong Human Factors element – it comes out of the Human Factors world – and which looks at how to have teams, in cockpits and in other situations, working effectively in emergencies. And that’s, of course, after analysing failures.

Simon:  Yes, and I think CRM, crew resource management, has been adopted not just in the airline industry, but in many other places as well, hasn’t it?

Peter:   Operating theatres, for example. There’s quite a bit of work from the 90s that started with, I think, David Gaba, who I think was at Stanford – this is all from memory – that then looked at operating theatres. In fact, the Monash Medical Centre in Clayton had a simulation centre for operating theatres where they were applying these techniques to training operating theatre personnel: surgeons, anaesthetists, nurses and so forth.

Simon:  Well, thanks, Peter. And I’m sorry – I think I hijacked your presentation, but –

Peter:   It’s not really a presentation anyway; it was more of a guided discussion. We were talking about methods, weren’t we? And it’s easy to go from methods to talking about accidents, because then we talk about the application of some of these methods – or how these methods are applied to prevent accidents from occurring.

Simon:  Cool. Well, thanks very much, Peter. Maybe next time we have a chat I’ll let you talk through your slides and we’ll have a more in-depth look across the whole breadth of Human Factors.

Peter:   So that’s probably a good little intro for the moment anyway. Perhaps I might pull up one slide on Human Factors integration before we end.

Simon:  Of course.

Peter:   I’ll go back a few slides here.

What is Human Factors Integration?

Peter:   And so what is Human Factors integration? I was thinking about this quite a bit recently because I’m working on some complex projects – not only complex but quite large engineering projects, with lots of people, lots of different groups involved, different contracts and so forth – and the integration issues that occur. They’re not only Human Factors integration issues; there are larger-scale integration issues, engineering integration issues. Generally speaking, this is something I think that projects often struggle with. And I was really thinking about the Human Factors angle and Human Factors integration. That’s about ensuring that all of the HF (Human Factors) issues in a project are considered and controlled throughout the project, and deliver the desired performance and safety improvements. So, three functions of Human Factors integration:

  • confirm the intended system performance objectives and criteria;
  • guide and manage the Human Factors aspects of the design cycle so that negative aspects don’t arise and prevent the system reaching its optimum performance level; and
  • identify and evaluate any additional Human Factors safety aspects, now or as found in the safety case.

You’ll find, particularly in these complex projects, that the interfaces are the difficulty. You might have quite a large project with sub-projects working on particular components. Let’s say one is working more on the civil/structural elements, and maybe space provisioning and so on, while another is working more on control systems. And the integration between those becomes quite difficult, because you don’t really have that Human Factors integration function working to integrate those two large components. Typically, it works within those focused project groupings – that’s the way I’d put it. Does that make sense?

Simon:  Yeah. Yeah, absolutely.

Peter:   I think that’s one of the big challenges that I’m seeing at the moment: you have a certain amount of time and money and resources – this would be common for other engineering disciplines – and the integration work often falls by the wayside, I think. And that’s where I think a number of the ongoing Human Factors issues are going to be cropping up in some of these large-scale projects for the next 10 to 20 years, both operationally and perhaps in safety as well. Of course, we want to avoid –

Simon:  – Yes. I mean, what you’re describing sounds very familiar to me as a safety engineer, and I suspect a lot of engineers of all disciplines who work on large projects are going to recognize that as a familiar problem.

Peter:   Sure. You can think about it: if you’ve got the civil and space-provisioning aspect of a project, and another group is doing what goes into, let’s say, a control room or a maintenance room and so on, it may be that things are constrained in such a way that the design of the racks in the room has to be done in a way that makes the work more difficult for maintainers. And it’s hard to optimize these things, because these are complex projects and complex considerations, and a lot of people are involved in them. The nature of engineering work is typically to break things down into little elements, optimize those elements, and bring them all together.

Simon:  –Yes.

Peter:   Human Factors tends to – well, you can do that in Human Factors as well, but I would argue – certainly what attracted me to it is that you tend to have to take a more holistic approach to human behaviour and performance in a system.

Simon:  Absolutely.

Peter:   Which is hard.

Simon:   Yes, but rewarding. And on that note, thanks very much, Peter. That’s been terrific. Very helpful. And I look forward to our next chat.

Peter:   For sure. Me too. Okay, thanks!

Simon:  Cheers!

Outro

Simon:  Well, that was our first chat with Peter on the Safety Artisan, and I’m looking forward to many more. So, it just remains for me to say thanks very much for watching and supporting the work we’re doing and what we’re trying to achieve. I look forward to seeing you all next time. Okay, goodbye.

End: Introduction to Human Factors

Categories
Human Factors

Transcript: Intro to Human Factors

This post is the Transcript: Intro to Human Factors.

In the 40-minute video, I’m joined by a friend, colleague and Human Factors specialist, Peter Benda. Peter has 23 years of experience in applying Human Factors to large projects in all kinds of domains. In this session we look at some fundamentals: what does Human Factors engineering aim to achieve? Why do it? And what sort of tools and techniques are useful? As this is The Safety Artisan, we also discuss some real-world examples of how Human Factors can contribute to accidents or help to prevent them.

Transcript: Intro to Human Factors

Introduction

Simon:  Hello, everyone, and welcome to the Safety Artisan: Home of Safety Engineering Training. I’m Simon and I’m your host, as always, but today we are going to be joined by a guest: a Human Factors specialist, a colleague, and a friend of mine called Peter Benda. Now, Peter started as one of us, an ordinary engineer, but unusually, perhaps, for an engineer, he decided he didn’t like engineering without people in it. He liked the social aspects and the human aspects, and so he began to specialize in that area. And today, after twenty-three years in the business, with a first degree and a master’s degree in engineering with a Human Factors speciality, he’s going to join us and share his expertise with us.

So that’s how you got into it then, Peter. For those of us who aren’t really familiar with Human Factors, how would you describe it to a beginner?

Peter:   Well, I would say it’s The Joint Optimization Of Human-Machine Systems. So it’s really focusing on designing systems – perhaps holistically would be a term that could be used – where we’re looking at optimizing the human element as well as the machine element, and the interaction between the two. So that’s really the key to Human Factors. And, of course, there are many dimensions from there: environmental, organizational, job factors, human and individual characteristics. All of these influence behaviour at work and health and safety. Another way to think about it is the application of scientific information concerning humans to the design of systems – systems for human use, which I think most systems are.

Simon:  Indeed. Otherwise, why would humans build them?

Peter:   That’s right. Generally speaking, sure.

Simon:  So, given that this is a thing that people do, then – perhaps we’re not so good at including the human unless we think about it specifically?

Peter:   I think that’s fairly accurate. I would say that if you look across industries, the industries that are perhaps better at integrating Human Factors considerations into the design lifecycle have had to do so because of the accidents that have occurred in the past. You could probably say this about safety engineering as well, right?

Simon:  And this is true, yes.

Peter:   In a sense, you do it because you have to, because the implications of not doing it are quite significant. However, I would say the upshot, if you look at some of the evidence – and you see this also across software design and non-safety-critical industries or systems – is that taking into account human considerations early in the design process typically ends up in better system performance. You might have more usable systems, for example. Apple would be an example of a company that puts a lot of focus into human-computer interaction: optimizing the interface between humans and their technologies, and ensuring that you can walk up and use it fairly easily. Now, as time goes on, one can argue about how well Apple is doing something like that, but they were certainly very well known for taking that approach.

Simon:  And reaped the benefits accordingly and became, I think, they were the world’s number one company for a while.

Peter:   That’s right. That’s right.

Simon:  So, thinking about the, “So why do it?” What is one of the benefits of doing Human Factors well?

Peter:   Multiple benefits, I would say. Clearly, safety in safety-critical systems, like health and safety; performance, so system performance; efficiency and so forth; and job satisfaction – and that has repercussions that go back into, broadly speaking, society. If you have meaningful work, that has other repercussions, and that’s sort of the angle I originally came into all of this from. But, you know, you could be looking at just the safety and efficiency aspects.

Simon:  You mentioned meaningful work: is that what attracted you to it?

Peter:   Absolutely. Absolutely. Yes. Yes, like I said, I had a keen interest in the sociology of work and looking at work organization. Then, for my master’s degree, I looked at lean production, which is the Toyota approach to producing vehicles. I looked at multiskilled teams and multiskilling and job satisfaction, then at stress indicators and so forth, versus mass production systems. So that’s really the angle I came into this from. If you look at mass production lines, where a person is doing the same job over and over – it’s quite repetitive and very narrow – versus the more Japanese-style lean production, there are certainly repercussions, both socially and individually, from a psychological health perspective.

Simon:  So, you get happy workers and more contented workers-

Peter:   –And better quality, yeah.

Simon:  And again, you mentioned Toyota. Another giant company that’s presumably grown partly through applying these principles.

Peter:   Well, they’re famous for quality, aren’t they? Famous for reliable, high-quality cars that go on forever. I mean, when I moved from Canada to Australia – Toyota has a very, very strong history here with the Land Cruiser, and the HiLux, and so forth.

Simon:  All very well-known brands here. Household names.

Peter:   They’re known to be bombproof and able to outlast any other vehicle. And the lean production system certainly has, I would say, quite a bit of responsibility for the production of these high-quality cars.

Simon:  So, we’ve spoken about how you got into it, and “What is it?” and “Why do it?” – what it is in very general terms, at least. But I suspect a lot of people listening will want to know what Human Factors is based on doing it – on how you do it. It’s a long, long time since I did my Human Factors training – just one module in my master’s – so could you take me through what Human Factors involves these days, in broad terms?

Peter:   Sure, I actually have a few slides that might be useful –  

Simon:  – Oh terrific! –

Peter:   –maybe I should present that. So, let me see how well I can share this. And of course, sometimes the problem is I’ll make sure that – maybe screen two is the best way to share it. Can you see that OK?

Simon:  Yeah, that’s great.

Introduction to Human Factors

Peter:   Intro to Human Factors. So, Stewart Dickinson – who I work with at Human Risk Solutions – and I have prepared some material for some courses we taught to industry. I’ve some other material too, and I’ll just flip to some of the key slides going through “What is Human Factors?”. So, let me try to get this working and I’ll just flip through quickly.

Definitions of Human Factors

Peter:   So, as I’ve mentioned already, broadly speaking, environmental, organizational, and job factors, and human individual characteristics which influence behaviour at work in a way that can which can affect health and safety. That’s a focus of Human Factors. Or the application of scientific information concerning humans to the design of objects, systems and environments for human use. You see a pattern here, fitting work to the worker. The term ergonomics is used interchangeably with Human Factors. It also depends on the country you learn this in or applied in.

Simon:  Yes. In the U.K., I would be used to using the term ergonomics to describe something much narrower than Human Factors but in Australia, we seem to use the two terms as though they are the same.

Peter:   It does vary. You can say physical ergonomics and I think that would typically represent when people think of ergonomics, they think of the workstation design. So, sitting at their desk, heights of tables or desks, and reach, and so on. And particularly given the COVID situation, there are so many people sitting at their desks are probably getting some repetitive strain –

Simon:  –As we are now in our COVID 19 [wo]man caves.

Peter:   That’s right! So that’s certainly an aspect of Human Factors work because that’s looking at the interaction between the human and the desk/workstation system, so to speak, on a very physical level.        

            But of course, you have cognitive ergonomics as well, which looks of perceptual and cognitive aspects of that work. So Human Factors or ergonomics, broadly speaking, would be looking at these multi-dimensional facets of human interaction with systems.

Definitions of Human Factors (2)

Peter:   Some other examples might be the application of knowledge of human capabilities and limitations to design, operation and maintenance of technological systems, and I’ve got a little distilled –or summarized- bit on the right here. The Human Factors apply scientific knowledge to the development and management of the interfaces between humans and rail systems. So, this is obviously in the rail context so you’re, broadly speaking, talking in terms of technological systems. That covers all of the people issues. We need to consider to assure safe and effective systems or organizations.

Again, this is very broad. Engineers often don’t like these broad topics or broad approaches. I’m an engineer, I learned this through engineering which is a bit different than how some people get into Human Factors.

Simon:  Yeah, I’ve met a lot of human factor specialists who come in from a first degree in psychology.

Peter:   That’s right. I’d say that’s fairly common, particularly in Australia and the UK. Although, I know that you could take it here in Australia in some of the engineering schools, but it’s fairly rare. There’s an aviation Human Factors program, I think, at Swinburne University. They used to teach it through mechanical engineering there as well. I did a bit of teaching into that and I’m not across all of the universities in Australia, but there are a few. I think the University of the Sunshine Coast has quite a significant group at the moment that’s come from, or, had some connection to Monash before that. Well, I think about, when I’m doing this work, of “What existing evidence do we have?” Or existing knowledge base with respect to the human interactions with the system. For example, working with a rail transport operator, they will already have a history of incidents or history of issues and we’d be looking to improve perhaps performance or reduce the risk associated with the use of certain systems. Really focusing on some of the evidence that exists either already in the organization or that’s out there in the public domain, through research papers and studies and accident analyses and so forth. I think much like safety engineering, there would be some or quite a few similarities in terms of the evidence base –

Simon:  – Indeed.

Peter:   – Or creating that evidence through analysis. So, using some analytical techniques, various Human Factors methods and that’s where Human Factors sort of comes into its own. It’s a suite of methods that are very different from what you would find in other disciplines.

Simon:  Sure, sure. So, can you give us an overview of these methods, Peter?

Peter:   There are trying to think of a slide for this. Hopefully, I do.

Simon:  Oh, sorry. Have I taken you out of sequence?

Peter:   No, no. Not out of sequence. Let me just flip through, and take a look at –

The Long Arm of Human Factors

Peter:   This is probably a good sort of overview of the span of Human Factors, and then we can talk about the sorts of methods that are used for each of these – let’s call them –dimensions. So, we have what’s called the long arm of Human Factors. It’s a large range of activities from the very sort of, as we’re talking about, physical ergonomics, e.g. sitting at a desk and so on, manual handling, workplace design, and moving to interface design with respect to human-machine interfaces- HMIs, as they’re called, or user interfaces. There are techniques, manual handling techniques and analysis techniques – You might be using something like a task analysis combined with a NIOSH lifting equation and so on. Workplace design, you’d be looking at anthropocentric data. So, you would have a dataset that’s hopefully representative of the population you’re designing for, and you may have quite specific populations. So Human Factors, engineering is fairly extensively used, I would say, in military projects –in the military context-

Simon:  – Yes.

Peter:   – And there’s this set of standards, the Mil standard, 1472G, for example, from the United States. It’s a great example that gives not only manual handling standards or guidelines, workplace design guidelines in the workplace, in a military sense, can be a vehicle or on a ship and so on. Or on a base and so forth.

Interface design- So, if you’re looking at from a methods perspective, you might have usability evaluations, for example. You might do workload’s studies and so forth, looking at how well the interface supports particular tasks or achieving certain goals.

            Human error –There are human error methods that typically leverage off of task models. So, you’d have a task model and you would look at for that particular task, what sorts of errors could occur and the structured methods for that?

Simon:  Yes, I remember human task analysis –seeing colleagues use that on a project I was working on. It seemed quite powerful for capturing these things.

Peter:   It is, and you have to choose the level of analysis pragmatically, because you could go down to a very granular level of detail. But that may not be useful, depending on the sort of system design you're doing, the amount of money you have, and how critical the task is. So, you might have a significantly safety-critical task, and that might need quite a detailed analysis. An example there – you can look up the accident analysis online; I believe it's the Virgin Galactic test flight, one of these test flights in the U.S. that I have somewhere in my archive of accident analyses. The FAA had approved the test flights to go ahead, and there was a task where – I hope I don't get this completely wrong – the two pilots (a pilot and a co-pilot) in this test aeroplane had to go to high altitude in this near-space vehicle. They were moving at quite a high speed, and there was a particular task where they had to slow down – slow the aeroplane, I guess, by reducing the throttle – and then at a certain point, a certain speed, they could deploy, or control, some wing-based device, and the task order was very important. What had happened was that the pilot or the co-pilot performed the task slightly out of order – doing one thing before they did another – and that led to the plane breaking up. Fortunately, one of the pilots survived; unfortunately, one didn't.

Simon:  So, very severe results from making a relatively small mistake.

Peter:   So that's a task order error, which is very easy to make. And if the system had been designed in a way that prevented that action from being executed at that point, that would have been a safer design. At that level, you might be going down to what gets called keystroke-level analysis, and so on –

Simon:  – Where it’s justified, yes.

Peter:   Task analysis is, I think, probably one of the most common tools used. You also have workload analysis – looking at, for example, interface design. I know on some of the projects we were working on together, Simon, workload was a consideration. There are different ways to measure workload. There's the NASA TLX, which is a subjective workload questionnaire, essentially, that's done post-task, but it's been shown to be quite reliable and valid as well. So, that instrument is used, and there are a few others; it depends on the sort of study you're doing, the amount of time you have, and so forth. So, that's workload analysis.
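For readers curious how the NASA-TLX numbers work: the sketch below scores the six published subscales both unweighted (the so-called "Raw TLX") and with the classic weighting scheme, in which 15 pairwise comparisons of the subscales yield weights summing to 15. The ratings and weights below are invented example data, not from any real study.

```python
# Sketch of NASA-TLX scoring. The six subscales and the weighting scheme
# follow the published NASA-TLX procedure; the data below are invented.

SUBSCALES = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def raw_tlx(ratings):
    """Unweighted ('Raw TLX') score: the mean of the six 0-100 ratings."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)

def weighted_tlx(ratings, weights):
    """Classic weighted score: each subscale's weight (0-5, summing to 15)
    comes from the participant's 15 pairwise comparisons."""
    assert sum(weights[s] for s in SUBSCALES) == 15
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15

ratings = {"mental": 70, "physical": 20, "temporal": 60,
           "performance": 40, "effort": 65, "frustration": 35}
weights = {"mental": 5, "physical": 0, "temporal": 3,
           "performance": 2, "effort": 4, "frustration": 1}

print(raw_tlx(ratings))                 # mean of the six ratings
print(weighted_tlx(ratings, weights))   # shifted toward the weighted dimensions
```

The weighted score shifts toward the dimensions the participant judged most relevant to the task, which is why the two scores can differ noticeably for the same ratings.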

Safety culture – I wouldn't say that's my forte. I've done a bit of work on safety culture, but that's more organizational, and the methods there tend to be more around culpability models and implementing those in the organizational culture.

Simon:  So, more governance type issues? That type of thing?

Peter:   Yes. Governance and – whoops! Sorry, I didn’t mean to do that. I’m just looking at the systems and procedure design. The ‘e’ is white so it looks like it’s a misspelling there. So it’s annoying me …

Simon:  – No problem!

Peter:   Yes. So, there are models I've worked with at organizations, such as some rail organizations, where they look at governance, but also at appropriate interventions. So, if there's an incident, what sort of intervention is appropriate? Essentially, you use a model of culpability and human error, and then overlay that, or use it as a lens through which to analyse the incident. Then you appropriately train either employees or management, and so on. Or perhaps it was a form of violation – a willful violation, as it may be –

Simon:  – Of procedure?

Peter:   Yeah, of procedure and so on versus a human error that was encouraged by the system’s design. So, you shouldn’t be punishing, let’s say, a train driver for a SPAD if the –

Simon:  – Sorry, that’s a Signal Passed At Danger, isn’t it?

Peter:   That's right. Signal Passed At Danger. So, it's certainly possible that the way the signalling is set up leads to a higher chance of human error. You might have multiple signals at a location, and it's confusing to figure out which one to attend to; you may misread it, and then you end up SPADing, and so on. So, there are, for example, clusters of SPADs that will be identified, and then the appropriate analysis will be done. And you wouldn't want to be punishing drivers if it turned out to be a systems design issue.

Simon:  Yes. I saw a vivid illustration of that on the news, I think, last night. There was a news article about an air crash that tragically killed three people a few months ago here in South Australia. The news report was saying it was human error, but when they actually got into reporting what had happened, it was pointed out that the pilot being tested was flying a twin-engine aeroplane, doing an engine-failure-after-take-off drill. And the accident report said that the procedure they were using allowed them to do that engine failure drill at too low an altitude. So, if the pilot failed to take the correct action very quickly – bearing in mind this is a pilot being tested, because they are undergoing training – there was no time to recover and, therefore, the aircraft crashed. So, I thought, "Well, it's a little bit unfair just to say it's human error when they were doing something that was intrinsically inappropriate for a person of that skill level."

Peter:   That's an excellent example, and you hear this in the news a lot: human error, human error, human error. I think of the recent Boeing problems with the flight control system for the new 737s. And of course, there will be reports. Some of the interim reports already talk about some of the Human Factors issues inherent in that, and I would encourage people to look up the publicly available documentation on it –

Simon:  – This is the Boeing 737 Max accidents in Indonesia and in Ethiopia, I think.

Peter:   That's correct. That's correct. Yes, absolutely. And pilot error was used as the general explanation, but under further analysis, you start looking at that error. That so-to-speak error perhaps has other causes, which are systems design causes. So these things are being investigated and have been written about quite extensively. And you can look at, of course, any number of aeroplane accidents. There's a famous Air France one, flying from Brazil to Paris – it might have been Rio de Janeiro to Paris – where the pitot –

Simon:  – Yeah, pitot probes got iced up.

Peter:   Probes – they iced up, and it was dark, or it might have been a storm; the pilots didn't have any ability to gauge what was going on by looking outside the aeroplane. And there, again, misreadings: stall alarms going off and so on, I believe, and some misreadings of the airspeed coming from the sensors, essentially. The pilots acted according to that information, but that information was incorrect. So, you could say there was probably a cascade of issues that occurred there, and there's a fairly good analysis one can look up that considers the design – I believe it was an Airbus. So, we had one pilot providing an input in one direction on the controls and the other pilot in the other direction. There were a number of things that broke down. And typically, you'll see this in accidents: you'll have a cascade as they're trying to troubleshoot; when they can't figure out what's going on, they'll start applying various approaches to try and remedy the situation, and people begin to panic, and so on.

            And you have training techniques, such as crew resource management, which certainly has a strong Human Factors element, or comes out of the Human Factors world. It looks at how to have teams – in cockpits and in other situations – working effectively in emergencies. And that came about, of course, after analysing failures.

Simon:  Yes, and I think CRM, crew resource management, has been adopted not just in the airline industry, but in many other places as well, hasn’t it?

Peter:   Operating theatres, for example. There's quite a bit of work from the 90s that started with, I think, David Gaba, who I believe was at Stanford – this is all from memory – that then looked at operating theatres. In fact, the Monash Medical Centre in Clayton had a simulation centre for operating theatres, where they were applying these techniques to training operating theatre personnel: surgeons, anaesthetists, nurses, and so forth.

Simon:  Well, thanks, Peter. And I'm sorry – I think I hijacked your presentation, but –

Peter:   It's not really a presentation anyway; it was more of a guided discussion. We were talking about methods, weren't we? And it's easy to go from methods to talking about accidents, because then we talk about the application of some of these methods – or how these methods are applied to prevent accidents from occurring.

Simon:  Cool. Well, thanks very much, Peter. Maybe next time we have a chat, I'll let you talk through your slides and we'll have a more in-depth look across the whole breadth of Human Factors.

Peter:   So that’s probably a good little intro at the moment anyway. Perhaps I might pull up one slide on Human Factors integration before we end.

Simon:  Of course.

Peter:   I’ll go back a few slides here.

What is Human Factors Integration?

Peter:   And so, what is Human Factors integration? I was thinking about this quite a bit recently, because I'm working on some complex projects that are not only complex but quite large engineering projects, with lots of people, lots of different groups involved, different contracts, and so forth – and the integration issues that occur. They're not only Human Factors integration issues; there are larger-scale integration issues, engineering integration issues. Generally speaking, this is something I think projects often struggle with. And I was really thinking about the Human Factors angle and Human Factors integration. That's about ensuring that all of the HF – Human Factors – issues in a project are considered and controlled throughout the project, and deliver the desired performance and safety improvements. So, the three functions of Human Factors integration are to:

  • confirm the intended system performance objectives and criteria;
  • guide and manage the Human Factors aspects of design cycles, so that negative aspects don't arise and prevent the system from reaching its optimum performance level; and
  • identify and evaluate any additional Human Factors safety aspects not already covered in the safety case.

You'll find, particularly in these complex projects, issues at the interfaces. You might have quite a large project with sub-projects working on particular components. Let's say one is working on more of the civil/structural elements, and maybe space provisioning and so on, while another is working more on control systems. And the integration between those becomes quite difficult, because you don't really have that Human Factors integration function working to integrate those two large components. Typically, it works within those focused project groupings – that's one way to describe them. Does that make sense?

Simon:  Yeah. Yeah, absolutely.

Peter:   I think that's one of the big challenges that I'm seeing at the moment – where you have a certain amount of time and money and resource. This would be common for other engineering disciplines: the integration work often falls by the wayside, I think. And that's where I think a number of the ongoing Human Factors issues are going to be cropping up in some of these large-scale projects for the next 10 to 20 years – both operationally and perhaps in safety as well. Of course, we want to avoid –

Simon:  – Yes. I mean, what you're describing sounds very familiar to me as a safety engineer and, I suspect, to a lot of engineers of all disciplines who work on large projects. They're going to recognize that as a familiar problem.

Peter:   Sure. Think about it: you've got the civil and space-provisioning aspect of a project, and another group is deciding what goes into, let's say, a control room or a maintenance room and so on. It may be that things are constrained in such a way that the design of the racks in the room has to be done in a way that makes the work more difficult for maintainers. And it's hard to optimize these things, because these are complex projects with complex considerations, and a lot of people are involved in them. The nature of engineering work is typically to break things down into little elements, optimize those elements, and bring them all together.

Simon:  –Yes.

Peter:   Human Factors tends to – well, you can do that in Human Factors as well, but I would argue – and certainly this is what attracted me to it – that you tend to have to take a more holistic approach to human behaviour and performance in a system.

Simon:  Absolutely.

Peter:   Which is hard.

Simon:   Yes, but rewarding. And on that note, thanks very much, Peter. That’s been terrific. Very helpful. And I look forward to our next chat.

Peter:   For sure. Me too. Okay, thanks!

Simon:  Cheers!

Outro

Simon:  Well, that was our first chat with Peter on The Safety Artisan, and I'm looking forward to many more. So, it just remains for me to say thanks very much for watching and supporting what we're doing and what we're trying to achieve. I look forward to seeing you all next time. Okay, goodbye.

END


System Hazard Analysis

In this 45-minute session, The Safety Artisan looks at System Hazard Analysis, or SHA, which is Task 205 in Mil-Std-882E. We explore Task 205’s aim, description, scope, and contracting requirements. We also provide value-adding commentary, which explains SHA – how to use it to complement Sub-System Hazard Analysis (SSHA, Task 204) in order to get the maximum benefits for your System Safety Program.

This is the seven-minute-long demo. The full video is 47 minutes long.

System Hazard Analysis: Topics

  • Task 205 Purpose [differences vs. 204];
    • Verify subsystem compliance;
    • ID hazards (subsystem interfaces and faults);
    • ID hazards (integrated system design); and
    • Recommend necessary actions.
  • Task Description (five slides);
  • Reporting;
  • Contracting; and
  • Commentary.
Transcript: System Hazard Analysis

Introduction

Hello, everyone, and welcome to the Safety Artisan, where you will find professional, pragmatic, and impartial safety training resources and videos. I’m Simon, your host, and I’m recording this on the 13th of April 2020. And given the circumstances when I record this, I hope this finds you all well.

System Hazard Analysis Task 205

Let's get on to our topic for today, which is System Hazard Analysis. Now, System Hazard Analysis is, as you may know, Task 205 in the Mil-Std-882E system safety standard.

Topics for this Session

What we're going to cover in this session is: purpose, task description, reporting, contracting, and some commentary – although I'll be making commentary all the way through. Going back to the top: with this task, as with Task 204, I'm using the yellow highlighting to indicate differences between 205 and 204, because they are superficially quite similar. And I'm using underlining to emphasize those things that I really want to bring to your attention. Within Task 205, purpose: we've got four purposes for this one. Verify subsystem compliance, and recommend necessary actions – that's the fourth one there. And in the middle of the sandwich, we've got identification of hazards, both between the subsystem interfaces – with faults from the subsystems propagating upwards to the overall system – and in the integrated system design. So, quite a different emphasis to 204, which was really thinking about subsystems in isolation. We've got five slides of task description, a couple on reporting, one on contracting – nothing new there – and several slides of commentary.

System Requirements Hazard Analysis (T205)

Let's get straight on with it. The purpose, as we've already said, is three-fold: verify system compliance, identify hazards, and recommend actions. And then, as we can see in the yellow, identifying previously unidentified hazards is split into two: looking at subsystem interfaces and faults, and at the integration of the overall system design. And you can see from the yellow bits what's different from 204 – here we are taking a much higher-level view: an inter-subsystem view, and then an integrated view.

Task Description (T205) #1

On to the task description. The contractor has got to do it and document it, as usual, looking at hazards and mitigations, or controls, in the integrated system design, including software and the human interface. That's very important; we'll come onto it later. All the usual stuff: we've got to include COTS, GOTS, GFE, and NDI. So, even if stuff is not being developed – if we're putting together a jigsaw system from existing pieces – we've still got to look at the overall thing. And as with 204, we go down to the underlined text at the bottom of the slide: areas to consider. Think about performance, and degradation of performance, functional failures, timing and design errors, defects, and inadvertent functioning – that classic functional-failure analysis that we've seen before. And again, while conducting this analysis, we've got to include human beings as an integral component of the system, receiving inputs and initiating outputs. Human factors have been included in this standard since long ago.

Task Description (T205) #2

Slide two. We've got to include a review of subsystem interrelationships. The assumption is that we've previously done Task 204 down at a low level and now we're building up to Task 205. Again: verification of system compliance with requirements (A.), identification of new and emergent hazards, and recommendations for actions (B.). But Part C is really the new bit. We are looking at possible independent, dependent, and simultaneous events (C.), including system failures, failures of safety devices, common-cause failures, and system interactions that could create a hazard or increase risk. This is really the new stuff in 205 and, as we'll emphasize in the commentary, you should look very carefully at those underlined things, because they are key to understanding Task 205.

Task Description (T205) #3

Moving on to Slide 3 – all new stuff, all in yellow. Degradation of the system or the total system (D.), and design changes that affect subsystems (E.). Now, I've underlined this because – what's the one constant in projects? Change. You start off thinking you're going to do something, and maybe the concept changes, subtly or not so subtly, during the project. Maybe your assumptions change, the schedule changes, the resources available change. You thought you were going to get access to something, but it turns out that you're not. So, all these things can change and cause problems, quite frankly, as I am sure we know. We need to deal with not just the program as we started out, but the program as it turns out to be – as it's actually implemented. And that's something I've often seen go awry, partly because people hold on to what they started out with – because they're frightened of change – and partly because of the work involved in really taking note of changes. It takes a really disciplined program or project manager to push back on random change, to control it well, and then to think through the implications. So, that's where strength of leadership comes in, but it is difficult to do.

Moving on now. It says effects of human errors (F.) – in the blue, I've changed that. 'Human error' implies that the human is at fault, that the human made a mistake. But very often, we design suboptimal systems and just expect the human operator to cope. Whether that's fair, unfair, or unreasonable, it results in accidents. So, what we need to think about more generally is erroneous human action. Something has gone wrong, but it's not necessarily the human's fault – maybe the system has induced the human to make an error. We need to think very carefully about that.

Moving on: determination (G.) of the potential contribution of all those components in G.1 – as we said before, all the non-developmental stuff. G.2: have the design requirements in the specifications been satisfied? This standard emphasizes specifications and meeting requirements; we've discussed that in other lessons. And G.3: whether the methods of system implementation have introduced any new hazards. Because, of course, in the attempt to control hazards, we may introduce technology, or plant, or substances that can themselves create problems. So, we need to be wary of that.

Task Description (T205) #4

Moving on to slide four. Now, in 205.2.2, the assumption is that the PM has specified the methods to be used by the contractor. That's not necessarily true – the PM may not be an expert in this stuff, or they may, for contractual or other reasons, have decided that they want the contractor to choose the techniques to use. But the assumption here is that the PM has control, and if the contractor decides they want to do something different, they've got to get the PM's authority to do so. This is assuming, of course, that this has been specified in the contract.

And 205.2.3: whichever contractor is performing the System Hazard Analysis, the SHA, they are expected to have oversight of software development that's going to be part of their system. And again, that doesn't happen unless it's contracted. If you don't ask for it, you're not going to get it, because it costs money. So, the ultimate client needs to insist on this in the contract – and police it, to be fair, because it's all very well asking for stuff, but if you never check what you're getting, or what's going on, you can't be sure that it's really happening. As the American Admiral Rickover once said, "You get the safety you inspect". So, if you don't inspect it, don't expect to get anything in particular – it's an unknown. And again, if anything requires mitigation, the expectation in the standard is that it will be reported to the PM – the client PM, this is – and that they will have authority. This is an assumption in the way that the standard works. If you're not going to run your project like that, then you need to think through the implications of using this standard and manage accordingly.

Task Description (T205) #5

And the final slide on task description. We've got another reminder that the contractor performing the SHA shall evaluate design changes. Again, if the client doesn't contract for this, it won't necessarily happen. Or indeed, if the client doesn't communicate changes to the contractor, or the subcontractors don't communicate with the prime contractor, then this won't happen. So, we need to put communication channels in place and insist that these things happen. Configuration control, and so forth, is a good tool for making sure that they do.

Reporting (T205) #1

So, if we move on to reporting, we've got two slides on this. No surprises: the contractor shall prepare a report that contains the results from the analysis, as described. First, part A: we've got to have a system description, including the physical and functional characteristics and subsystem interfaces. As always, this is important: if we don't have that system description, we don't have the context to understand the hazard analysis that has been done – or not done, for whatever reason. And the expectation is that there will be references to more detailed information as and when it becomes available. Maybe detailed design information isn't going to emerge until later, but it has to be included. Again, this has got to be required.

Reporting (T205) #2

Moving on to parts B and C. In Part B, as before, we need to provide a description of each analysis method used, the assumptions made, and the data used in the analysis. Again, if you don't include this description, it's very hard for anybody to independently verify that what has been done is correct, complete, and consistent. And without that assurance, you undermine the whole purpose of doing the analysis in the first place.

And then part C: we've got to provide the analysis results, and at the bottom of this subparagraph is the assumption that the analysis results could be captured in the hazard tracking system – say, the hazard log. But I would only expect the headline results to be captured in that hazard log; the detail is going to be in the Task 205 hazard analysis report, or whatever you're calling it. We've talked about that before, so I'm not going to get into it here.

Contracting

And then the final bit of quotation from the standard is the contracting. And again, all the same things that you've seen before. We need to require the task to be completed. It's no good just saying "apply Mil-Std-882E", because if the contractor understands 882E, they will tailor it to suit themselves, not the client. If they don't understand 882E, they may not do it at all, or just do it badly. Or indeed, they may produce a bunch of reports that have all the right headings from the data item description, which is usually supplied in the contract, but no useful data under those headings. So, you have to make it clear to the contractor that they need to conduct this analysis and then report on the results. I know it sounds obvious – I know it sounds silly having to say this – but I've seen it happen: you've got a contractor that does not understand what system safety is.

(Mind you, why have you contracted them in the first place to do this? You should know – you should have done your research and found out.)

But if it's new to them, you're going to have to explain it to them in words of one syllable, or get somebody else to do it for them. In my day job, this is very often what consultancies get called in to do: you've got a contractor who may be expert in building tanks, or planes, or ships, or chemical plants, or whatever it might be, but who isn't expert in doing this kind of analysis. So, you bring in a specialist. And that's part of my day job.

So, getting back to the subject. Yes, we've got to specify this stuff, and we've got to specify it early – which implies that the client has done quite a lot of work to figure it all out. And again, the client may, above the line as we say, engage a consultant or specialist to help them with this. We've got to include all of the details that are necessary – and, of course, how do you know what's necessary unless you've worked it out? And you've got to supply the contractor – it says with a concept of operations, but really with as much relevant data and information as you can, without bogging them down. That context is important to getting good results and a successful program.

Illustration

I've got a little illustration here. The supposition in the standard, in Task 205, is that we've got a number of subsystems – and there may be some other building blocks in there as well, and some infrastructure. We're probably going to have some users, we're going to have an operating environment, and maybe some external systems that our system – the system of interest – interfaces with or interacts with in some way. That interaction might be deliberate, or the systems might just be in the same operating environment; and they will interact, intentionally or otherwise.

Commentary – Go Early

With that picture in mind, let's think about some important points. The first one is: get some 205-work done early. Now, the implication in the standard, from the numbering and when you read the text, is that Sub-System Hazard Analysis comes first. You do those hexagonal building blocks first, then you build it up, and Task 205 comes after the subsystem hazard analysis: you've already got the SSHAs for each subsystem, and then you build the SHA on top. However, if you don't do 205 early, you're going to lose an opportunity to influence the design and to improve your system requirements. So, it's worth doing an initial pass of 205 first, top-down, before you do the 204 hexagons, and then coming back up and redoing 205. The first pass is done early to gain insight, to influence the design, to improve your requirements, and to improve, let's say, the prime contractor's appreciation and reporting of what they are doing. And that – dare I say, a quick and dirty stab at 205 – could be quite cheap, and the payback, the return on investment, should be large if you do it early enough. And of course, act on the results.

And then the second pass is more about verifying compliance, verifying those required interfaces, and looking at emergent stuff – stuff that has emerged; the devil's in the detail, as the saying goes. We can look at the emerging stuff that's coming out of that detail, then pull it all together, tidy it up, and look for emergent behaviour.

Commentary – Tools & Techniques

Looking at tools and techniques: most safety analysis techniques that we use look at single events or single failures, only in isolation. And usually, we expect those events and failures to be independent. There are lots of analyses out there – basic fault tree analysis, event tree analysis (well, event trees are slightly different, in that we can think about subsequent failures) – but lots of basic techniques will really only deal with a single failure at a time. However, 205.2.1C requires us to go further: we've got to think about dependent and simultaneous events, and common-cause failures. And for a large and complex system, each of those can be a significant undertaking. So, if we're doing Task 205 well, we are going to push into these areas and not simply do a copy of Task 204 at a higher level. We're now really talking about the second pass of 205: the previous, quick-and-dirty 205 is done, the Task 204 work on the subsystems is done, and now we're pulling it all together.
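To illustrate why common-cause failures matter so much here, below is a minimal sketch of the widely used beta-factor model (the simple common-cause model found in standards such as IEC 61508). It treats a fraction beta of each channel's failures as common-cause, defeating the redundancy. The numbers are invented, and real analyses are considerably more involved.

```python
# Illustrative beta-factor model for common-cause failure in a duplicated
# (1-out-of-2) channel. A fraction `beta` of failures is assumed to be
# common-cause and takes out both channels at once. Numbers are invented.

def duo_failure_prob(p, beta):
    """Approximate probability that BOTH redundant channels fail on demand:
    independent part ((1-beta)*p)**2 plus common-cause part beta*p."""
    independent = ((1 - beta) * p) ** 2
    common_cause = beta * p
    return independent + common_cause

p = 1e-3  # per-channel probability of failure on demand (example value)
print(duo_failure_prob(p, beta=0.0))   # fully independent case: p squared
print(duo_failure_prob(p, beta=0.1))   # common-cause term dominates
```

Even a 10% common-cause fraction makes the redundant pair roughly a hundred times worse than the naive independent calculation suggests – exactly the kind of effect that 205.2.1C asks us to hunt for.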

Dependent & Simultaneous Events

Let’s think about independent simultaneous events. First, dependent failures. Can an initial failure propagate? For example, a fire could lead to an explosion or an explosion could lead to a fire. That’s a classic combination. If something breaks or wears could be as simple as components wearing and then we get debris in the lubrication system. Could that – could the debris from component wear clog up the lubrication system and cause it to fail and then cause a more serious seizure of the overall system? Stuff like that. Or there may be more subtle functional effects. For example, electric effects, if we get a failure in an electrical system or even non-failure events that happen together. Could we get what’s called a sneak circuit? Could we get a reverse flow of current that we’re not expecting? And could that cause unexpected effects? There’s a special technique we’re looking at called sneak circuits analysis. That’s sneak, SNEAK, go look it up if you’re interested. Or could there be multiple effects from one failure? Now, I’ve already mentioned fire. It’s worth repeating again. Fire is the absolute classic. First, the effects of fire. You’ve got the fire triangle. So, to get fire, we need an inflammable substance, we need an ignition source, and we need heat. And without all three, we don’t get a fire. But once we do get a fire, all bets are off, and we can get multiple effects. So, we recall, you might remember from being tortured doing thermodynamics in class, you might remember the old equation that P1V1T1 equals P2V2T2. (And I’ve put R2 that for some reason, so sorry about that.)

What that’s saying is, your initial pressure multiplied by volume, divided by temperature, P1V1/T1, is going to be the same as your subsequent pressure, volume and temperature combined the same way, P2V2/T2. So, what that means is, if you dramatically increase the temperature, say, because that’s what a fire does, then your volume and your pressure are going to change. So, in an enclosed space we get a great big increase in pressure, or in an unenclosed space, we’re going to get an increase in the volume of a gas or fluid. So, if we start to heat the gas or fluid, it’s probably going to expand. And then that could cause a spill and further knock-on effects. And fire, as well as making pressure and volume changes to fluids, can weaken structures, make smoke, and produce toxic gases. So, it can produce all kinds of secondary hazardous effects that are dangerous in themselves and can mess up your carefully orchestrated engineering and procedural controls. So, for example, if you’ve got a fire that causes a pressure burst, you can destroy structures and your fire containment can fail. You can’t necessarily send people in to fix the problem because the area is now full of smoke and toxic gas. So, fire is a great example of this kind of thing, where you think, “Well, if this happens, then this really messes up a lot of controls and causes a lot of secondary effects”. So, there’s a good example, but not the only one.
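The combined gas law arithmetic above can be sketched in a few lines. This is a minimal illustration only, with invented numbers; the function name and values are not from the original lesson.

```python
# Illustrative only: combined gas law P1*V1/T1 = P2*V2/T2 for a fixed
# quantity of ideal gas. All figures here are made up for the example.
def pressure_after_heating(p1, v1, t1, v2, t2):
    """Solve P1*V1/T1 = P2*V2/T2 for P2 (temperatures in kelvin)."""
    return p1 * v1 * t2 / (t1 * v2)

# A sealed vessel (volume fixed) heated by a fire from 300 K to 900 K:
p2 = pressure_after_heating(p1=100.0, v1=1.0, t1=300.0, v2=1.0, t2=900.0)
print(p2)  # 300.0 -- the pressure triples, hence the risk of a pressure burst
```

The point of the sketch is simply that tripling the absolute temperature in a sealed, fixed-volume container triples the pressure, which is why fire so readily defeats containment.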

And then simultaneous events, a slightly different issue. What we’re talking about here is that we have got undetected, or latent, failures. Something has failed, but it’s not apparent that it’s failed, we’re not aware of it, and that could be for all sorts of reasons. It could be a fatigue failure, where we’ve got something that’s cracked, or it could be thermal fatigue. So, there are lots of things that can degrade physical systems and make them brittle. For example, an odd one: radiation causes most metals to expand, and neutron bombardment makes them brittle. So, it can weaken things, structures and so forth. Or we might have a safety system that has failed, but because we’ve not called upon it in anger, we don’t notice. And then we have a failure, maybe the primary system fails. We expect the secondary system to kick in, but it doesn’t, because there’s been some problem, or some knock-on effect has prevented the secondary system from kicking in. And I suspect we’ve all seen that happen.

My own experience of that was on a site I was working on. We had a big electricity failure: a contractor had sawn through the mains electricity cable, or dug through it. And then, for some unknown reason, the emergency generators failed to kick in. So, that meant that a major site where thousands of people worked had to be evacuated, because there was no electricity to run the computers. Even the old analogue phones failed after a while. Today, those phones would be digital, probably voice over IP, and without electricity, they’d fail instantly. And eventually, without power for the plumbing, the toilets back up. So, you’re going to end up having to evacuate the entire site because it’s unhygienic. Some effects can be very widespread, just because you had a latent failure and your backup system didn’t kick in when you expected it to.

So how can we look at that? Well, this is classic reliability modelling territory. We can look at mean time between failures (MTBF) and mean time to repair (MTTR), and therefore we can work out what the exposure time might be. We can work out, “What’s the likelihood of a latent failure occurring?” Presumably we’re going to test the system periodically; we’ve got to do a proof test. How often do we have to do the proof test to get a certain level of reliability or availability when we need the system to work? And we can look at synchronous and asynchronous events. And to do that, we can use several techniques. The classic ones are reliability block diagrams and fault tree analysis. Or, if we’ve got repairable systems, we can use Markov chain modelling, which is very powerful. So, we can bring in time-dependent effects of systems failing at certain times and then being required, or systems failing and being repaired, and look at overall availability, so that we can get an estimate of how often the overall system will be available if we look at potential failures in all the redundant constituent parts. There are lots of techniques for doing that, some of them quite advanced. And again, very often this is what we safety consultants find ourselves doing.
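The proof-test and availability arithmetic described above can be sketched very simply. This is a hedged illustration, not a real analysis: the failure rates, test intervals and repair times below are invented, and real work would normally use failure-rate data and often Markov models rather than these first-order approximations.

```python
# A minimal sketch of the latent-failure and availability arithmetic
# discussed above. All numbers are invented for illustration.

def average_unavailability(failure_rate, proof_test_interval):
    """Approximate mean probability that a standby system has failed
    undetected (latently), if it is proof-tested every 'interval' hours.
    First-order approximation, valid when failure_rate * interval << 1."""
    return failure_rate * proof_test_interval / 2.0

def steady_state_availability(mtbf, mttr):
    """Classic steady-state availability of a repairable system."""
    return mtbf / (mtbf + mttr)

# Standby generator: 1e-5 failures/hour, proof-tested twice a year...
print(average_unavailability(1e-5, 4380))   # ~0.0219
# ...versus proof-tested monthly:
print(average_unavailability(1e-5, 730))    # ~0.00365

# Repairable system with a 5000-hour MTBF and a 24-hour MTTR:
print(steady_state_availability(mtbf=5000, mttr=24))  # ~0.9952
```

The sketch shows the trade the lesson describes: the more often you proof-test, the shorter the window in which a latent failure can sit undetected.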

Common Cause Failures

Common cause failure, this is another classic. We might think about something very obvious and physical. Maybe we’ve got three sets of input channels guarded by filters to stop debris getting into the system, but what if debris blocks all the filters, so we get no flow? So, obvious – I say obvious – but often missed sources of sometimes quite major accidents. Or take something more subtle: we’ve got three redundant channels, or some number of redundant channels, in an electronic system, and we need two out of three to work, or whatever it might be. But we’ve got the same software running in each channel. So, if the software fails systematically, as software does, then potentially all three channels will just fail at the same time.

So, there’s a good example of non-independent failures taking down a system that on paper has a very high reliability, but actually doesn’t once you start considering common cause failures or doing common mode analysis. So, really, we would like all redundancy to be diverse if possible. For example, if we wanted to know how much fuel we had left in the aeroplane, which is quite important if you want the engines to keep working, then we can employ diverse methods. We can use sensors to measure how much fuel is in the tanks directly, and then we can cross-check that against a calculated figure, where we’ve entered, let’s say, how much fuel was in the tanks to start with, and then we’ve been measuring the flow of fuel throughout the flight. So, we can calculate or estimate the amount of fuel and then cross-check that against the actual measurements in the tanks. That’s a good diverse method. Now, it’s not always possible to engineer a diverse method, particularly in complex systems. Sometimes there’s only really one way of doing something. So, diversity kind of goes out of the window in such an engineered system.
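The diverse fuel cross-check just described can be sketched as two independent estimates and a comparison. Everything here is hypothetical: the function names, units, sample rates and the 5% agreement threshold are invented for illustration, not taken from any real fuel system.

```python
# Hypothetical sketch of the diverse fuel cross-check described above:
# compare directly sensed tank contents against a dead-reckoned figure
# (initial load minus integrated fuel flow) and flag disagreement.

def computed_fuel(initial_kg, flow_samples_kg_per_s, sample_period_s):
    """Dead-reckoned fuel remaining, from initial load and flow history."""
    burned = sum(flow_samples_kg_per_s) * sample_period_s
    return initial_kg - burned

def cross_check(sensed_kg, computed_kg, tolerance=0.05):
    """True if the two diverse estimates agree within the tolerance."""
    reference = max(sensed_kg, computed_kg)
    return abs(sensed_kg - computed_kg) <= tolerance * reference

computed = computed_fuel(10000.0, [1.2] * 3600, 1.0)  # one hour at 1.2 kg/s
print(computed)                       # ~5680 kg remaining by dead reckoning
print(cross_check(5600.0, computed))  # True  -- the estimates agree
print(cross_check(3000.0, computed))  # False -- investigate before trusting either
```

The design point is that neither channel is trusted alone: a large discrepancy tells the crew that one of the two diverse methods has gone wrong, before either is relied upon.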

But maybe we can bring a human in.

So, another classic in the air world: we give pilots instruments in order to tell them what’s going on with the aeroplane, but we also suggest that they look out of the window to look at reality and cross-check. Which is great if you’re not flying in cloud or in darkness, where there are maybe no visual references, so you can’t necessarily cross-check. But even for things like system failures, can the pilot look out of the window and see which propeller has stopped turning? Or which engine the smoke and flames are coming out of? That might sound basic and silly, but there have been lots of very major accidents where that hasn’t been done, and the pilots have shut down the wrong engine or managed the wrong emergency. And not just pilots, but operators of nuclear power plants and all kinds of things. So, visual inspection – going and looking at stuff if you have time – or taking some diverse way of checking what’s going on, can be very helpful if you’re getting confusing results from instrument or sensor readings.

And those are examples of the terrific power of human diversity. Humans are good at taking different sensory inputs, fusing them together and forming a picture. Now, most of the time they fuse the data well and they get the correct picture, but sometimes they get confused by a system, or they get contradictory inputs, and they form the wrong mental model of what’s going on, and then you can have a really bad accident. So, thinking about how we alert humans, how we use alarms to get humans’ attention, and how we employ human factors to make sure that we give the humans the right inputs to form the right mental picture, or mental model, is very important. So, back to human factors again: especially important at this level, for Task 205.

And of course, there are many specialist common cause failure analysis techniques, so we can use fault trees. Normally in a fault tree, when you’ve got an AND gate, we assume that the two sub-events are independent, but we can use what are called ‘beta factors’ to say, “Let’s say event A and event B are not independent, but we think that 50 percent, or 10 percent, of the time they will happen at the same time”. So, you can put that beta factor in to change the calculation. So, fault trees can cope with non-independent events, providing you program the logic correctly and you understand what’s going on. And if there’s uncertainty in the beta factors, you must do some sensitivity modelling on the tree with different beta factors, or run multiple models of the tree. But again, we’re now talking about quantitative, or semi-quantitative, techniques with the fault tree – quite advanced techniques, where you would need a specialist who knows what they are doing in this area to come up with realistic results from that sensitivity analysis. The other thing you need to do, if the sensitivity analysis gives you an answer that you don’t want, is to do something about it, and not just file the analysis report away in a cupboard and pretend it never happened. (Not that that’s ever happened in real life, boys and girls, never, ever, ever. You see my nose getting longer? Sorry, let’s move on before I get sued.)
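The beta-factor idea above can be sketched numerically. This is a simplified illustration of one common form of the beta-factor model, with invented numbers, not a substitute for a properly parameterised fault tree.

```python
# A simplified beta-factor sketch for a 1oo2 (two-channel redundant)
# system, as in the AND-gate discussion above. Each channel fails with
# probability p over the mission; a fraction beta of failures are
# common cause and take out both channels at once. Figures are invented.

def duplex_failure_prob(p, beta):
    """P(both channels fail) under a simple beta-factor model:
    independent part ((1 - beta) * p)**2 plus common-cause part beta * p."""
    return ((1 - beta) * p) ** 2 + beta * p

p = 1e-3
print(duplex_failure_prob(p, 0.0))  # the naive independent answer (~1e-6)

# Sensitivity study over plausible beta factors:
for beta in (0.01, 0.05, 0.10):
    print(beta, duplex_failure_prob(p, beta))
```

Even a 1% common-cause fraction makes the answer roughly ten times worse than the naive independent calculation, which is exactly why the beta factor, and sensitivity analysis over it, matters.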

So, other classic techniques. Zonal hazard analysis looks at lots of different components in a compartment. If component A blows up, does it take out everything else in that compartment? Or if the compartment floods, what functionality do we lose in there? It’s particularly good for things like ships and planes, but also buildings with complex machinery – big plant where you’ve got different stuff in different locations. There are also things called particular risk analyses, where you think about unusual events. For example, what if a fan blade breaks in a jet engine? Can the engine contain the fan blade failure? And if not, you’ve got a very high energy piece of metal flying off somewhere – where does it go? Does it embed itself in the fuselage of the aeroplane? Does it puncture the pressure hull? Or, as has sadly happened occasionally, does it penetrate the cabin and injure passengers? Things like that – usually quite unusual things that are very domain- or industry-specific. And then there are common mode analysis techniques. A good example of a standard that incorporates all of these is ARP 4761, a civil aircraft standard which covers them quite well; there are many others.
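The core bookkeeping behind zonal hazard analysis is just a mapping from compartments to the functions hosted in them, then asking what is lost when a whole compartment is taken out. Here is a toy illustration of that idea; the compartment and function names are all invented.

```python
# A toy illustration of the zonal hazard analysis idea above: map each
# compartment (zone) to the functions hosted there, then ask what is
# lost if a whole compartment is taken out by fire, flood or a burst.

compartments = {
    "engine_room": {"propulsion", "main_power"},
    "aux_room":    {"backup_power", "fire_pumps"},
    "bridge":      {"navigation", "comms"},
}

def functions_lost(damaged_zones):
    """Union of all functions hosted in the damaged compartments."""
    lost = set()
    for zone in damaged_zones:
        lost |= compartments[zone]
    return lost

print(sorted(functions_lost({"engine_room"})))
# -> ['main_power', 'propulsion']
```

Laying the analysis out like this makes the classic zonal finding jump out: in this toy layout, a single engine-room event takes both propulsion and main power, so you would want those functions separated into different zones.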

Summary

In summary, I’ve emphasized the differences between Task 205 and Task 204. So, we might do a first pass of 205 and 204 where we’re essentially doing the same thing, just at different levels of granularity. We might do the whole system initially with 205, one big hexagon, and then we might break down the jigsaw and do some 204 at a more detailed level. But where 205 is really going to score is in its differences from 204. So, instead of just repeating that analysis at a higher level – which is still valuable – we really need to diversify if we want success. So, we need to think about the different purpose and timing of these analyses. We need to think about what we’re going to get out of going top-down versus bottom-up – different sides of the ‘V’ model, let’s say.

We need to think about the differences between looking at internals versus external interfaces and interactions, and we need to think of appropriate techniques and tools for all those things – and, of course, whether we need to do that at all! We will have an idea about whether we need to do that from all the previous analysis. So, if we’ve done our PHI or PHA, we’ve looked at the history and some simple functional techniques, we’ve involved end-users and we’ve learnt from experience. If we’ve done our early tasks, we’re going to get lots of clues about how much risk is present, both in terms of the magnitude of the risk and the complexity of the things that we’re dealing with. So, clearly, if we’ve got a very complex thing with lots of risks, where we could kill lots of people, we’re going to do a whole lot more analysis than for a simple low-risk system. And we’re going to be guided by the complexity, the risks and where the hot spots are, and go, “Clearly, I’ve got a particular interface or particular subsystem which is a hotspot for risk. We’re going to concentrate our effort there”. If you haven’t done the early analysis, you don’t get those clues. So, do the homework early, which is quite cheap, and that helps you direct effort for the best return on investment.

The second major point, which I talk about again and again, is that the client and end-user and/or the prime contractor need to do analysis early, in order to get the benefits, to help them set requirements for lower down the hierarchy, and to pass relevant information to the sub-contractors. Because if you leave the sub-contractors in isolation, they’ll do a hazard analysis in isolation, which is usually not as helpful as it could be. You get more out of it if you give them more context. So really, the ultimate client, the end-user, and probably the prime as well, all need to do this task, even if they’re subcontracting it to somebody else. Whereas, maybe, the sub-system hazard analysis of Task 204 could be delegated down to the sub-system contractors and suppliers – if they know what they’re doing and they’ve got the data to do it, of course. And if they haven’t, somebody further up the supply chain may have to do it.

And lastly, 204 and 205 are complementary, but not the same. If you understand that and exploit those similarities and differences, you will get a much more powerful overall result. You’ll get synergy – a win-win situation where the two different analyses complement and reinforce each other. And you’re going to get a lot more success, probably for not much more money, effort or time. If you’ve done that thinking exercise and really sought to exploit the two together, then you’re going to get a greater holistic result.

Copyright

So, that’s the end of our session for today. Just a reminder that I’ve quoted from the Mil. Standard 882, which is copyright free, but the contents of this presentation are copyright Safety Artisan, 2020.

For More …

And for more lessons and more resources, please do visit www.safetyartisan.com and you can see the videos at www.patreon.com/safetyartisan.

End

That’s the end of the lesson on System Hazard Analysis, Task 205. And that just reminds me to say thanks very much for watching, and look out for the next in the series of Mil-Std-882 tasks. We will be moving on to Task 206, which is Operating and Support Hazard Analysis (O&SHA), a quite different analysis to what we’ve just been talking about. Well, thanks very much for watching, and it’s goodbye from me.

The End

You can find a free pdf of the System Safety Engineering Standard, Mil-Std-882E, here.


Safe Design (Full)

Want some good guidance on Safe Design? In this 52-minute video from the Safety Artisan, you will find it. We take the official guidance from Safe Work Australia and provide a value-added commentary on it. The guidance integrates seamlessly with Australian law and regulations, but it is genuinely useful in any jurisdiction.

A free video on ‘Good Work Design’ is available here.

This is the three-minute demo of the full, 52-minute-long video.

Topics: Safe Design

  • A safe design approach;
  • Five principles of safe design;
  • Ergonomics and good work design;
  • Responsibility for safe design;
  • Product lifecycle;
  • Benefits of safe design;
  • Legal obligations; and
  • Our national approach.

Transcript: Safe Design


Hello, everyone, and welcome to the Safety Artisan, where you will receive safety training via instructional videos on system safety, software safety and design safety. Today I’m talking about design safety. I’m Simon and I’m recording this on the 12th of January 2020, so our first recording of the new decade and let’s hope that we can give you some 20/20 vision. What we’re going to be talking about is safe design, and this safe design guidance comes from Safe Work Australia. I’m showing you some text taken from the website and adding my own commentary and experience.

Topics

The topics that we’re going to cover today are: a safe design approach; five principles of safe design; ergonomics (or, more broadly, human factors); who has responsibility; doing safe design through the product lifecycle; the benefits of it; our legal obligations in Australia (though this is good advice wherever you are); and the Australian national approach to improving safe design in order to reduce casualties in the workplace.

Introduction

The idea of safe design is that it’s about integrating hazard identification and risk assessment early in the design process, to eliminate or reduce risks throughout the life of a product – whatever the product is; it might be a building, a structure, equipment, a vehicle or infrastructure. This is important because in Australia, in a five-year period, we suffered almost 640 work-related fatalities, of which almost 190 were caused by unsafe design, or design-related factors contributed to the fatality. So, there’s an important reason to do this stuff; it’s not an academic exercise, we’re doing it for real reasons. And we’ll come back to the reasons why we’re doing it at the end of the presentation.

A Safe Design Approach #1

First, we need to begin safe design right at the start of the lifecycle (we will see more of that later), because it’s at the beginning of the lifecycle that you’re making your big decisions about requirements. What do you want this system to do? How do we design it to do that? What materials, components and subsystems are we going to make or buy in order to put this thing together, whatever it is? And we’re thinking about how we are going to construct it, maintain it, operate it and then get rid of it at the end of life. So, there are lots of big decisions being made early in the life cycle. And sometimes these decisions are made accidentally, because we don’t consciously think about what we’re doing. We just do stuff, and then we realise afterwards that we’ve made a decision with sometimes quite serious implications. A big part of my day job as a consultant is trying to help people think about those issues and make good decisions early on, when it’s still cheap, quick and easy to do. Because, of course, the more you’ve invested in a project, the more difficult it is to make changes, both from a financial point of view and because, if people have invested their time, sweat and tears in a project, they get very attached to it and they don’t want to change it; there’s an emotional investment made in the project. So, the earlier you get in – at the feasibility stage, let’s say – and think about all of this stuff, the easier it is to do. A big part of that is: where is this kit going to end up? What legislation, codes of practice and standards do we need to consider and comply with? So that’s the approach.

A Safe Design Approach #2

So, designers need to consider how safety can be achieved through the lifecycle. For example, can we design a machine with protective guarding so that the operator doesn’t get hurt using it, but also so that the machine can be installed and maintained? That’s an important point, as often to get at stuff we must take it apart, and maybe we must remove some of those safety features. How do we then protect the maintainer when the machine is opened up, and the workings are things that you can get caught in or electrocuted by? And how do we get rid of it? Maybe we’ve used some funky chemicals that are quite difficult to dispose of. In Australia, as I suspect in many other places, we’ve got a mountain of old buildings that are full of asbestos, which is costing a gigantic sum of money to get rid of safely. Or we need to design a building which is fit for occupancy. Maybe we need to think about occupants who are not able-bodied, or people moving stuff around the building who need a trolley to carry it. So, we need access; we need sufficient space to do whatever it is we need to do.

This all sounds simple and obvious, doesn’t it? So, let’s look at these five principles. First of all, a lot of this you’re going to recognise from the legal sessions, because the principles of safe design are very much tied in and integrated with the Australian legal approach, WHS; it’s all consistent and it all fits together.

Five Principles of Safe Design

Principle 1: Persons with control. If you’re making decisions that affect the design of products, facilities or processes, it is your responsibility to think about safety; it’s part of your due diligence (if you recall that phrase from that session).

Principle 2: We need to apply safe design at every stage in the lifecycle, from the very beginning right through to the end. That means thinking about risks and eliminating or managing them as early as we can but thinking forward to the whole lifecycle; sounds easy, but it’s often done very badly.

Principle 3: Systematic risk management. We need to apply these things that we know about (and listen to other broadcasts from The Safety Artisan; we go on and on about this because it is our bread and butter as safety engineers and safety professionals): identify hazards, assess the risks and think about how we will control the risks in order to achieve a safe design.

Principle 4: Safe design, knowledge and capability. If you’re controlling the design, if you’re doing technical work or you’re managing it and making decisions, you must know enough about safe design and have the capability to put these principles into practice to the extent that you need to discharge your duties. When I’m thinking of duties, I’m especially thinking of the health and safety duties of officers, managers and people who make decisions. You need to exercise due diligence (see the Work Health and Safety lessons for more about due diligence).

Principle 5: Information transfer. Part of our duties is not just to do stuff well, but to pass on the information that the users, maintainers, disposers, etc will need in order to make effective use of the design safely. That is through all the lifecycle phases of the product.

So those are the five principles of safe design, and I think they’re all obvious really, aren’t they? So, let’s move on.

A Model for Safe Design

As the saying goes, a picture is worth a thousand words. Here is the overview of the Safe Design Model, as they call it. We’ve got activities in a sequence running from top to bottom down the centre. Then on the left we’ve got monitor and review; that is, somebody in a management or controlling function keeping an eye on things. On the right-hand side, we need to communicate and document what we’re doing. And of course, it’s not just documentation for documentation’s sake; we need to do this in order to fulfil our obligations to provide all the necessary information to users, etc. So, that’s the basic layout.

If we zoom in on the early stage, Pre-Design, we need to think about: what problem are we trying to solve? What are we trying to do? What is the context for our enterprise? And that might be easy if you’re dealing with a physical thing. If you build a car, you make cars to be driven on the road. So, there’ll be a driver and maybe passengers in the car, and there’ll be other road users around you, pedestrians, etc. So, with a physical system, it’s relatively easy, with a bit of imagination and a bit of effort, to think about who’s involved. And not just use, but maintenance as well. Maybe it’s got to go into a garage for a service, etc. How do we make things accessible for maintainers?

And then we move on to Concept Development. We might need to do some research, gather some information, think about previous systems or related systems and learn from them. We might have to consult with some stakeholders who are going to be affected by this enterprise. We put all of that together and use it to help us identify hazards. Again, if we’re talking about a physical system – say you make a new model of car – it’s probably not that different from the previous model of car that you designed. But of course, every so often you do encounter something that is novel, that hasn’t been done before, or maybe you’re designing something that is virtual, like software. Software is intangible, and with intangible things it’s harder to do this. It requires a bit more effort and more imagination. It can be done – don’t be frightened of it – but it does require a bit more forethought, effort and imagination than is perhaps the case with something simple and familiar like a car.

Moving on in the life cycle, we have Design Options. We might think about several different solutions. We might generate some solutions, and we might analyse and evaluate the risks of those options before selecting which option we’re going to go with. This doesn’t happen very often in reality, because often we’re designing something that’s familiar, or people go, “Well, actually, I’m buying a bunch of kit off the shelf (i.e. a bunch of components) and I’m just putting them together, so there is no optioneering to do”. That’s actually incorrect, because very often people do optioneering by default, in that they buy the component that is cheap and readily available, but they don’t necessarily ask: is the supplier going to provide the safety data that I need to go along with this component? And maybe the more reputable supplier that does do that is going to charge you more. So, you need to think about where you are going to end up with all of this and evaluate your options accordingly. And of course, if you are making a system that is purely made from off-the-shelf components, there’s not a lot of design to do; there is just integration.

Well, that pushes all your design decisions and all your options much earlier on in the lifecycle – much higher up on the diagram, as we see here. So, we are still making design options and design decisions, but maybe it’s just not as obvious. I’ve seen a lot of projects come unstuck because they just bought something that they liked the look of; it appealed to the operators (if you live in an operator-driven organisation, you’ll know what I mean). Some people buy stuff because they are magpies and it looks shiny, fun and funky. Then they buy it, and people like me come along and start asking awkward questions about how they are going to demonstrate that this thing is safe to use and that they can put it in service. And then, of course, it doesn’t always end well if you don’t think about these things up front.

So, moving on to Design Synthesis. We’ll select a solution, put stuff together, and work on controlling the risks in the system that we are building. I know it says eliminating and controlling risks; if you can eliminate risks, that’s great, but very often you can’t. So, we have to deal with the risks that we cannot get rid of, and there are usually some risks in whatever you’re dealing with.

Then we get to Design Completion, where we implement the design, where we put it together and see if it does come together in the real world as we envisaged it. That doesn’t always happen. Then we have got to test it and see whether it does what it’s supposed to do. We’re normally pretty good at testing for that, because if you set out requirements for what it’s supposed to do, then you’ve got something to test against. And of course, if you’re trying to sell a product or service, or you’re trying to get regulators or customers to buy into this thing, it’s got to do what you said it’s going to do. So, there’s a big incentive to test the thing to make sure it does what it should do. We’re not always so good at testing it to make sure that it doesn’t do what it shouldn’t do. That can be a bigger problem space, depending on what you’re doing. And that is often the trick, and that’s where safety gets involved. The requirements engineers and systems engineers are great at saying, “Yeah, here are the requirements; test against the requirements”. And then it’s the safety people that come along and say, “Oh, by the way, you need to make sure that it doesn’t blow up, catch fire, or get so hot that it can burn people. You need to eliminate the sharp edges. You need to make sure that people don’t get electrocuted when operating or maintaining this thing or disposing of it. You must make sure they don’t get poisoned by any chemicals that have been built into the thing.” Even thinking about: if I’ve had an accident and the vehicle, or whatever it is, has been damaged or destroyed, and I’ve now got debris spread across the place, how do we clear that up? For some systems that can be a very challenging problem.

Ergonomics & Work Design

So, we’re going to move on now to a different subject, and a very important subject in safe design. I think this is one of the great things about safe design and good work design in Australia – that it incorporates ergonomics. So, we need to think about the human interaction with the system, as well as the technical design itself, and I think that’s very important. It’s something that is very easy, especially for technical people, to miss. As engineers, some of us love diving into the detail – that’s where we feel comfortable, that’s what we want to do – and then maybe we sometimes miss the big picture: somebody is actually going to use this thing and make it work. So, we need to think about all of our workers to make sure that they stay healthy and safe at work. We need to think about how they are going to physically interact with the system, etc. It may not be just the physical system that we’re designing, but of course the work processes that go around it, which is important.

It is worth pointing out that in the UK I’m used to a narrower definition of ergonomics – a definition that’s purely about the physical way that humans interact with the system. Can they do so safely and comfortably? Can they do repetitive tasks without getting injured? That includes anthropometric aspects, where we think about the variation in size of human beings, of different sexes and different races, and how people fit in the machine or the vehicle, or interact with it. But the way that we talk about ergonomics in Australia is a much bigger picture than that. I would say don’t just think about ergonomics, think about human factors. It’s the science of people at work. So, let’s understand human capabilities and apply that knowledge in the design of equipment, tools, systems and ways of working that we expect the human to use. Humans are pretty clever beasts in many ways, and we’re still very good at things that a lot of machines are just not very good at. So, we need to design stuff which complements the human being and helps the human being to succeed, rather than just optimising the technical design in isolation. And this quotation is from the ARPANSA definition, because it was the best one that I could find in Australia. I will no doubt talk about human factors another time, in some depth.

Responsibilities

Under the law, (this is tailored for Australian law, but a lot of this is still good principles that are applicable anyway) different groups and individuals have responsibilities for safe design. Those who manage the design and the technical stuff directly and those who make resourcing decisions. For example, we can think about building architects, industrial designers, drafts people who create the design. Individuals who make design decisions at any lifecycle phase, that could be a wide range of people and of course not just technical people, but stakeholders who make decisions about how people are employed, how people are to interact with these systems, how they are to maintain it and dispose of it, etc. And of course, work health and safety professionals themselves. there’s a wide range of stakeholders involved here potentially. Also, anybody who alters the design, and it may be that we’re talking about a physical alteration to design or maybe we’re just using a piece of kit in a different context. we’re using a machine or a process or piece of software that was designed to do X, and we’re actually using it to do Y, which is more common than you might think. if we are not putting an existing design in a different context, which was never envisaged by the original designers, we need to think about the implications of both the environment on the design and the design on the environment and the human beings mixed up working in both. There’s a lot of accidents caused by modifying bits of kit, some might call it a signature accident in the U.K.: the Flixborough Chemical Plant explosion. That was one of the things that led to the creation of Molten Health and Safety in the UK and that was caused by people modifying a design and not fully realising the implications of what they were doing. Of course, the result was a gigantic explosion and lots of dead bodies. 
Hopefully, the things that we’re looking at won’t always be so dramatic, but nevertheless, people do ask designs to do some weird stuff.

If we want safe design, we can get it more effectively and more efficiently when the people who control and influence outcomes, and who make these decisions, get together and collaborate on building safety into the design, rather than trying to add it on afterwards, which in my experience never goes well. We want to get people together to think about these things up front, whether it’s a desktop exercise or a meeting somewhere. It requires some people to attend the meeting and prepare for it and so on, and we need records, but that’s cheap compared to later design stages. When we’re talking about industrial plant, or something that’s going to be mass-produced, making decisions later is always going to be more costly and less effective, and therefore less popular and harder to justify. Get in and do it early while you still can. There’s some good guidance on all this stuff and on who is responsible.

There’s the Principles of Good Work Design handbook, which was created by Safe Work Australia and is also on The Safety Artisan website (I gained permission to reproduce it), and there’s a model Code of Practice for the safe design of structures. There was going to be a model Code of Practice for the safe design of plant, but that never became a Code of Practice; it’s just guidance. Nevertheless, there is a lot of good stuff in there. And there are the Work Health and Safety Regulations. Incidentally, there’s also a lot of good guidance available on Major Hazard Facilities (MHF), which are anywhere that large amounts of potentially dangerous chemicals are stored. The safety principles in the MHF guidance are very good and are generally applicable, not just for chemical safety, but for any large undertaking where you could hurt a lot of people. The MHF guidance, I believe, was originally inspired by the COMAH regulations in the UK, which again came from a major industrial disaster: the Piper Alpha platform in the North Sea, which caught fire and killed 167 people. It was a big fire. If you’ve got an enterprise where you could see a mass-casualty situation, you’ll get a lot more guidance from the MHF material, which is really aimed at preventing industrial-sized accidents. So there’s lots of good stuff available to help us.

Design for Plant

So, some examples of things that we should consider. We need to think about all phases of the life cycle (I don’t think this will be any great surprise to you; I think we’ve banged on about that enough), whether it be plant (a waste plant in this case) or whatever it might be, from design and manufacture right through to disposal. Can we put the plant together? Can we erect the structure and install it? Can we facilitate safe use? Again, think about the physical characteristics of your users, but not just the physical; think about the cognitive abilities of your users too. If we’re making a control system, can the users practically use it to exploit the plant for the purpose it was meant for, whilst staying safe? What can the operator actually do, and what can we expect them to perform successfully and reliably, time after time? Because we want this stuff to keep on working for a long, long time, in order to make money or to do whatever it is we want to do. And we also need to think about the environment in which the plant will be used; that’s very important.

Some more examples: think about intended use and reasonably foreseeable misuse. If you know that a piece of kit tends to get misused for certain things, then either design against that or give the operator a better way of doing it. A really strange example: apparently the designers of a particular assault rifle knew that soldiers tended to use a bit of the rifle as a can opener, or to open beer bottles, so they incorporated a bottle opener into the design so that the soldiers would use that rather than damage the rifle opening bottles of beer. A crazy example, but I think it’s memorable. We have to consider intended use by law; if you go to the WHS lesson, you’ll see that it’s written right through the duties of everybody. Reasonably foreseeable misuse, I don’t think, is a hard requirement in every case, but it’s still a sensible thing to consider.

Think about the difficulties that workers might face doing repairs or maintenance. Again, sorry, I’ve banged on about that; I came from a maintenance world originally, so I’m very used to those kinds of challenges. And consider what could go wrong. Here we’re getting back into classic design safety: think about the failure modes of your plant. Ideally, we always want a fail-safe, but if we can’t do that, how can we warn people? How can we make sure we minimise the risk if something goes wrong and a foreseeable hazard occurs? And by foreseeable, I’m not just saying that we shut ourselves in a darkened room and couldn’t think of anything; we need to look at real-world examples of similar pieces of kit. Look at real-world history, because there’s often an awful lot of learning out there that we can exploit, if only we bother to Google it or look it up in some way. As Bismarck, the great German leader, is supposed to have said: only a fool learns from his own mistakes; a wise man learns from other people’s mistakes. That’s what we try to do in safety.

Product Lifecycle

Moving on to life cycle, this is a key concept. Again, we’ve gone on and on about this. We need to control risks not just during use, but during construction and manufacture, in transit, when the kit is being commissioned and tested, when it’s being used and operated, and when it’s being repaired, maintained, cleaned or modified. And then there’s decommissioning. I say the end of life, but it may not be the end of life when the kit is being decommissioned; maybe we’re decommissioning kit to move it to a new site, or maybe a new owner has bought it. We need to be able to safely take it apart, move it, and put it back together again. And of course, by that stage, we may have lost the original packaging, so we may have to think quite carefully about how we do this. Or maybe we can’t fully disassemble it as we did during the original installation, so maybe we’ve got to move an awkward bit of kit around. And then, at the actual end of life, how are we going to dismantle it or demolish it? Are we going to dispose of it, or ideally recycle it? Hopefully, if we haven’t built in anything too nasty or too difficult to recycle, we can do that. That would be a good thing.

It’s worth reminding ourselves that we get a safer product, one that is better for downstream users, if we eliminate and minimise hazards as early as we can. As I said before, in those early phases there’s more scope to design stuff out without compromising the design and without putting limitations on what it can do. Whereas when you’re adding safety in later, so often that is achieved only at a cost: it limits what the users can do, or maybe you can’t run the plant at full capacity, or whatever it might be, which is clearly undesirable. Designers must have a good understanding of the lifecycle of their kit, of the people who will interact with it, and of the environment in which it’s used. Again, if you’ve listened to me talking about system safety concepts, we hammer this point: it’s not just the plant, it’s what you use it for, the people who will use it, and the environment in which it is used. Especially for complex things, we need to take all of those into account, and it’s not a trivial exercise to do this.

Then, thirdly, as we go through the product life cycle, we may discover new risks, and this does happen. People make assumptions during the concept and design phases and, fair enough, you must make assumptions sometimes in order to get anything done. But those assumptions don’t always turn out to be completely correct, or something gets missed; we often miss some aspect, and it’s the thing you didn’t anticipate that often gets you. So, as we go through the lifecycle, we can further improve safety if the people who have control over the decisions and actions that are taken incorporate health and safety considerations at every stage, and proactively look at whether we can make things better, or whether something has occurred that we didn’t anticipate and that therefore needs to be looked into. Another good principle, which doesn’t always happen, is that we shouldn’t proceed to the next stage in the life cycle until we have completed our design reviews, we have thought about health and safety along with everything else, and those who are in control have had a chance to consider everything together and, if they’re happy with it, to approve it so that it moves on. It will come as no surprise to many listeners that there are a lot of projects out there that either don’t put in design reviews at all, or where you see design reviews being rushed. Lip service is paid to them very often, because people forget that design reviews are there to review the design and to make sure it’s fit for purpose and safe and all the other good things. We just get obsessed with getting through those design reviews: whether we’re the purchaser, keen to get on with the job because the schedule must be maintained at all costs; or whether we’re the supplier, wanting to get through those reviews because there’s a payment milestone attached to them. There’s a lot of temptation to rush these things, and rushing them often just results in more trouble further downstream.
I know it takes a lot of guts, particularly early in a project, to say: no, we’re not ready for this design review; we need to do some more work so that we can get through it properly. That’s a big call to make, because not a lot of people are going to like you for making it, but it does need to be made.

Benefits of Safe Design

So, let’s talk about the benefits. These are not my estimates; these are Safe Work Australia’s words. From what they’ve seen in Australia, and I suspect from surveying safety performance elsewhere as well, they feel that building safety into a plant can save you up to 10 percent of its cost, whether through reductions in holdings of hazardous materials, reduced need for personal protective equipment, or reduced need for field testing and maintenance. And that’s a good point. Very often we see large systems, large enterprises, brought into being without sufficient consideration of these things, and people think only about the capital costs of getting the kit into service. Now, if you’re spending millions or even billions on a large infrastructure project, of course you will focus on the upfront costs of that infrastructure, and of course you are focused on getting that stuff into service as soon as possible so you can start earning money to pay for the capital costs. But it’s also worth thinking about safety upfront (and a lot of other design disciplines as well, of course), and making sure that you’re not building yourself a lifetime full of pain, doing maintenance and testing that, to be honest, you really don’t want to be doing, but that you have no choice about because you didn’t design something out. And hopefully we can eliminate or minimise the direct costs of unsafe design, which can be considerable: rework, compensation, insurance, environmental clean-up. You can be prosecuted by the government for criminal transgressions, and you can be sued by the relatives of the dead, by the injured, by the inconvenienced, and by those who’ve been moved off their land. And these things will impact parties downstream, not the designers; in fact often, but not always, just those who bought the product and used it.
So there’s a lot of incentive out there to minimise your liability, to get it right up front, and to be able to demonstrate that you got it right up front. Particularly if you’re a designer or a manufacturer and you’re worried that some of your users are maybe not as professional and conscientious in using your stuff as you would like, because it’s still got your name and your company logo plastered all over it.

I don’t think there’s anything new in here. There are many benefits, and we see the direct benefits: we’ve prevented injury and disease, and that’s good, not just your own but other people’s too. We can improve usability; very often, if you improve safety through improving human factors and ergonomics, you’re going to get a more usable product that people like using. It’s going to be more popular, and maybe you’ll sell more. You’ll improve productivity, so those who are paying for the output are happy. You’ll reduce costs through life (you might have to spend a little bit more money upfront), and we can better predict and manage operations because we’re not having so many outages due to incidents or accidents. Also, we can demonstrate compliance with legislation, which will help you sell the kit in the first place, but is also necessary if you’re going to get past a regulator, or indeed if you don’t want to be sent to jail for contravening the WHS Act. And then there’s innovation. I have to say innovation is a double-edged sword: some industries love innovation, and you’ll be very popular if you innovate; other industries hate innovation, and you will not be popular if you innovate. That last bullet, I’m not so sure it’s about innovation. Safe design doesn’t necessarily demand new thinking; it just demands thinking. Because most things that I’ve seen go wrong that were preventable could have been stopped with only a little bit of thought, a little bit of imagination, and a little bit of learning from the past, not just innovating for the future.

Legal Obligations

So that brings us neatly on to our legal obligations. In Australia (and in other countries there will be similar obligations), work health and safety law imposes duties on lots of people: designers, manufacturers, importers, suppliers, and anybody who puts the stuff together, puts it up, modifies it, or disposes of it. These obligations, as it says, will vary depending on the state or territory, or on whether Commonwealth WHS applies. But if it’s WHS, it’s all based on the model WHS law from Safe Work Australia, so it will be very similar. In the WHS lesson, I talk about what these duties are and what you must do to meet them. You will be pleased to know that the guidance on safe design is in lock step with those requirements. So this is all good stuff, not because I’m saying it, but because I’m showing you what’s come out of the statutory authority.

Yes, these obligations may vary; we talk about that quite a lot in other sessions. Those who make decisions, and not just technical people but also those who control the finances, have duties under WHS law. Again, go and see the WHS lesson that talks about the duties, particularly the duties of senior management (officers) and due diligence. There are specific safety due diligence requirements in WHS, which are very well written, very easy to read and understand. So there’s no excuse for not looking at this stuff; it is very easy to see what you’re supposed to do and how to stay on the right side of the law. And it doesn’t matter whether you’re an employer or self-employed, or whether you control a workplace or not: there are duties on designers upstream who will never go near the workplace in which the kit is actually used. If a client has a building or structure designed and built for leasing, they become the owner of the building, and they may well retain health and safety duties for the lifetime of that building if it’s used as a workplace or to accommodate workers.

Recap

I just want to briefly recap on what we’ve heard. The big lesson that I’ve learned in my career is that safe design is not just a technical activity for the designers. I’ve worked in many organisations where the pedigree, the history, of the organisation was that technical risks were managed over here and human or operational risks were managed over there, and there was a great gulf between them; they never interacted very much. There was a sort of handover point where the technical people would chuck the kit over the wall to the users and say: there, get on with it, and if you have an accident it’s your fault, because you’re stupid and you didn’t understand my piece of kit. And similarly, you’ve got the operators saying: all those technical people have got no idea how we use the kit or what we’re trying to do here, the big picture; they give us kit that is not suitable, or that we have to misuse in order to get it to do the job.

So, if you have these two teams playing separately, not interacting and not cooperating, it’s a mess. And certainly in Australia, there are very explicit requirements in the law and regulations, and a whole code of practice, on consultation, communication and cooperation. These two units, these two sides of the operation, have got to come together in order to make the whole thing work. And WHS law does not differentiate between different types of risk; there is just risk to people. So you cannot hide behind saying, well, I do technical risk, I don’t think about how people will use it: you’ve just broken the law. You’ve got to think about the big picture, and we can’t keep going on and on in our silos. A little bit of a heart-to-heart there, but that really, I think, is the value-add from all of this. The great thing about this safe design guidance is that it encourages you to think through life, to think about who is going to use the kit, and to think about the environment. And quite cheaply and quite quickly, you can make some dramatic improvements in safety by thinking about these things.

I’ve met a lot of technical people and, if a risk control measure isn’t technical, if it isn’t highly complicated and doesn’t involve clever engineering, then some of them have a tendency to look down their noses at it. What we need to be doing is looking at how we reduce risk and what the benefits are in terms of risk reduction. It might be a really simple thing, one that seems almost trivial to a technical expert, that actually delivers the safety, and that’s what we’ve got to think about, not about having a clever technical solution. If we must have a clever technical solution to make it safe, well, so be it; but we’d quite like to avoid that most of the time if we can.

Australian Approach

Just to bring this to a close: Australia, in the 10 years to 2022, has certain targets. We’ve got seven national action areas, and safe design is one of them. As I’ve said several times, Australian legislation requires us to consult, cooperate and coordinate, so far as is reasonably practicable, and we need to work together rather than chuck problems over the wall to somebody else. You might think you have delegated responsibility to somebody else, but if you’re an officer of the person conducting the business or undertaking, then you cannot ditch all of your responsibilities. So you need to think very carefully about what’s being done in your name, because legally it can come back to you. You can’t just assume that somebody else is doing it and will do a reasonable job; it’s your duty to ensure that it is done, that you’ve provided the resources, and that it is actually happening.

And what we want to achieve in this 10-year period is a real reduction: a 30% reduction in serious injuries nationwide, and a reduction in work-related fatalities of at least a fifth. These are specific and valuable targets; they’re real-world targets. This is not an academic exercise. It’s about reducing the body count and the number of people who end up in hospital, blinded or missing limbs or whatever. So it’s important stuff. And as it says, Safe Work Australia and all the different regulators have been working together with industry, unions and special interest groups to make this all happen. That’s all excellent stuff.

End: Safe Design

And it just remains for me to say that most of the text that I’ve shown you is from the Safe Work Australia website, and it has been reproduced under a Creative Commons licence; you can see the full details on the safetyartisan.com website. And just to point out that the words of this presentation itself are copyright of The Safety Artisan. (I’ve just realised that I drafted this in 2019 and it’s copyright 2020, but never mind; I started writing it in 2019.)

Now, if you want more videos, please subscribe to The Safety Artisan channel on YouTube. And that is the end of the presentation, so thank you very much for listening and watching. From The Safety Artisan, I wish you a successful and safe 2020. Goodbye.

Categories
Work Health and Safety

Guide to the WHS Act

This Guide to the WHS Act covers many topics of interest to system safety and design safety specialists. This full-length video explains the Federal Australian Work Health and Safety (WHS) Act (latest version, as of 14 Nov 2020). Brought to you by The Safety Artisan: professional, pragmatic, and impartial.

This is the four-minute demo of the full, 44-minute-long video.

Recap: In the Short Video…

which is here, we looked at:

  • The Primary Duty of Care; and
  • Duties of Designers.

Topics: Guide to the WHS Act

In this full video we will look at much more…

  • § 3, Object [of the Act];
  • § 4-8, Definitions;
  • § 12A, Exclusions;
  • § 18, Reasonably Practicable;
  • § 19, Primary Duty of Care;
  • § 22-26, Duties of Designers, Manufacturers, Importers, Suppliers & those who Install/Construct/Commission;
  • § 27, Officers & Due Diligence;
  • § 46-49, Consult, Cooperate & Coordinate;
  • § 152, Function of the Regulator; and
  • § 274-276, WHS Regulations and CoP.

Transcript: Guide to the WHS Act

Click here for the Transcript

Hi everyone, and welcome to The Safety Artisan, where you will find instructional videos like this one, with professional, pragmatic and impartial advice, which we hope you enjoy. I’m Simon, and I’m recording this on the 13th of October 2019. Today we’re going to be talking about the Australian Federal Work Health and Safety Act. Call it an unofficial guide for system or design safety practitioners, whatever you want to call yourselves, because I’m looking at the WHS Act from the point of view of system safety and design safety.

As opposed to managing the workplace, although it does that as well. A few days ago, I recorded a short video version of this, and in the short video we looked at the primary duty of care, particularly the duties of designers. So we spent some time looking at that, and that video is available for free on The Safety Artisan page at Patreon.com. It’s also available at safetyartisan.com, and you can watch it on YouTube; just search for Safety Artisan on YouTube.

Topics

So, in this video, we’re going to look at much more than that. I say selected topics because we’re not going to look at everything in the WHS Act; as you can see, there are several hundred sections to it, and we’d be here all day. So what we’re going to look at are things that are relevant to system safety and design safety. We’ll look very briefly at the object of the Act, at what it’s trying to achieve; just one slide of definitions, because there are a lot of them; and then the exclusions, because the Act doesn’t apply to everything in Australia.

We’re going to look at the Big Three. Really, the three principles that will help us understand what the Act is trying to achieve are:

  • What is reasonably practicable? That’s the phrase that I’ve used several times before.
  • What is the primary duty of care? That’s sections 18 and 19. And then if we jump to
  • Section 27: what are, or who are, officers, and what does due diligence mean in a WHS setting?

If I step back one, sections 22 to 26 cover the duties of various people in the supply chain; we covered that in the short session, so go ahead and look at that. Then, moving on, there are requirements for duty holders to consult, cooperate and coordinate, and then there’s a brief mention of the function of the regulator. And finally, the WHS Act enables the WHS Regulations and codes of practice, so we’ll just mention those. So those are the topics we’re going to cover; quite a lot to get through.

Disclaimer

So, first, a disclaimer from the federal legislation website, which reminds people looking at the site that the information put up there is for the benefit of the public and is free of charge.

So, when you’re looking at this stuff, you need to consider the relevance of the material for your purposes. Looking at the website is not a substitute for getting legal or appropriate professional advice relevant to your particular circumstances. So, a quick disclaimer there: it’s just a website with general advice, and hence this video is only as good as the content that’s being presented.

The Object of the Act

So, to the object of the Act. As you can see, I’m quoting from it, because I’m using quotation marks: the main object of the Act is to provide a balanced and nationally consistent framework for the health and safety of workers and workplaces.

And that’s important in Australia, because Australia is a federation. We’ve got states and territories, and we’ve got the federal government, or the Commonwealth as it’s usually known, and the laws of all those different bodies do not always line up. In fact, sometimes it seems like the states and territories delight in doing things that are different from each other and different from the Commonwealth. And that’s not particularly helpful if you’re trying to operate in Australia as a corporation, or you’re trying to do something big and invest in the country.

So, the model WHS Act was introduced to try and harmonize all this stuff, and you’ll see some more about that on the website. By the way, I’ve missed out some objectives; as you can see, I’m not covering subsections (b) to (h), so go and have a look at them online. But then, in section 2, there is the principle of giving the highest level of protection against harm to workers and other persons, as is reasonably practicable. A wonderful phrase again, which we will come back to.

Definitions

Now, there are lots of definitions in the Act, and it’s worth having a look at them. If you look at the session that I did on system safety concepts, I was using definitions from the UK standard. I did that for a reason: that set of definitions was very well put together, so it was ideal for explaining those fundamental concepts. The concepts in Australian WHS are very different, so if you are operating in an Australian jurisdiction, or you want to sell into one, do look at those definitions; being aware of what the definitions are will save you a lot of hassle in the long run.

Now, because as systems safety practitioners we’re interested in introducing complex systems into service, I’ve got the definitions here of plant, structure and substance. Basically, plant is any machinery, equipment, appliance, container, implement or tool; any component of those things; and anything fitted or connected to any of those things. So they’re going for a pretty broad definition. But bear in mind we’re talking about plant; we’re not talking about consumer goods. We’re not talking about selling toasters or electric toothbrushes to people; there’s other legislation that covers consumer goods.

Then, when it comes to structure, again we’ve got anything that is constructed, be it fixed or movable, temporary or permanent. It might include things on the ground, towers and masts, underground pipelines, infrastructure, tunnels and mining works, and any components or parts thereof. Again, a very broad definition. And similarly, substance means any natural or artificial substance, in whatever form it might be. So again, very broad. As you might recall from the previous session, a lot of the rules for designers, manufacturers, importers and suppliers cover plant, structures and substances; hence that’s why I picked just those three definitions out of the dozens there.

Exclusions

It’s worth mentioning briefly the exclusions: what the Act does not apply to. First, the Act does not apply to commercial ships, basically. In Australia, the Federal legislation covering the safety of people in the commercial maritime industry is the Occupational Health and Safety (Maritime Industry) Act 1993, usually known as “OSHMI”; that applies to commercial vessels, so WHS does not. The second exclusion is if you are operating an offshore petroleum or greenhouse gas storage platform, and I think it’s more than three nautical miles offshore.

But don’t take my word for that; if you’re in that business, go and check with the regulator, NOPSEMA. In that case, the Offshore Petroleum and Greenhouse Gas Storage Act 2006, or OPGGS for short, applies. So if you’re in the offshore oil industry, you’ve got a separate Commonwealth Act, but those are the only two exceptions. Where Commonwealth law applies, the only things that WHS does not apply to are commercial ships and offshore platforms. I mentioned state and territory versus Commonwealth: all the states and territories have adopted the model WHS system except Victoria, which so far seems to be showing no interest in adopting WHS.

Thanks, Victoria, for that. That’s very helpful! Western Australia is currently in a process of consultation to adopt WHS, but they’ve still got their current OH&S legislation. So just note that there are some exclusions there; if you’re in those jurisdictions, then WHS does not apply. And of course, there are many other pieces of legislation and regulation that cover particular kinds of risk in Australia. For example, there’s a separate Act, called ARPANS, that covers ionizing and non-ionizing radiation.

There are many other Acts that cover safety and environmental matters. But those specific Acts only apply to specific things, whereas the WHS Act is a general Act that applies to everything except those things that it excludes. Right, let’s move on.

So Far As is Reasonably Practicable

Okay, now we come to the first of those three big-ticket items, and I’ve got two slides here. In this definition, when it comes to ensuring health and safety, reasonably practicable means doing what you are reasonably able to do to achieve the highest standards of health and safety in the workplace.

That means considering and weighing up all the relevant matters, including (to take the first two) the likelihood of the hazard or risk occurring: how likely is this potential threat to human health to occur? And the degree of harm that might result from the hazard or risk. So we’ve got likelihood and degree of harm, or severity. If we recall, the fundamental definition of risk is that it’s a factor of those two things taken together. So in this first part, we’re thinking about: what is the risk?
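As a toy illustration only (the Act prescribes no scoring scheme; the scales, scores and rating thresholds below are invented for this sketch), combining likelihood and degree of harm into a single risk figure might look like this:

```python
# Hypothetical risk-scoring sketch: the two factors weighed up here
# (likelihood and degree of harm) are the first two matters listed
# in the 'reasonably practicable' definition. All values are invented.

LIKELIHOOD = {"rare": 1, "unlikely": 2, "possible": 3, "likely": 4, "almost certain": 5}
HARM = {"negligible": 1, "minor": 2, "moderate": 3, "major": 4, "fatal": 5}

def risk_score(likelihood: str, harm: str) -> int:
    """Combine the two factors into a simple ordinal score."""
    return LIKELIHOOD[likelihood] * HARM[harm]

def risk_rating(score: int) -> str:
    """Map the score onto an illustrative three-band rating."""
    if score >= 15:
        return "high"
    if score >= 8:
        return "medium"
    return "low"

# e.g. a 'possible' event with 'major' harm scores 3 * 4 = 12
score = risk_score("possible", "major")
print(score, risk_rating(score))  # 12 medium
```

Real schemes (such as those in the risk management Code of Practice) are qualitative matrices rather than arithmetic, but the principle of weighing both factors together is the same.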

And it’s worth mentioning that hazard is not defined in the Act, and risk is very loosely defined. The Act is being deliberately very broad here; it’s not taking a position on any particular style of approach to describing risks. So, to the second part.

Having thought about the risk, we should now consider what the person (PCBU or officer, whoever it might be) ought reasonably to know about the hazard or risk, and about the ways of eliminating or minimizing the risk. So: what we should know about the risk, and the ways of dealing with it, of mitigating and controlling it. And then we’ve got some more detail on these ways of controlling the risk.

We need to think about the availability and suitability of ways to eliminate or minimize the risk. Now, I’m probably going to do a separate session on reasonably practicable because there is a whole guidebook on how to do it. So, at some stage in the future we’ll go through that step by step: how you determine availability and suitability, et cetera. Once you get into it, it’s not too difficult. You just need to follow the guidelines, which are very clear and very well laid out.

So, having done all of those things, after assessing the extent of the risk and the available ways of controlling it, we can then think about the cost associated with those risk controls and whether that cost is grossly disproportionate to the risk. As we will see later, in the special session, if the cost is grossly disproportionate to the risk reduction then it’s probably not reasonable to implement that control, so you don’t necessarily have to do it. But let’s step back and just look at the whole thing.

So, in (a) and (b) we’re looking at the likelihood and severity of the risk; we’re assessing the risk, quantitatively or qualitatively. Then we’re thinking about what we could do about it and how available and suitable those risk controls are, and then putting it all together: how much will it cost to implement those risk controls, and is it reasonably practicable to do so? So, what we have here is basically a risk assessment process that leads us to a decision about which controls we need to implement in order to meet that ‘reasonably practicable’ requirement that you see in so many parts of the Act, and indeed in the definition itself.
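The weighing-up process just described can be sketched in code. To be clear, this is purely an illustrative sketch of the reasoning, not anything defined by the WHS Act: the `Control` fields, the dollar figures and the disproportion factor of 10 are all my own assumptions for demonstration, since the Act and the courts deliberately do not fix a numeric test.

```python
# Illustrative sketch of the SFAIRP weighing-up process described above.
# All names, figures and the disproportion factor are assumptions, not
# anything defined by the WHS Act itself.

from dataclasses import dataclass

@dataclass
class Control:
    name: str
    available: bool        # (c) can it actually be obtained?
    suitable: bool         # (c) does it work for this hazard?
    cost: float            # cost to implement ($)
    risk_reduction: float  # expected risk reduction ($-equivalent)

def grossly_disproportionate(cost: float, risk_reduction: float,
                             factor: float = 10.0) -> bool:
    """Illustrative test only: the Act fixes no number; courts weigh
    gross disproportion case by case."""
    return cost > factor * risk_reduction

def controls_to_implement(controls: list[Control]) -> list[str]:
    chosen = []
    for c in controls:
        if not (c.available and c.suitable):
            continue  # fails availability/suitability
        if grossly_disproportionate(c.cost, c.risk_reduction):
            continue  # cost grossly disproportionate to the risk
        chosen.append(c.name)
    return chosen

controls = [
    Control("machine guarding", True, True, cost=5_000, risk_reduction=50_000),
    Control("full robotic automation", True, True, cost=2_000_000, risk_reduction=60_000),
    Control("unobtainable exotic sensor", False, True, cost=1_000, risk_reduction=10_000),
]
print(controls_to_implement(controls))  # → ['machine guarding']
```

Here the guarding is implemented, the automation fails the (assumed) gross-disproportion test, and the unavailable control drops out at the availability step, mirroring the order of the factors in the definition.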

So, this is how we determine what is reasonably practicable: we follow a risk assessment process. There is a risk assessment Code of Practice, which I will do a separate session on, which gives you a basic minimum risk assessment process to follow that will enable us to decide what is reasonably practicable. Okay, quite a big topic there. And as I say, we’ll come back and do a couple more sessions on how to determine what’s reasonably practicable. So, moving on to the primary duty of care, which we covered in the short session.

The Primary Duty of Care

 So I’m not really going to go through this again [in detail] but basically our primary duty is to ensure so far as is reasonably practicable the health and safety of workers, whether we’ve engaged them whether we’ve got somebody else to engage them or whether we are influencing or directing people carrying out the work. We have a primary duty of care if we’re doing any of those things. And secondly, it’s worth mentioning that the person conducting a business or undertaking the PCBU must ensure the health and safety of other people. Say, visitors to the workplace are members of the public who happen to be near the workplace.

And of course, bear in mind that this law applies to things like trains and aircraft: if you have an accident with your moving vehicle or your plant, you could put people in danger (in the case of aeroplanes, anywhere in Australia and beyond). So, it’s not just about the workers in the workplace. With some systems, you’ve got a very onerous responsibility to protect the public, depending on what you’re doing. Now, for a little bit more detail that we didn’t have in the short session. When we say we must ensure health and safety, we’re talking about the provision and maintenance of a safe work environment, safe plant and structures, and safe systems of work. We’re talking about the safe use, handling and storage of plant, structures and substances.

 We’re talking about adequate facilities for workers that are talking about the provision of information, training, instruction or supervision. Those workers and finally the health of workers and conditions of the workplace are monitored if need be for the purpose of preventing illness or injury. So, there should be some general monitoring of health and safety-related incidents. And if you’re dealing with certain chemicals or are you intentionally exposing people to certain things you may have to conduct special monitoring looking for contamination or poisoning of those people whatever it may be. So, you’ve got quite a bit of detail there about what it means to carry out the primary duty of care.

And this is all consistent with the duties that we’ve talked about on designers, manufacturers, importers, and suppliers, and for all these things there are codes of practice giving guidance on how to do them. So, this whole work health and safety system is well thought through and well put together, in that the law says you’ve got to do this, and there are regulations and codes of practice giving you more information on how you can, and indeed must, fulfil your primary duty.

 And then finally there’s a slightly unusual part for at the end and this covers the special case where workers need to occupy accommodation under the control of the PCBU in order to get the job done. So you could imagine if you need workers to live somewhere remote and you provided accommodation then there are requirements for the employer to take care of those workers and maintain those premises so that they not exposed to risks.

 That’s a big deal because she might have a remote plant, especially in Australia which is a big place and not very well populated. You might be a long way away from external help. So if you have an emergency on-site you’re going to have to provide everything (not just an emergency you need to do that anyway) but if you’ve got workers living remotely as often happens in Australia you’ve got to look after those workers in a potentially very harsh environment.

And then finally it’s worth mentioning that self-employed persons have got to take care of their own health and safety. Note that a self-employed person is a PCBU, so even self-employed people have a duty of care as a PCBU.

The Three Duties

OK, sections 22 to 26 take that primary duty of care and elaborate it for designers, manufacturers, importers and suppliers, and for those installing, constructing or commissioning plant, substances and structures. And as we said in the free session, all of those roles, all of the PCBUs doing those things, have three duties. They have to ensure safety in a workplace, and that includes designing and manufacturing the thing, ensuring that it’s safe and meets Australian regulations and obligations.

They have a duty to test, which actually includes doing all the calculations, analysis and examination that’s needed to demonstrate safety. And they have a duty to provide the needed information to everybody who might use or come into contact with the system. So, those three duties apply consistently across the whole supply chain. Now, we’ve spent some time talking about that, so we’re going to move on. OK, so we are halfway through; a lot to take in. I hope you’re finding this useful and enjoying it. Let’s move on. Now, this is an interesting one.

Officers of the PCBU

Officers of the PCBU have additional duties. An officer of the PCBU might be a company director (that’s explicitly included in the definition) or a senior manager; somebody who has influence. Officers of the PCBU must exercise due diligence. So basically, the implied relationship is: you’ve got a PCBU, somebody directing work, whether it be design work, manufacturing, operating a piece of kit, whatever it might be. And then there are more senior people (the officers) who are in turn directing those PCBUs, so the officers must exercise due diligence to ensure that the PCBUs comply with their duties and obligations.

Sections 2 to 4 cover penalties for officers if they fail. I’m not going to discuss that because, as I’ve said elsewhere on the Safety Artisan website, I don’t like threatening people with penalties. I actually think that results in poor behavior; it results in people shirking and avoiding their duties rather than embracing them and getting on with it. That’s what happens if you frighten people by telling them what’s going to happen to them when they get it wrong. So, I’m not going to go there. If you’re interested, you can look up the penalties for various people, which are clearly laid out. We move on to Section 5.

Due Diligence

 We’re now talking about what is due diligence in the context of health and safety. OK, I need to be precise because the term due diligence appears in other Australian law in various places meaning various things, but here this is the definition of due diligence within the WHS context. So, we’ve got six things to do in order to demonstrate due diligence.

So, first, officers must acquire and keep up to date with knowledge of work health and safety matters, obligations and so forth. Secondly, officers must gain an understanding of the nature of the operations of the PCBU and the hazards and risks they control. So, if you’re a company director, you need to know something about what the operation does. You cannot hide behind “I didn’t know”, because it’s a legal requirement for you to know. So that closes off a whole bunch of defenses in court; you can’t plead ignorance, because ignorance is, in fact, illegal. And you’ve got to have a general understanding of the hazards and risks associated with those operations. So, you don’t necessarily have to be up on all the specifics of everything going on in your organization but, whatever it is that your organization does, you should be aware of the general hazards and risks associated with that kind of business.

Now, thirdly, we are moving on; basically, parts c, d, e and f refer to appropriate resources and processes. The officers have got to ensure that PCBUs have available, and use, appropriate resources and processes in order to control risks. OK, so that says you’ve got to provide those resources and processes, and have supervision, or some kind of process or requirement (let’s say a safety management system) that ensures people do actually use the stuff that they are supposed to use in order to keep themselves safe.

 And that’s very relevant of course because often people don’t like wearing, for example, protective personal protective equipment because it’s uncomfortable or slows you down, so the temptation is to take it off. Moving on to part D we’re still on the appropriate processes; we must have appropriate processes for receiving and considering information on incidents, hazards and risks. So again, we’ve got to have something in place that keeps us up to date with the incidents, hazards and risks in our own plants and maybe similar plants in the industry and, we need a process to respond in a timely way to that information.

So, if we discover that there is a new incident or hazard that we didn’t previously know about, we need to respond and react to that quickly enough to make a difference to the health and safety of workers. So again, that works in concert with part b, doesn’t it? In parts a and b we need to keep up to date on the risks and what’s going on in the business and, in part e, we need to ensure that the PCBU has processes for compliance with any duty or obligation, and follows them.

In the system safety world, often the designers will need to provide the raw material that becomes those processes. Or maybe, if we’re selling the product, we sell it with an instruction manual containing all the processes that could be required.

And then, finally, the officers must verify the provision and use of the resources and processes that we’ve been talking about in c, d and e. So, we’ve got a simple six-point program that comprises due diligence but, as you can see, it’s very much to the point and it’s quite demanding. There’s no shirking this stuff or pretending you didn’t know, and I suspect it’s designed to hang company directors who neglect and abuse their workers such that, as a result, harm comes to them.

But, I mean, ultimately, let’s face it, this is all good common-sense stuff. We should be doing this anyway, and in any kind of high-risk industry we should have a safety management system that does all of this and more. This is only the minimum required for all industries and all undertakings in Australia. OK, let’s move away from the big stick and talk about some cozier, softer stuff.

Consult, Cooperate and Coordinate

If you are a duty holder, if you’ve got a duty of care to people as a PCBU or an officer, you must consult, cooperate and coordinate your activities with all other officers and PCBUs who have a duty in relation to the same matter.

So, perhaps you are a supplier of kit, and you get information from the designer or the manufacturer with updates on safety, or maybe they inform you of problems with the kit. You must pass that on. Let’s imagine you’re introducing a complex system into service: there are going to be lots of different stakeholders, and you all must work together in order to meet WHS obligations. So, there’s no excuse for trying to pass the buck to other people.

That’s not going to work if you haven’t actively managed the risk, as you are potentially already doing something illegal and again, we won’t talk about the penalties of this. We’re just talking about the good things we’re expected to do. So, we’re trying to keep it positive. And you’ve got a duty to consult with your workers who either carry out work or who are likely to be directly affected by what’s going on and the risks. Now, this is a requirement that procedures in Sections 2 and 3, but of course we should be consulting with our workers because they’ve often got practical knowledge about controlling risks and what is available and suitable to do so, which we will find helpful.

So, consulting workers is not only a duty, it’s actually a good way of doing business, and doing business efficiently. So, moving on to Section 152.

The Regulator

There are several sections about the regulator but, to my mind, they don’t add much. So, we’re just going to talk about Section 152, the functions of the regulator, and the regulator has several functions. They give advice and make recommendations to the relevant minister, or the Commonwealth Minister for the Commonwealth version. They monitor and enforce compliance with the Act.

They provide advice and information to duty holders and the community; they collect, analyse and publish statistics; they’re supposed to foster a co-operative, consultative relationship in the community; to promote and support education and training; and to engage in, promote and coordinate the sharing of information. And then, finally, they’ve got some legal duties with courts and industrial tribunals, and here’s the catch-all: any other function conferred on the regulator by the Act. Looking at the first six, the ones that I’ve highlighted: there are a number of regulators in Australia because of the complexity of our federal system of government.

It’s not always clear which regulator you need to deal with, and not all regulators are very good at this stuff, I have to say, having worked in Europe, America and Australia. For example, on part d, Australian regulators are not very good at analyzing and publishing statistics in general. If you want high-quality statistics from a regulator, you’re usually better off looking at a European or American regulator in your industry. The Aussie ones don’t seem to be very good at that, in general.

There are exceptions. NOPSEMA, for example, in the offshore world, is particularly good. But then you would expect that, because of the inherent dangers of offshore operations. Otherwise, I’ve not been that impressed with some of the regulators. The exception to that is Safe Work Australia. So, if you’re looking for advice and information, statistics, education and training, and sharing of information, then Safe Work Australia is your best bet. Now, ironically, Safe Work Australia is not a regulator.

Safe Work Australia

They are a statutory authority, and they created (in consultation with many others, I might say) the model WHS Act, the model regulations and the model Codes of Practice. So, if you go on their website you will find lots of good information, and indeed I tend to look at it to find information to post on The Safety Artisan. So, they’ve got some good WHS information on there. But of course, whenever you look at their site, you must bear in mind that they are not the regulator of anything or anyone. So, you’ve also got to go and find the regulator relevant to your business or undertaking, and look at what your regulator requires you to do.

Nevertheless, very often, when it comes to looking for guidance, your best bet is Safe Work Australia. OK.

Regulations and Codes of Practice

I’ve mentioned regulations and codes of practice. Basically, these sections of the act enable those codes of practice and regulations so the Minister has power to approve Commonwealth codes of practice and similarly state and territory ministers can do the same for their versions of WHS. This is very interesting and we’ll come back to relook at codes of practice in another session. An approved code of practice is admissible in court as evidence, it’s admissible as the test of whether or not a duty or obligation under the WHS Act has been complied with.

And basically, the implication of this is that you are ignorant of codes of practice at your peril because, if something goes wrong, then codes of practice are what you will be judged against, at minimum. So, that’s a very important point to note, and we’ll come back to it in another session.

Next, codes of practice, and then regulation-making powers. For some reason unknown to me, it is the Governor-General who may authorize regulations, but that doesn’t really matter. The codes of practice and the regulations are out there, and the regulations are quite extensive, I think six hundred pages, so there’s a lot of stuff in there. And again, we’ll do a separate session on the WHS regulations soon. OK.

That’s All Folks!

I appreciate we’ve covered quite a lot of ground there but of course, you can watch the video as many times as you like and go and look at the Act online. Mentioning that all the information I’ve shown you is pretty much word for word taken from the federal register of legislation and I’m allowed to do that under the terms of the license.

Creative Commons Licence

 And it’s one of those terms I have to tell you that I took this information yesterday on the 12th of October 2019. You should always go to that website to find the latest on Commonwealth legislation (and indeed if you’re working on it state or territory jurisdiction you should go and see the relevant regulator’s legislation on their site). Finally, you will find more information on copyright and attribution at the SafetyArtisan.com website, where I’ve reproduced all of the requirements, which you can check. At the Safety Artisan we’re very pleased to comply with all our obligations.

Now, you may have seen this video on Patreon, on the Safety Artisan page, or you may have seen it elsewhere, but it is certainly available at Patreon.com/SafetyArtisan. Okay. So, all that remains for me to do is to sign off and say thanks for listening, and I look forward to presenting another session to you in a month’s time. Take care.

Back to the WHS Topic Page.

Categories
Mil-Std-882E Safety Analysis

Preliminary Hazard Analysis

In this 45-minute session, The Safety Artisan looks at Preliminary Hazard Analysis, or PHA, which is Task 202 in Mil-Std-882E. We explore Task 202’s aim, description, scope and contracting requirements. We also provide value-adding commentary and explain the issues with PHA – how to do it well and avoid the pitfalls.

This is the seven-minute-long demo video. The full video is 45 minutes long.

Topics: Preliminary Hazard Analysis

  • Task 202 Purpose;
  • Task Description;
  • Recording & Scope;
  • Risk Assessment (Tables I, II & III);
  • Risk Mitigation (order of preference);
  • Contracting; and
  • Commentary.

Transcript: Preliminary Hazard Analysis

Hello and welcome to the Safety Artisan, where you’ll find professional, pragmatic and impartial safety training resources. So, we’ll get straight on to our session and it is the 8th February 2020. 

Preliminary Hazard Analysis

Now we’re going to talk today about Preliminary Hazard Analysis (PHA). This is Task 202 in Military Standard 882E, which is a system safety engineering standard. It’s very widely used mostly on military equipment, but it does turn up elsewhere.  This standard is of wide interest to people and Task 202 is the second of the analysis tasks. It’s one of the first things that you will do on a systems safety program and therefore one of the most informative. This session forms part of a series of lessons that I’m doing on Mil-Std-882E.

Topics for This Session

What are we going to cover in this session? Quite a lot! The purpose of the task, a task description, recording and scope. How we do risk assessments against Tables 1, 2 and 3. Basically, it is severity, likelihood and the overall risk matrix.  We will talk about all three, about risk mitigation and using the order of preference for risk mitigation, a little bit of contracting and then a short commentary from myself. In fact, I’m providing commentary all the way through. So, let’s crack on.

Task 202 Purpose

The purpose of Task 202, as it says, is to perform and document a preliminary hazard analysis, or PHA for short, to identify hazards, assess the initial risks and identify potential mitigation measures. We’re going to talk about all of that.

Task Description

First, the task description is quite long here. And as you can see, I’ve highlighted some stuff that I particularly want to talk about.

It says “the contractor” [does this or that], but it doesn’t really matter who is doing the analysis, and actually, the customer needs to do some to inform themselves, otherwise they won’t really understand what they’re doing.  Whoever does it needs to perform and document PHA. It’s about determining initial risk assessments. There’s going to be more work, more detailed work done later. But for now, we’re doing an initial risk assessment of identified hazards. And those hazards will be associated with the design or the functions that we’re proposing to introduce. That’s very important. We don’t need a design to do this. We can get in early when we have user requirements, functional requirements, that kind of thing.

Doing this work will help us write better requirements for the system. So, we need to evaluate those hazards for severity and probability, based, it says, on the best available data. And of course, early in a program, that’s another big issue; we’ll talk about that more later. It says including mishap data as well, if accessible. ‘Mishap’ is the American term; it means accident, but it avoids any suggestion about whether the event is accidental or deliberate. It might be stupidity, deliberate, whatever: it’s a mishap, an undesirable event. We look for accessible data from similar systems, legacy systems and other lessons learned. I’ve talked about that a little bit in the Task 201 lesson, and there’s more on that today under commentary. We also need to look at provisions and alternatives, meaning design provisions and design alternatives, in order to reduce risks, and at adding mitigation measures to eliminate hazards or reduce the associated risk. We need to include all of that. So, that’s a good overview of the task and what we need to talk about.

Recording & Scope

First, recording and scope. As always with these tasks, we’ve got to document the results of the PHA in a hazard tracking system. Now, a word on terminology: we might call it a hazard tracking system; we might call it a hazard log; we might call it a risk register. It doesn’t really matter what it’s called. The key point is that it’s a tracking system. It’s a live document, as people say; a spreadsheet or a database, something like that. It’s something relatively easy to update and change, and we can track changes through the safety program as we do more analysis, because things will change. We should expect to get some results and to refine and change them as time goes on. Very important point.

Scope #1

Scope. Big section, this; we’ve got three slides on it. The scope of the PHA is to consider the potential contributions from a lot of different areas. We might be considering a whole system or a subsystem, depending on how complex the thing is we’re dealing with. And we’re going to consider the mishaps (the accidents and incidents, near misses, whatever) that might occur from components of the system (a. System components), energy sources (b. Energy sources), and ordnance (c. Ordnance); well, that’s bullets and explosives to you and me, rockets and that kind of stuff.

Then hazardous materials (d. Hazardous Materials (HAZMAT)), interfaces and controls (e. Interfaces and controls), and interface considerations to other systems (f. Interface considerations to other systems when in a network or System-of-Systems (SoS) architecture); that is, external systems. Maybe you’ve got a network of different systems talking to each other; sometimes that’s called a system-of-systems architecture. Don’t worry about the definitions: our system probably interacts and talks to other systems, or it relies on other systems in some way, or other systems rely on it. There are external interfaces; that’s the point.

Scope #2

We might think about material compatibilities (g. Material Compatibilities), where different materials and chemicals are not compatible with each other, and inadvertent activation (h. Inadvertent activation).

Now, I’ve highlighted I. (Commercial-Off-the-Shelf (COTS), Government-Off-the-Shelf (GOTS), Non-Developmental Items (NDIs), and Government-Furnished Equipment (GFE).) because it’s something that often gets neglected. We also need to think about stuff that’s already been developed. The general term is NDIs and it might be commercial off the shelf, it might be a government off the shelf system, or government-furnished equipment  GFE- doesn’t really matter what it is. These days, especially, very few complex systems are developed purely from scratch. We try and reuse stuff wherever we can in order to keep costs down and schedule down.

We’re going to need to integrate all these things and consider how they contribute to the overall risk picture. And as I say, that’s not often done well. Well, it’s hardly ever done well. It’s often not done at all. But it needs to be, even if only crudely. That’s better than nothing.

In item j (j. Software, including software developed by other contractors or sources. Design criteria to control safety-significant software commands and responses (e.g., inadvertent command, failure to command, untimely command or responses, and inappropriate magnitude) shall be identified, and appropriate action shall be taken to incorporate these into the software (and related hardware) specifications), we need to include software, including software developed elsewhere. Again, that’s very difficult and often not done well. Software is intangible and, if somebody else has developed it, maybe we don’t have the rights to see the design, or code, or anything like that; effectively, it’s a black box to us. But we need to look at software. I’m not going to bother going through all the blurb there.

Another big thing, in part k (k. Operating environment and constraints), is that we need to look at the operating environment, because a piece of kit that behaves in a certain way in one environment may behave differently when you put it in a different environment, and it might become much more dangerous. You never know. And we need to look at the constraints that we put on the system. The operating environment is a very big deal; in fact, if you see the lesson I did on the definition of safety, we can’t really define whether a system is safe or not until we define the operating environment. It’s that important. A big point there.

Scope #3

And then the third slide of three: procedures (l. Procedures for operating, test, maintenance, built-in-test, diagnostics, emergencies, explosive ordnance render-safe and emergency disposal). Again, these often don’t appear until later, unless of course we’ve got an off-the-shelf system. But if we have got an off-the-shelf system, there should be a user manual, there should be maintenance manuals, there should be warnings and cautions, all this kind of stuff. So, we should be looking for procedures for all these things to see what we can learn from them. We want to think about the different modes (m. Modes) of operation of the system. We want to think about health hazards (n. Health hazards) to people, and environmental impacts (o. Environmental Impacts), because this Task includes the environment.

We need to think about human factors: human factors engineering and human error analysis (p. Human factors engineering and human error analysis of operator functions, tasks, and requirements). It says operator functions, tasks and requirements, but there’s also maintenance, storage and disposal; all the good stuff. Again, human factors is another big issue that’s not often done well but, actually, if you get a human factors specialist in early, you can do a lot of good work and save yourself a lot of money, time and aggravation by thinking about things early on.

We need to think about life support requirements (q.  Life support requirements and safety implications in manned systems, including crash safety, egress, rescue, survival, and salvage). If the system is crewed or staffed in some way, I’m thinking about, well, ‘What happens if it crashes?’ ‘How do we get out?’ ‘How do we rescue people?’ ‘How do we survive?’ ‘How do we salvage the system?’

Event-unique hazards (r. Event-unique hazards). Well, that’s kind of a catch-all for when your system does something unusual. If it does something unusual, you need to think about it.

And then we think about part s: built infrastructure, real property installed equipment, and support equipment (s. Built infrastructure, real property installed equipment, and support equipment).

And then malfunctions (t. Malfunctions of the SoS, system, subsystems, components, or software) of all the above.

I’m just going to whizz back and forth. We’ve got to sub-item T there. We’ve got an awful lot of stuff there to consider. Now, of course, this is kind of a hazard checklist, isn’t it? It’s sort of a checklist of things. We need to look at all that stuff. And in that respect, that’s excellent, and we should aim to do something on all of them just to see if they’re relevant or not if nothing else. The mistake people often make is because they can’t do something perfect and comprehensive, they don’t do anything. We’ve got a lot of things to go through here. And it’s much better to have a go at all these things early and do a bit of rough work in order to learn some stuff about our system. It’s much better to do that than to do nothing at all. And with all of these things, it may be difficult to do some of these things, the software, the COTS, things where we don’t have access to all the information, but it’s better to do a little bit of work early than to do nothing at all waiting for the day to arrive when we’ll be able to do it perfectly with only information. Because guess what? That day never comes! Get in and have a go at everything early, even if it’s only to say, ‘I know nothing about this subject, and we need to investigate it.’ That’s the pros and cons of this approach. Ideally, we need to do all these things, but it can be difficult.

Risk Assessment

Moving on. Well, we’ve looked at a broad scope of things and, for all the hazards that we identify (and there are various techniques you can use), the PHA has got to include a risk assessment. That means that we’ve got to think about likelihood and severity which, when we combine the two together, gives us an overall picture of risk. That’s Tables 1 and 2.

And then – forget risk assessment codes, I’m not sure why that’s in there – Table III is the risk matrix, and 882 has a standard risk matrix. It says use that unless you’ve got a tailored matrix for your system that’s been approved for use. In this case, it says approved effectively in accordance with the US Department of Defense, but really it’s whoever is the acquiring organization – the authority, the customer, the purchaser, whatever you want to call it, the end-user. We’ll talk about that more in a sec.

Table I, Severity

Let’s start by looking at severity, which in many ways is the easiest thing to look at. In this standard, we’ve got an approach based on harm to people, harm to the environment, and monetary loss from smashing stuff up. At the top is the catastrophic accident. A Category 1 accident could result in death, permanent total disability, irreversible significant environmental impact, or monetary loss – in this case, $10 million or more. That’s 10 million US dollars; this version of the standard was created in 2012, and inflation has probably had an effect since then. A critical accident could cause permanent partial disability, injuries or occupational illness that may hospitalize at least three people, reversible significant environmental impact, or losses between $1 million and $10 million. Then we go down to marginal: injury or occupational illness resulting in lost workdays for one person, reversible moderate environmental impact, or monetary loss between $100,000 and $1 million. And finally, negligible is less than that: an injury or illness that doesn’t result in any lost time at work, minimal environmental impact, or a monetary loss of less than $100,000. That’s easy to do in this standard. We just ask, ‘What are the losses that we think could result, in a worst-case reasonable scenario for an accident?’ That’s straightforward.
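
The monetary thresholds above lend themselves to a simple lookup. Here is a minimal sketch, using the 2012 dollar figures from Table I; the function name is my own, and a real classification would take the worst of the people, environment and monetary criteria, not the monetary loss alone:

```python
# Sketch of Mil-Std-882E Table I severity categories, by monetary loss only
# (2012 US-dollar thresholds). In practice, you take the worst outcome across
# the harm-to-people, environmental, and monetary criteria for the scenario.

def severity_from_loss(loss_usd: float) -> tuple[int, str]:
    """Return (category number, category name) for a monetary loss in USD."""
    if loss_usd >= 10_000_000:
        return 1, "Catastrophic"
    if loss_usd >= 1_000_000:
        return 2, "Critical"
    if loss_usd >= 100_000:
        return 3, "Marginal"
    return 4, "Negligible"

print(severity_from_loss(250_000))   # (3, 'Marginal')
```

The band boundaries follow the text: $10M and up is catastrophic, $1M–$10M critical, $100k–$1M marginal, under $100k negligible.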

Table II, Probability

Now let’s look at probability. We’ve got a range here from ‘A’ to ‘E’, frequent down to improbable, and then ‘F’ is eliminated. And eliminated in the standard really does mean eliminated. It cannot happen, ever! It does not mean that we managed to massage the likelihood or probability figures down so low that we pretend it will never happen. It means that it is a physical impossibility. Please take note, because I’ve seen a lot of abuse of that approach. It’s bad practice to massage the figures down to a level where you say, ‘I don’t need to bother thinking about this at all!’, because the temptation is just to frig [massage] the figures and not really consider stuff that needs to be considered. Well, I’ll get off my soapbox now.

Let’s go back to the top. Frequent – for one item, likely to occur often. Down to probable – will occur several times in the life of an item. Occasional – likely to occur sometimes; we think it’ll happen once in the life of an item. Remote – we don’t think it’ll happen at all, but it could do. And improbable – so unlikely for an individual item that we might assume the occurrence won’t happen at all. But when we consider a fleet – particularly where we’ve got hundreds or thousands of items – the cumulative risk, or cumulative probability I should say, is such that the occurrence is still unlikely across the fleet, but it could happen.

And this is where this item-specific versus fleet occurrence, or probability, is useful. For example, let’s imagine a frequent hazard: we think that something could happen, per item, let’s say once a year. Now, if we’ve got a fleet of fifty of these items, that means it’s going to happen across the fleet pretty much every week, on average. That’s the difference. Sometimes it’s helpful to think about an individual system, and sometimes it’s helpful to think about a fleet, where you’ve got relevant experience to say, ‘Well, the fleet that we’re replacing – we had a fleet of 100 of these things, and this went wrong every week, or every month, or once a year, or it only happened once every 10 years across the entire fleet.’ And therefore we can reason about it that way.
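
The per-item versus fleet arithmetic is worth making concrete. A quick sketch, using the fleet-of-fifty numbers from the example above (not figures from the standard):

```python
# Per-item vs fleet-wide probability: an event expected once per item-year,
# across a fleet of 50 items, is expected roughly weekly across the fleet.

per_item_rate = 1.0   # expected occurrences per item, per year (example figure)
fleet_size = 50       # items in the fleet (example figure)

fleet_rate = per_item_rate * fleet_size    # occurrences per year, fleet-wide
mean_interval_days = 365 / fleet_rate      # average days between occurrences

print(f"{fleet_rate:.0f} per year fleet-wide, one every {mean_interval_days:.1f} days")
# 50 per year fleet-wide, one every 7.3 days
```

The same sum run backwards is how you use fleet history: if the old 100-strong fleet saw an event monthly, that’s roughly 12 / 100 = 0.12 occurrences per item-year.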

We’ve got two different ways of looking at probability here; use whichever one is more useful or helps you. But when we’re doing that, try to do it with historical data, not just subjective judgment. Because otherwise, with subjective judgment, one individual might say, ‘That will never happen!’, whereas another will say, ‘Well, actually, we experienced it every month on our fleet!’ Circumstances are different.

Table III, Risk Matrix

We put severity and probability together. We’ve got 1 to 4 on severity, A to F on probability, and we get this matrix, with probability down the side and severity along the top. In this standard, we’ve got high risk, serious risk, medium risk and low risk. Now, how exactly you define these things is, of course, somewhat arbitrary. We’ll just look at some general principles.

The first thing to remember is that risk is the product of probability and severity: effectively, we multiply the two together. If we’ve got a catastrophic or critical severity, and it’s going to happen often, that’s a high risk. Alternatively, if we’ve got a low-severity accident that we think will happen very, very rarely, then that’s a low risk. That’s great.

One thing to note here: it’s easier to estimate the severity than the probability. It’s quite easy to under- or overestimate probability, whereas – usually because of the physical mechanisms involved – it’s easier to estimate the severity correctly. If we look on the right-hand side, at negligible, we can see that if we’re confident that something is negligible, then it can be a low risk, but at the very most it can only be a medium risk. We are effectively prioritizing negligible-severity risks quite low down the pecking order.

On the other side, if we think we’ve got a risk that could be catastrophic – we could kill somebody or do irreversible environmental damage – then, however improbable we think it is, it’s never going to be classified lower than medium. That’s a good point to note. This matrix has been designed well, in the sense that catastrophic and critical risks are never going to drop below medium, and they can quite easily become serious or high. That means they’re going to get serious management attention. When you put risks up in front of a manager, a senior person, a decision-maker who’s responsible, and they see red and orange, they’re going to get uncomfortable and they’re going to want to know all about that stuff. And they will want to be confident that we’ve understood the risk correctly and that it’s as low as we can get it. This matrix is designed to get attention and to focus attention where it is needed.

And in this standard, in 882, you ultimately determine whether you can accept a risk based on this risk rating. In 882, there is no unacceptable or intolerable risk; you can accept anything if you can persuade the right person, with the right amount of authority, to sign it off. And the higher the risk, the higher the level of authority you must get in order to accept the risk and expose people to it. This matrix is very important because it prioritizes attention. It prioritizes how much time, effort and money gets spent on reducing risks; you will use it to rank things all the time. It also prioritizes, as we’ll see later, how often you review a risk, because clearly you don’t want to have high or serious risks – those are going to get reviewed more often than a medium or low risk. A low risk might just get reviewed routinely, not very often – maybe once a year or even less. We want to concentrate effort and attention on high risks, and this matrix helps us to do that. But of course, no matrix is perfect.
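
As a sketch, the Table III lookup can be expressed as a dictionary keyed by probability row and severity column. The High/Serious/Medium/Low cell assignments below are my reading of the standard matrix – check each cell against Table III in Mil-Std-882E before relying on it – but they do show the design principles just discussed: catastrophic never drops below Medium, and negligible never rises above it.

```python
# Sketch of the Mil-Std-882E Table III risk matrix as a lookup table.
# Rows: probability A (Frequent) .. E (Improbable); columns: severity 1..4.
# Verify each cell against Table III in the standard before relying on it.

MATRIX = {
    "A": {1: "High",    2: "High",    3: "Serious", 4: "Medium"},
    "B": {1: "High",    2: "High",    3: "Serious", 4: "Medium"},
    "C": {1: "High",    2: "Serious", 3: "Medium",  4: "Low"},
    "D": {1: "Serious", 2: "Medium",  3: "Medium",  4: "Low"},
    "E": {1: "Medium",  2: "Medium",  3: "Medium",  4: "Low"},
}

def risk_level(severity: int, probability: str) -> str:
    """Risk level for a (severity 1-4, probability A-F) pair."""
    if probability == "F":
        return "Eliminated"   # physically impossible, not merely unlikely
    return MATRIX[probability][severity]

print(risk_level(1, "E"))  # Medium - catastrophic never drops below Medium
print(risk_level(4, "A"))  # Medium - negligible never rises above Medium
```

A tailored matrix for a specific system would simply swap in a different (approved) table – including extra severity columns for multiple-fatality accidents, as discussed below.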

Now, if we go back and look at the yellow highlight: we’re going to use Table III unless there’s a tailored alternative – a tailored matrix. Note that in this matrix, catastrophic, the highest possible severity, means one death. Now, if we had a system where it was feasible to kill more than one person in an accident, then really we would need another column, worse than catastrophic. Imagine a vehicle with one person in it – a motorbike, let’s say. If we only ever have solo riders, we can only kill one person (assuming we won’t hurt anybody else). But if you’ve got a car with four or more people in it, you could kill several people. If you’ve got a coach or a bus, you could drive it off a cliff and kill everybody, or you might have a fire where some people die but most of them get out. You can see that for some vehicles, for some systems, you would need additional columns; killing one person isn’t the worst conceivable accident.

Take some systems – say a ship. It’s actually very rare for a ship to sink and everybody to die, but it’s quite common for individuals on ships to die in health-and-safety-type accidents, workplace accidents. In fact, being a merchant seaman is quite a risky occupation. And in between those two, it’s also quite possible to have a fire, or asphyxiating gases in a compartment: you can kill more than one person, but you won’t kill the entire ship’s company. Straight away, on a ship, you can see there are three classes, if you like, of serious accidents where you can kill people, and we should really differentiate between the three when we’re thinking about risk management. This matrix doesn’t allow you to do that. If you’ve got a system where more than one death is feasible, then this matrix isn’t necessarily going to serve you well, because all of those types of accidents get shoved into the catastrophic column and you can’t differentiate between them, which is not helpful. You may need to tailor your matrix and add further columns.

And depending on the system, you might want to change the way that those risks are distributed. Take riding a bicycle, for example. It’s very common, riding a bicycle, to get negligible-type injuries – you fall off, cuts and bruises, that kind of thing. But if you’re not on the road – let’s say you’re riding off-road – it’s quite rare to get fatalities, unless you’re mountain biking in some extreme environment. You’ve got to tailor the matrix for what you’re doing. I think we’ve talked about that enough; we’ll come back to it in later lessons, I’m sure.

Risk Mitigation

Risk mitigation: we’re not doing this analysis for the sake of it, we’re doing it because we want to do something about the risk – to reduce it, or eliminate it if we can. The 882 standard gives us an order of precedence, specified in Section 4.3.4, which I’ve reproduced here for convenience. Ideally, we would like to eliminate hazards through design selection. We would make a design decision to say, ‘We won’t have a petrol engine in this vehicle or vessel, because petrol is a serious fire and explosion hazard. We’ll have something else – diesel, or maybe an all-electric vehicle these days.’ We can eliminate the risk.

We could reduce the risk by altering the design – introducing fail-safe features, or making the design crashworthy, or whatever it might be. We could add engineered features or devices to reduce risk: safety features like seatbelts in cars, or airbags, roll bars, crash-survivable cages around the people, whatever it might be. We can provide warning devices to say, ‘Something’s going wrong here, and you need to pull over’, or whatever it is you need to do. ‘Watch out!’ – because the system is failing, and maybe ‘Your brakes are failing. You’ve got low brake fluid. Time to pull over now before it gets worse!’

And then finally, the least effective precautions or mitigations: signage – warning signs, which sadly nobody reads. Procedures – good if they’re followed but, again, very often people don’t follow them; they cut corners. We train people but, again, they don’t always listen to the training or carry it out. And we provide PPE – personal protective equipment. PPE is great if you enforce it. For example, I live in Australia. If you ride a bicycle in Australia, it’s the law that you wear a bike helmet. Most people obey the law because they don’t want to get a $300 fine, or whatever it is, if the cops catch them, but you still see people around who don’t wear one – presumably because they think they’re bulletproof and it will never happen to them. PPE is fine if it’s useful, but of course sometimes PPE can make a job so much harder that people discard it. We really need to think about designing a job to make it easy to do if we’re going to ask people to wear awkward PPE. Also, by the way, we must not ask people to wear PPE for trivial reasons, just so that managers can cover their backsides. If you ask people to wear PPE for trivial jobs where they don’t need it, it brings the system into disrepute, and then people end up not wearing PPE for jobs where they really should be wearing it. You can over-specify safety and lose goodwill amongst your workers if you’re not careful.

Now, that order of risk mitigation priorities is the one in this standard, but you will see an order of precedence like it in the law of many different countries. It’s the law in Australia; it’s the law in the UK, for example, expressed slightly differently. It’s in lots of different standards, for good reason: we want to design out the risks. We want to reduce them in the design, because that’s more effective than trying to bolt on safety afterwards. And that’s another reason why we want to get in early in a project and think about our hazards and risks early on. Because it’s cheaper at an early stage to say, ‘We will insist on certain things in the design. We will change the requirements to favour a design that is inherently safe.’

Contracting

We only get these things if we contract for them. The model in 882 – the assumption – is that it’s a government somewhere contracting a contractor to do stuff. But it doesn’t have to be a government; it can be any client, purchaser, authority or end-user asking for something, buying something, contracting for something, be it a physical system, or a service, or whatever it might be. The assumption is that the client issues a request for proposal.

Right at the start, they say, ‘I want a gizmo’. Or, ‘I don’t even want to specify that I want a gizmo. I want something that will do this job. I don’t care what it is – give me something that will do this job.’ But even at that early stage, we should be asking for preliminary hazard analysis (PHA) to be done. We should be asking, ‘Who? Which specialists? Which functional disciplines need to be involved?’ We need to specify the data that we require and the format it’s in, considering especially the hazard tracking system, which is Task 106. If we’re going to get data from lots of different people, it’s best we get it in a standardized format so we can put it all together. We want to insist that they identify hazards, hazardous locations, etc. We want to insist on getting technical data on non-developmental items – either the contractor gets it, or the client supplies it and says to the contractor doing the work, ‘This is the information that I’m going to supply you, and you will use it.’ We need to supply the concept of operations and, of course, the operating environment. The standard doesn’t mention the environment, but we do need to specify that as well – hopefully it should be in the concept of operations – along with any specific hazard management requirements. For example, what matrix are we going to use? What is a suitable matrix for this system?

Now, to do all of this, the purchaser, the client, really needs to have done Tasks 201 and 202 themselves, and done some thinking about all of this, in order to say, ‘With this kind of requirement, we can envisage that these risks might be applicable’, and ‘We think that the risks might be large or small’, depending on what the system is. Let’s say you purchase a jet fighter. Jet fighters, because of the overwhelming demand for performance, tend to be a bit riskier than airliners – they fall out of the sky more often. But the advantage is that there’s normally only one or two people on board, and jet fighters tend to fly a lot of the time in the middle of nowhere. You’re likely to hurt relatively few people, but it happens more often.

Whereas if you’re buying an airliner or something, you can shove a couple of hundred people in at one go. Those fall out of the sky much less frequently, thank goodness, but when they do, lots of people get hurt. A different approach to risk might be appropriate for different types of system, and you should be thinking about that early on if you’re the client, the purchaser. You should have done some analysis to enable you to write a good request for proposal, because if you write a bad request for proposal, it’s very difficult to recover the situation afterwards – you start at a disadvantage. Often the only way to fix it is to reissue the RFP and start again, and of course nobody wants to do that, because it’s expensive, it wastes a lot of time, and it’s very embarrassing. It is a career-limiting thing to do for a lot of people. You do need to do some work up front in order to get your RFP correct. That’s what it says in the standard.

Commentary

I want to add a couple of comments – I’m not going to say too much. First, there’s a little line from a poem by Kipling that I find very, very helpful. Kipling used to be a journalist, and it was his job to go out, find out what the story was, and report it. To do that he used his ‘six honest serving-men’: he asked ‘What?’ and ‘Why?’ and ‘When?’ and ‘How?’ and ‘Where?’ and ‘Who?’ Those are all good questions to ask. If you can ask all those questions and get definite answers, you’re doing well. And a little tip here: as a consultant, one of the tricks of the trade I use is to turn up as the ‘dumb consultant’ – I always pretend to be a bit dumber than I really am – and I ask these apparently stupid questions. And I ask the same questions of several different people. If I get the same answer to the same question from everyone, I’m happy. But that doesn’t always happen. If you start getting very different answers to the same question from different people, then you think, ‘Okay, I need to do some more digging here’. And it’s the same with hazard analysis. Ask the what, why, when, how, where and who questions.

Another issue, of course, is ‘How much?’ ‘How much is this going to cost?’ ‘How long is this going to take?’ ‘How many people am I going to have to invite to this meeting?’, etc. And that’s difficult. Really, the only way to answer these questions properly is to do some PHI and PHA early and learn from the results. The alternative – which we as human beings are really good at – is to ask the questions early, get answers that we don’t really like, and then just sweep them under the carpet and never ask those questions again, because we’re frightened of the answers we might get. However frightened you are of the answer you might get, do ask the question, because forewarned is forearmed. If you know about a problem, you can do something about it – even if that something is to rewrite your CV and start looking for another job. Do ask the questions, even if it makes people uncomfortable. And I guess learning how to ask the questions without making people uncomfortable is one of the tricks that we must learn as safety engineers and consultants. That’s an important part of the job – the soft skills that you can only learn through practice, really, and observing people.

What’s the way to do it? Well, I’ve said this several times, but do your PHI and PHA early – as early as possible, because it’s cheap to do it early. Often, in the beginning, you’re the only safety person; maybe you’re a manager, maybe safety is part of your portfolio and you’ve got other responsibilities as well. Just sit down one day and ask these dumb questions. Go through the checklist in Task 202 and say, ‘Do I have these things in my system?’

If you know for sure you’re not going to have explosive ordnance, or radiation, or whatever it might be, you can go, ‘Great, I can cross those off the list’. You can make an assumption or – if you really want to do it well – put in a constraint and say, ‘We will have no explosive devices’, ‘We will have no energetic materials’, ‘We will have no radiation’, or whatever it might be. If you insist that you’ll have none of it, then you can hopefully move on and never have to deal with those issues again.

Do the analysis early, but expect to repeat it, because things change, you learn more, and more information comes in. But of course, the further you go down the project, the more expensive everything gets. Now, having said do it early, the Catch-22 is that very often people think, ‘How can I analyse when I don’t have a design?’

The Catch-22 question is: what comes first, design or analysis? Now, the truth is that you can do analysis on very simple functions. You don’t need any design at all; you don’t even need to know what kind of vehicle or what kind of system you might be dealing with. But of course, that will only take you so far. And it may be that you want to do early analysis, but for whatever reason – Intellectual Property Rights (IPR), or whatever it might be – you can’t get access to data.

What do you do? You can’t get access to data about your system, or the system that you’re replacing. What do you do? Well, one of the things you can do is borrow an idea from the logistics people. Logistic support analysis Task 203 is the baseline comparison system. Imagine that you’re getting a new system – maybe it’s replacing an old system, but maybe it does a lot more than the old system used to do. Just looking at the old system isn’t going to give you the full picture. Maybe what you need to do is make up an imaginary comparison system. You take the old system and say, ‘Well, I’m adding all this extra functionality’. Maybe with the old system we just bought the vehicle; we didn’t buy the support system, we didn’t buy the weapons, we didn’t buy the training, whatever it might be. But this time round, we’re buying the complete package. We’re going to have all this extra stuff that probably has hazards associated with it, so just doing lessons learned from the previous system will not be enough.

Maybe you need to construct an imaginary Baseline Comparison System and go, ‘I’ll borrow bits from all these other systems, put them all together, and then try to learn from that composite system that I’ve invented, even though it’s imaginary.’ That can be a very powerful technique. You may get told, ‘Oh, we haven’t got the money’ or ‘We haven’t got the time to do that’. But to be honest, if there’s no other way of doing effective early analysis, then spend the money and do it early. Many times I’ve seen people go, ‘Oh, we haven’t got time to do that’. They’ve never got time to do it properly, and therefore you end up going around the buoy two or three times: you do it badly, you do it again slightly less badly, you do it a third time and it’s barely adequate, and then you move forward. Well, you’ve wasted an awful lot of time and money, and held up the rest of the project doing that. You’re probably better off spending the money and just getting on with it. Then you’re informed going forwards, before you start to spend serious money elsewhere on the project.

Copyright Statement

Well, that’s it from me. Just one thing to say: Mil-Std-882E came out in 2012; it’s still going strong and unlikely to be replaced anytime soon. All the quotations are from the standard, which is copyright free, but this video is copyright of The Safety Artisan, 2020.

For More …

And you can find a lot more information, a lot more safety videos, at The Safety Artisan page at www.Patreon.com and you can find more resources at www.safetyartisan.com.

End

That is the end of the show. Thank you very much for listening. It just remains for me to say: come and watch some more videos on Mil-Std-882E. There’s going to be a complete course on them, and I hope you’ll get a lot of value out of it. So, until I see you again, cheers.

End: Preliminary Hazard Analysis

You can find a free pdf of the System Safety Engineering Standard, Mil-Std-882E, here.

Categories
Work Health and Safety

Intro to Work Health and Safety

This short video, Intro to Work Health and Safety, looks at Australian legislation that is relevant to system safety. Thus, it is of interest to system, functional and design safety practitioners. It looks at the three classes of ‘upstream’ safety duties of designers, which also apply to manufacturers, importers, suppliers, and those who install or commission plant, substances and structures.

Intro to Work Health and Safety: So What?

Many people think the WHS Act only applies to the management of safety in the workplace. They’re wrong – it does much more than that. In this short presentation, I am going to show you why the WHS Act is relevant to those with ‘upstream’ safety responsibilities such as designers.

Intro to Work Health and Safety: Topics

  • The primary duty of care;
  • Safety duties of designers (Section 22); and
  • Similar duties apply to others, such as:
    • Manufacturers (Section 23);
    • Importers (Section 24);
    • Suppliers (Section 25);
    • Those installing, constructing or commissioning (Section 26);
    • Officers (Section 27); and
    • Workers (Section 28).

Intro to Work Health and Safety: Transcript

Click Here for the Transcript

Hi everyone, and welcome to The Safety Artisan, where you will find professional, pragmatic and impartial instruction on safety, which we hope you enjoy. Today we’re talking about the Work Health and Safety (WHS) Act in Australia, which is surprisingly relevant to what we do, in fact. Let’s see how surprising and relevant it is.

We’re going to look at the WHS Act and its relevance to what we talk about here on The Safety Artisan. And it’s important to answer that question first – the ‘So what?’ test. Many people think that the WHS Act is only applicable to safety in the workplace, so they see it as purely an occupational health and safety piece of legislation.

And it isn’t!

It does do that, but it does so much more as well. In this short presentation, I’m going to show you why the WHS Act is relevant to system safety, functional safety, design safety – whatever we want to call it.

Now, I’m actually looking up information on the Work Health and Safety Act from the Federal Register of Legislation. If we go down to the bottom left-hand side of the screen, we see a little map of Australia with a big red tick on it, and in green it says ‘in force – latest version’. I looked at the website today, the 6th of October, and this is the latest version – which is just to make sure that we’ve got the right version. In Australia, the jurisdiction – which version of the Act is in force where – is complex. I’m not going to talk about that in this short session, but I will in the full video version.

The Primary Duty of Care under the WHS Act

The primary duty of care under the WHS Act is as follows. A Person Conducting a Business or Undertaking – usually abbreviated to PCBU, a horrible, clunky term! – covers you whether you’re running a business or a non-profit, whether you work for the government, or even if you’re self-employed. Whoever you are and whatever you do, if it’s to do with work – being paid for work – then this applies to you.

Those people are responsible for ensuring the health and safety of workers who are engaged or paid by the PCBU, and workers whose activities are influenced or directed by the PCBU while they’re at work. The PCBU must also ensure the health and safety of other people – in the vicinity of the workplace, let’s say, or maybe visitors.

As always, the caveat on this ‘ensuring’ of health and safety is ‘so far as is reasonably practicable’. We’re not going to talk about that in this session – we’ll cover it in the longer session and, in fact, I’ll probably do a session just on how to do ‘so far as is reasonably practicable’, because a lot of people get it wrong. It’s quite a different concept if you’re not used to it.

Designer Duties under the WHS Act

Moving on. We’ve jumped from Section 19 to Section 22, and we’re now talking about the duties of designers. Well, this doesn’t sound like occupational health and safety, does it? We’re looking at the duties of PCBUs who design plant, substances, or structures. We’re talking about industrial plant, not commercial goods – other Acts apply to the stuff that you would buy in a shop. So: industrial plant, chemical substances and the like, and structures – which might be buildings, or ships, floating platforms, aircraft, cars, whatever they might be.

The First WHS Duty of a Designer

So here we have the first duty of a designer – and there are three groups of duties. First of all, the designer has to ensure the health and safety of people in the workplace. If they’re designing plant, or designing or creating a substance or a structure, that is to be used – or might reasonably be expected to be used – at a workplace, this duty applies to them. They’ve got to do whatever it takes to ensure health and safety, so far as is reasonably practicable.

Carrying on from that, we get a bit more detail. The designer has got to ensure, so far as is reasonably practicable, that the plant, substance or structure is designed to be without risks to the health and safety of persons who, at a workplace: use it for the purpose for which it was designed; handle the substance; store the plant or substance; construct the structure; or – and here’s the catch-all – carry out any reasonably foreseeable activity in relation to the plant, substance or structure.

And then if we go on to part (e)(i), we get a long list of stuff: any reasonably foreseeable activity includes manufacture, assembly, use, proper storage, decommissioning, dismantling, disposal, etc. – we ran out of space on the slide there. The bottom line is that the scope of this Act is cradle to grave. From the very first time that we design a plant, substance or structure, right through to its final disposal, the designer has safety responsibilities – thinking about the whole lifecycle of this stuff.

The Second WHS Duty of a Designer

Now we move on to the other two duties that a designer has. In subsection 3, the designer has a duty to carry out testing – that’s what it says in the guide. Actually, if you look at the words in the Act, it says the designer must carry out, or arrange for, calculations, analysis, testing or examination – whatever is necessary for the performance of the duty we just described in subsection 2. You recall subsection 2: cradle to grave, from creation to final disposal. Calculations, analysis, testing or examination might be needed, and the designer has got to carry that out, or arrange it, in order to ensure safety SFARP.

The Third WHS Duty of a Designer

And then our final duty: having done all of that work, having designed this stuff to be safe and done all the calculations and testing, the designer must give adequate information to each person provided with the design. We are not just providing information for the sake of it, or because we felt like it; it is provided for a specific purpose, namely each purpose for which the plant, substance, or structure was designed. So we need all the information associated with its design purpose, and we have got to provide the results of those calculations, analysis, testing, and examination.

And, probably equally crucial from a hazard-analysis point of view, the designer must provide any conditions necessary to ensure that the plant, substance, or structure is without risk to health and safety when it is used for the purpose for which it was designed (or for all the other activities, if we go back to subsection 2).

Subsection 4 does actually say that this applies to subsection 2(a)-(e); we just ran out of space on the page. So the designer has got to provide all the information necessary for people to use this stuff, across the lifecycle of whatever it is, from cradle to grave. Now, if we look at subsection 4(a)-(c), we can see that this is the kind of information we generate from hazard analysis, from safety analysis. So, yes, absolutely: we need system safety in order to meet these duties, to satisfy these duties.

A Consistent Set of Duties Across the Supply Chain

And these duties are not just on designers, because the WHS Act is actually very clever: it applies much the same duties, the three duties that we just heard of (the duty to ensure health and safety, the duty to test and analyse, and the duty to provide information) across the supply chain. If we look at sections 22 through 26, we find that very similar duties apply:

- to designers;
- to manufacturers;
- to importers;
- to suppliers; and
- to those installing, constructing, or commissioning substances and structures.

The duties in these sections are all consistent. Basically, the Act recognises that there is a supply chain, from design right through to installation and commissioning, and everybody in that chain has duties to do their part correctly, to carry out whatever testing they have to, and to pass on information to the next set of stakeholders.

In addition to that, if we looked at section 27, we would see that officers of the PCBU, so company directors and the like, people with major influence who are able to direct operations, have special requirements applying to them. That means senior management and directors of companies, and their equivalents in the public sector. Again, we are going to talk about that in the main video, not in this one. Then, in section 28, workers have duties to comply with reasonable instructions that are intended to keep them, and other workers, safe. So in section 28 you get the kind of thing that you would expect to see in workplace safety.

Copyright and Attribution

So that’s it for the short video. Just to mention that I have shown you information from the Federal Register of Legislation. I’m entitled to do that under the Creative Commons licence, and I’m making the required attribution statement; you can see it in the middle of the screen. For the full information on these terms on copyright and attribution, please go to that page on my website, where you will find full details of the terms and conditions under which this video was created. And if you want to see the full version of the introduction to the WHS Act, which covers a lot more ground than this, then please go to The Safety Artisan page on www.Patreon.com.

That’s the presentation. It just remains for me to say: thanks very much for listening. I look forward to meeting you again. Cheers now.

The Full Version is Here…

If you want more, a wider and deeper view of the WHS Act, then there’s a longer version of this video, which you can get at my Patreon page.

I hope you enjoy it. Well, that’s it for the short video, for now. Please go and have a look at the longer video to get the full picture. OK, everyone, it’s been a pleasure talking to you, and I hope you found that useful. I’ll see you again soon. Goodbye.

The full-length ‘Guide to WHS’ video is here. Back to the WHS Topic Page.