Categories
Mil-Std-882E System Safety

Learn How to Perform System Safety Analysis

In this ‘super post,’ you’re going to Learn How to Perform System Safety Analysis. I’m going to point you to twelve posts that explain each of the ten analysis tasks, the analysis process, and how to combine the tasks into a program!

Follow the links to sample and buy lessons on individual tasks. You can get discount deals on a bundle of three tasks, or all twelve.

Discounts

Click here for a half-price deal on the three essential tasks: Preliminary Hazard Identification, Preliminary Hazard Analysis, and Safety Requirements Hazard Analysis.

Click here for a bumper deal on all twelve lessons:

  • System Safety Process;
  • Design your System Safety Program; and
  • All ten System Safety Analysis tasks.

Introduction

Military Standard 882, or Mil-Std-882 for short, is one of the most widely used system-safety standards. As the name implies, this standard is used on US military systems, but it has found its way, sometimes in disguise, into many other programs around the world. It’s been around for a long time and is now in its fifth incarnation: 882E.

Unfortunately, 882 has also been widely misunderstood and misapplied. This is probably not the fault of the standard and is just another facet of its popularity. The truth is that any standard can be applied blindly – no standard is a substitute for competent decision-making.

In this series of posts, we will: provide awareness of this standard; explain how to use it; and discuss how to manage, tailor, and implement it. Links to each training session and to each section of the standard are provided in the following sections.

Mil-Std-882E Training Sessions

System Safety Process, here

Photo by Bonneval Sebastien on Unsplash

In this full-length (50 minutes) video, you will learn to:

  • Know the system safety process according to Mil-Std-882E;
  • List and order the eight elements;
  • Understand how they are applied;
  • Skilfully apply system safety using realistic processes; and
  • Feel more confident dealing with multiple standards.

In System Safety Process, we look a the general requirements of Mil-Std-882E. We cover the Applicability of the 882E tasks; the General requirements; the Process with eight elements; and the application of process theory to the real world.

Design Your System Safety Analysis Program

Photo by Christina Morillo from Pexels

Learn how to Design a System Safety Program for any system in any application.

Learning Objectives. At the end of this course, you will be able to:

  • Define what a risk analysis program is;
  • List the hazard analysis tasks that make up a program;
  • Select tasks to meet your needs; and
  • Design a tailored risk analysis program for any application.

This lesson is available as part of the twelve-lesson bundle (see the bottom of this post) or you can get it as part of my ‘SSRAP’ course at Udemy here.

Analysis: 200-series Tasks

Preliminary Hazard Identification, Task 201

Identify Hazards.

In this video, we find out how to create a Preliminary Hazard List, the first step in safety assessment. We look at three classic complementary techniques to identify hazards and their pros and cons. This includes all the content from Task 201, and also practical insights from my 25 years of experience with Mil-Std-882.

Preliminary Hazard Analysis, Task 202

See More Clearly.

In this 45-minute session, The Safety Artisan looks at Preliminary Hazard Analysis, or PHA, which is Task 202 in Mil-Std-882E. We explore Task 202’s aim, description, scope, and contracting requirements. We also provide value-adding commentary and explain the issues with PHA – how to do it well and avoid the pitfalls.

System Requirements Hazard Analysis, Task 203

Law, Regulations, Codes of Practice, Guidance, Standards & Recognised Good Practice.

In this 45-minute session, The Safety Artisan looks at Safety Requirements Hazard Analysis, or SRHA, which is Task 203 in the Mil-Std-882E standard. We explore Task 203’s aim, description, scope, and contracting requirements. SRHA is an important and complex task, which needs to be done on several levels to be successful. This video explains the issues and discusses how to perform SRHA well.

Triple bundle Offer

Click here for a half-price deal on the three essential tasks: Preliminary Hazard Identification, Preliminary Hazard Analysis, and Safety Requirements Hazard Analysis.

Sub-system Hazard Analysis, Task 204

Breaking it down to the constituent parts.

In this video lesson, The Safety Artisan looks at Sub-System Hazard Analysis, or SSHA, which is Task 204 in Mil-Std-882E. We explore Task 204’s aim, description, scope, and contracting requirements. We also provide value-adding commentary and explain the issues with SSHA – how to do it well and avoid the pitfalls.

System Hazard Analysis, Task 205

Putting the pieces of the puzzle together.

In this 45-minute session, The Safety Artisan looks at System Hazard Analysis, or SHA, which is Task 205 in Mil-Std-882E. We explore Task 205’s aim, description, scope, and contracting requirements. We also provide value-adding commentary, which explains SHA – how to use it to complement Sub-System Hazard Analysis (SSHA, Task 204) in order to get the maximum benefits for your System Safety Program.

Operating and Support Hazard Analysis, Task 206

Operate it, maintain it, supply it, dispose of it.

In this full-length session, The Safety Artisan looks at Operating & Support Hazard Analysis, or O&SHA, which is Task 206 in Mil-Std-882E. We explore Task 205’s aim, description, scope, and contracting requirements. We also provide value-adding commentary, which explains O&SHA: how to use it with other tasks; how to apply it effectively on different products; and some of the pitfalls to avoid. We refer to other lessons for specific tools and techniques, such as Human Factors analysis methods.

Health Hazard Analysis, Task 207

Hazards to human health are many and various.

In this full-length (55-minute) session, The Safety Artisan looks at Health Hazard Analysis, or HHA, which is Task 207 in Mil-Std-882E. We explore the aim, description, and contracting requirements of this complex Task, which covers: physical, chemical & biological hazards; Hazardous Materials (HAZMAT); ergonomics, aka Human Factors; the Operational Environment; and non/ionizing radiation. We outline how to implement Task 207 in compliance with Australian WHS. 

Functional Hazard Analysis, Task 208

Components where systemic failure dominates random failure.

In this full-length (40-minute) session, The Safety Artisan looks at Functional Hazard Analysis, or FHA, which is Task 208 in Mil-Std-882E. FHA analyses software, complex electronic hardware, and human interactions. We explore the aim, description, and contracting requirements of this Task, and provide extensive commentary on it. 

System-Of-Systems Hazard Analysis, Task 209

Existing systems are often combined to create a new capability.

In this full-length (38-minute) session, The Safety Artisan looks at Systems-of-Systems Hazard Analysis, or SoSHA, which is Task 209 in Mil-Std-882E. SoSHA analyses collections of systems, which are often put together to create a new capability, which is enabled by human brokering between the different systems. We explore the aim, description, and contracting requirements of this Task, and an extended example to illustrate SoSHA. (We refer to other lessons for special techniques for Human Factors analysis.)

Environmental Hazard Analysis, Task 210

Environmental requirements in the USA, UK, and Australia.

This is the full (one hour) session on Environmental Hazard Analysis (EHA), which is Task 210 in Mil-Std-882E. We explore the aim, task description, and contracting requirements of this Task, but this is only half the video. We then look at environmental requirements in the USA, UK, and Australia, before examining how to apply EHA in detail under the Australian/international regime. This uses my practical experience of applying EHA. 

Discounts

Click here for a half-price deal on the three essential tasks: Preliminary Hazard Identification, Preliminary Hazard Analysis, and Safety Requirements Hazard Analysis.

Click here for a bumper deal on all twelve lessons:

  • System Safety Process;
  • Design your System Safety Program; and
  • All ten System Safety Analysis tasks.
Categories
System Safety

FAQ on System Safety

In this FAQ on System Safety, I share some lessons that will explain the basics right through to more advanced topics!

The system safety concept calls for a risk management strategy based on identification, analysis of hazards and application of remedial controls using a systems-based approach.

Harold E. Roland; Brian Moriarty (1990). System Safety Engineering and Management.

In ‘Safety Concepts Part 1’, we look at the meaning of the term “safe”. This fundamental topic provides the foundation for all other safety topics, and it’s simple!

In this 45-minute free video, I discuss System Safety Principles, as set out by the US Federal Aviation Authority in their System Safety Handbook

In System Safety Programs, we learn how to Design a System Safety Program for any system in any application.

The Common System Safety Questions

To see them click here:

is system safety, system safety is, what’s system safety, what is system safety management, what is system safety assessment, what is a system safety program plan, what is safety system of work, [what is safe system of work], what’s system safety, which active safety system, why system safety, system safety faa, system safety management, system safety management plan, system safety mil std, system safety methodology, system safety mil-std-882d, system safety mil-std-882e, system safety program plan, system safety process, system safety ppt system safety principles, system safety perspective, system safety precedence, system safety analysis, system safety analysis handbook, system safety analysis techniques, system safety courses, system safety assessment.

System safety is a specialty within system engineering that supports program risk management. … The goal of System Safety is to optimize safety by the identification of safety related risks, eliminating or controlling them by design and/or procedures, based on acceptable system safety precedence.

FAA System Safety Handbook, Chapter 3: Principles of System Safety
December 30, 2000

If you don’t find what you want in this FAQ on Risk Management, there are plenty more lessons under Start Here and System Safety Analysis topics. Or just enter ‘system safety’ into the search function at the bottom of any page.

Categories
System Safety

Reflections on a Career in Safety, Part 4

In ‘Reflections on a Career in Safety, Part 4’, I want to talk about Consultancy, which is mostly what I’ve been doing for the last 20 years!

Consultancy

As I said near the beginning, I thought that in the software supportability team, we all wore the same uniform as our customers. We didn’t cost them anything. We were free. We could turn up and do a job. You would think that would be an easy sell, wouldn’t you?

Not a bit of it.  People want there to be an exchange of tokens. If we’re talking about psychology, if something doesn’t cost them anything, they think, well, it can’t be worth anything. So [how much] we pay for something really does affect our perception of whether it’s any good.

Photo by Cytonn Photography on Unsplash

So I had to go and learn a lot of sales and marketing type stuff in order to sell the benefits of bringing us in, because, of course, there was always an overhead of bringing new people into a program, particularly if they were going to start asking awkward questions, like how are we going to support this in service? How are we going to fix this? How is this going to work?

So I had to learn a whole new language and a whole new way of doing business and going out to customers and saying, we can help you, we can help you get a better result. Let’s do this. So that was something new to learn. We certainly didn’t talk about that at university.  Maybe you do more business focussed stuff these days. You can go and do a module, I don’t know, in management or whatever; very, very useful stuff, actually. It’s always good to be able to articulate the benefits of doing something because you’ve got to convince people to pay for it and make room for it.

Doing Too Little, or Too Much

And in safety, I’ve got two [kinds of] jobs.

First of all, I suppose it’s the obvious one. Sometimes you go and see a client, they’re not aware of what the law says they’re supposed to do or they’re not aware that there’s a standard or a regulation that says they’ve got to do something – so they’re not doing it. Maybe I go along and say, ah, look, you’ve got to do this. It’s the law. This is what we need to do.

Photo by Quino Al on Unsplash

Then, there’s a negotiation because the customer says, oh, you consultants, you’re just making up work so you can make more money. So you’ve got to be able to show people that there’s a benefit, even if it’s only not going to jail. There’s got to be a benefit. So you help the clients to do more in order to achieve success.

You Need to Do Less!

But actually, I spend just as much time advising clients to do less, because I see lots of clients doing things that appear good and sensible. Yes, they’re done with all the right motivation. But you look at what they’re doing and you say, well, this you’re spending all this money and time, but it’s not actually making a difference to the safety of the product or the process or whatever it is.

You’re chucking money away really, for very little or no effect.  Sometimes people are doing work that actually obscures safety. They dive into all this detail and go, well, actually, you’ve created all this data that’s got to be managed and that’s actually distracting you from this thing over here, which is the thing that’s really going to hurt people.

So, [often] I spend my time helping people to focus on what’s important and dump the comfort blanket, OK, because lots of times people are doing stuff because they’ve always done it that way, or it feels comforting to do something. And it’s really quite threatening to them to say, well, actually, you think you’re doing yourself a favor here, but it doesn’t actually work. And that’s quite a tough sell as well, getting people to do less.

Photo by Prateek Katyal on Unsplash

However, sometimes less is definitely more in terms of getting results.

Part 5 will follow next week!

New to System Safety? Then start here. There’s more about The Safety Artisan here. Subscribe for free regular emails here.

Categories
System Safety

Reflections on a Career in Safety, Part 3

In ‘Reflections on a Career in Safety, Part 3’ I continue talking about different kinds of Safety, moving onto…

Projects and Products

Then moving on to the project side, where teams of people were making sure a new aeroplane, a new radio, a new whatever it might be, was going to work in service; people were going to be able to use it, easily, support it, get it replaced or repaired if they had to. So it was a much more technical job – so lots of software, lots of people, lots of process and more people.

Moving to the software team was a big shock to me. It was accidental. It wasn’t a career move that I had chosen, but I enjoyed it when I got there.  For everything else in the Air Force, there was a rule. There was a process for doing this. There were rules for doing that. Everything was nailed down. When I went to the software team, I discovered there are no rules in software, there are only opinions.

The ‘H’ is software development is for ‘Happiness’

So straight away, it became a very people-focused job because if you didn’t know what you were doing, then you were a bit stuck.  I had to go through a learning curve, along with every other technician who was on the team. And the thing about software with it being intangible is that it becomes all about the process. If a physical piece of kit like the display screen isn’t working, it’s pretty obvious. It’s black, it’s blank, nothing is happening. It’s not always obvious that you’ve done something wrong with software when you’re developing it.

So we were very heavily reliant on process; again, people have got to decide what’s the right process for this job? What are we going to do? Who’s going to do it? Who’s able to do it? And it was interesting to suddenly move into this world where there were no rules and where there were some prima donnas.

Photo by Sandy Millar on Unsplash

We had a handful of really good programmers who could do just about anything with the aeroplane, and you had to make the best use of them without letting them get out of control.  Equally, you had people on the other end of the scale who’d been posted into the software team, who really did not want to be there. They wanted to get their hands dirty, fixing aeroplanes. That’s what they wanted to do. Interesting times.

From the software team, I moved on to big projects like Eurofighter, that’s when I got introduced to:

Systems Engineering

And I have no problem with plugging systems engineering because as a safety engineer, I know [that] if there is good systems engineering and good project management, I know my job is going to be so much easier. I’ve turned up on a number of projects as a consultant or whatever, and I say, OK, where’s the safety plan? And they say, oh, we want you to write it. OK, yeah, I can do that. Whereas the project management plan or where’s the systems engineering management plan?

If there isn’t one or it’s garbage – as it sometimes is – I’m sat there going, OK, my just my job just got ten times harder, because safety is an emergent property. So you can say a piece of kit is on or off. You can say it’s reliable, but you can’t tell whether it’s safe until you understand the context. What are you asking it to do in what environment? So unless you have something to give you that wider and bigger picture and put some discipline on the complexity, it’s very hard to get a good result.

Photo by Sam Moqadam on Unsplash

So systems engineering is absolutely key, and I’m always glad to work with the good systems engineer and all the artifacts that they’ve produced. That’s very important. So clarity in your documentation is very helpful. Being [involved], if you’re lucky, at the very beginning of a program, you’ve got an opportunity to design safety, and all the other qualities you want, into your product. You’ve got an opportunity to design in that stuff from the beginning and make sure it’s there, right there in the requirements.

Also, systems engineers doing the requirements, working out what needs to be done, what you need the product to do, and just as importantly, what you need it not to do, and then passing that on down the chain. That’s very important. And I put in the title “managing at a distance” because, unlike in the operations world where you can say “that’s broken, can you please go and fix it”.

Managing at a Distance

It’s not as direct as that.  You’re looking at your process, you’re looking at the documentation, you’re working with, again, lots and lots of people, not all of whom have the same motivation that you do.

Photo by Bonneval Sebastien on Unsplash

Industry wants to get paid. They want to do the minimum work to get paid, [in order] to maximize their profit. You want the best product you can get. The pilots want something that punches holes in the sky and looks flash and they don’t really care much about much else, because they’re quite inoculated to risk.

So you’ve got people with competing motivations and everything has got to be worked indirectly. You don’t get to control things directly. You’ve got to try and influence and put good things in place, in almost an act of faith that, [you put] good things in place and good things will result.  A good process will produce a good product. And most of the time that’s true. So (my last slide on work), I ended up doing consultancy, first internally and then externally.

Part 4 will follow next week!

New to System Safety? Then start here. There’s more about The Safety Artisan here. Subscribe for free regular emails here.

Categories
System Safety

Reflections on a Career in Safety, Part 2

In ‘Reflections on a Career in Safety, Part 2’ I move on to …

Different Kinds of Safety

So I’m going to talk a little bit about highlights, that I hope you’ll find useful.  I went straight from university into the Air Force and went from this kind of [academic] environment to heavy metal, basically.  I guess it’s obvious that wherever you are if you’re doing anything in industry, workplace health and safety is important because you can hurt people quite quickly. 

Workplace Health and Safety

In my very first job, we had people doing welding, high voltage electrics, heavy mechanical things; all [the equipment was] built out of centimeter-thick steel. It was tough stuff and people still managed to bend it. So the amount of energy that was rocking around there, you could very easily hurt people.  Even the painters – that sounds like a safe job, doesn’t it? – but aircraft paint at that time a cyanoacrylate. It was a compound of cyanide that we used to paint aeroplanes with.

All the painters and finishers had to wear head-to-toe protective equipment and breathing apparatus. If you’re giving people air to breathe, if you get that wrong, you can hurt people quite quickly. So even managing the hazards of the workplace introduced further hazards that all had to be very carefully controlled.

Photo by Ömer Yıldız on Unsplash

And because you’re in operations, all the decisions about what kind of risks and hazards you’re going to face, they’ve already been made long before.  Decisions that were made years ago, when a new plane or ship or whatever it was, was being bought and being introduced [into service]. Decisions made back then, sometimes without realizing it, meant that we were faced with handling certain hazards and you couldn’t get rid of them. You just had to manage them as best you could.

Overall, I think we did pretty well. Injuries were rare, despite the very exciting things that we were dealing with sometimes.  We didn’t have too many near misses – not that we heard about anyway. Nevertheless, that [risk] was always there in the background. You’re always trying to control these things and stop them from getting out of control.

One of the things about a workplace in operations and support, whether you’re running a fleet of aeroplanes or you’re servicing some kit for somebody else and then returning it to them, it tends to be quite a people-centric job. So, large groups of people doing the job, supervision, organization, all that kind of stuff.  And that can all seem very mundane, a lot of HR-type stuff. But it’s important and it’s got to be dealt with.

So the real world of managing people is a lot of logistics. Making sure that everybody you need is available to do the work, making sure that they’ve got all the kit, all the technical publications that tell them what to do, the information that they need.  It’s very different to university – a lot of seemingly mundane stuff – but it’s got to be got right because the consequences of stuffing up can be quite serious.

Safe Systems of Work

So moving on to some slightly different topics, when I got onto working with Aeroplanes, there was an emphasis on a safe system of work, because doing maintenance on a very complex aeroplane was quite an involved process and it had to be carefully controlled. So we would have what’s usually referred to as a Permit to Work system where you very tightly control what people are allowed to do to any particular plane. It doesn’t matter whether it’s a plane or a big piece of mining equipment or you’re sending people in to do maintenance on infrastructure; whatever it might be, you’ve got to make sure that the power is disconnected before people start pulling it apart, et cetera, et cetera.

Photo by Leon Dewiwje on Unsplash

And then when you put it back together again, you’ve got to make sure that there aren’t any bits leftover and everything works before you hand it back to the operators because they’re going to go and do some crazy stuff with it. You want to make sure that the plane works properly. So there was an awful lot of process in that. And in those days, it was a paperwork process. These days, I guess a lot would be computerized, but it’s still the same process.

If you muck up the process, it doesn’t matter whether [it is paper-based or not].  If you’ve got a rubbish process, you’re going to get rubbish results and it [computerization] doesn’t change that. You just stuff up more quickly because you’ve got a more powerful tool. And for certain things we had to take, I’ve called it special measures. In my case, we were a strike squadron, which meant our planes would carry nuclear weapons if they had to.

Special Processes for Special Risks

So if the Soviets charged across the border with 20,000 tanks and we couldn’t stop them, then it was time to use – we called them buckets of sunshine. Sounds nice, doesn’t it? Anyway, so there were some fairly particular processes and rules for looking after buckets of sunshine. And I’m glad to say we only ever used dummies. But when you when the convoy arrived and yours truly has to sign for the weapon and then the team starts loading it, then that does concentrate your mind as an engineer. I think I was twenty-two, twenty-three at the time.  

Photo by Oscar Ävalos on Unsplash

Somebody on [our Air Force] station stuffed up on the paperwork and got caught. So that was two careers of people my age, who I knew, that were destroyed straight away, just by not being too careful about what they were doing. So, yeah, that does concentrate the mind.  If you’re dealing with, let’s say you’re in a major hazard facility, you’re in a chemical plant where you’ve got perhaps thousands of tonnes of dangerous chemicals, there are some very special risk controls, which you have to make sure are going to work most of the time.

And finally, there is ‘airworthiness’: decisions about whether we could fly an aeroplane, even though some bits of it were not working. So that was a decision that I got to make once I got signed off to do it. But it’s a team job. You talk to the specialists who say, this bit of the aeroplane isn’t working, but it doesn’t matter as long as you don’t do “that”.

Photo by Eric Bruton on Unsplash

So you have to make sure that the pilots knew, OK, this isn’t working.  This is the practical effect from your [operator’s] point of view. So you don’t switch this thing on or rely on this thing working because it isn’t going to work. There were various decisions about [airworthiness] that were an exciting part of the job, which I really enjoyed.  That’s when you had to understand what you were doing, not on your own, because there were people who’d been there a lot longer than me.  But we had to make things work as best we could – that was life.

Part 3 will follow next week!

New to System Safety? Then start here. There’s more about The Safety Artisan here. Subscribe for free regular emails here.

Categories
System Safety

Reflections on a Career in Safety, Part 1

This is Part 1 of my ‘Reflections on a Career in Safety’, from “Safety for Systems Engineering and Industry Practice”, a lecture that I gave to the University of Adelaide in May 2021. My thanks to Dr. Kim Harvey for inviting me to do this and setting it up.

The Lecture, Part 1

Hi, everyone, my name Simon Di Nucci and I’m an engineer, I actually – it sounds cheesy – but I got into safety by accident. We’ll talk about that later. I was asked to talk a little bit about career stuff, some reflections on quite a long career in safety, engineering, and other things, and then some stuff that hopefully you will find interesting and useful about safety work in industry and working for government.

Context: my Career Summary

I’ve got three areas to talk about, operations and support, projects and product development, and consulting.

I have been on some very big projects, Eurofighter, Future Submarine Programme, and some others that have been huge multi-billion-dollar programs, but also some quite small ones as well. They’re just as interesting, sometimes more so. In the last few years, I’ve been working in consultancy. I have some reflections on those topics and some brief reflections on a career in safety.

Starting Out in the Air Force

So a little bit about my career to give you some context. I did 20 years in the Royal Air Force in the U.K., as you can tell from my accent, I’m not from around here. I started off fresh out of university, with a first degree in aerospace systems engineering. And then after my Air Force training, my first job was as an engineering manager on ground support equipment: in General Engineering Flight, it was called.

We had people looking after the electrical and hydraulic power rigs that the aircraft needed to be maintained on the ground. And we had painters and finishers and a couple of carpenters and a fabric worker and some metal workers and welders, that kind of stuff. So I went from a university where we were learning about all this high-tech stuff about what was yet to come in the aerospace industry. It was a bit of the opposite end to go to, a lot of heavy mechanical engineering that was quite simple.

And then after that, we had a bit of excitement because six weeks after I started, in my very first job, the Iraqis invaded Kuwait.  I didn’t go off to war, thank goodness, but some of my people did. We all got ready for that: a bit of excitement.

Photo by Jacek Dylag on Unsplash

After that, I did a couple of years on a squadron, on the front line. We were maintaining and fixing the aeroplanes and looking after operations. And then from there, I went for a complete change. Actually, I did three years on a software maintenance team and that was a very different job, which I’ll talk about later. I had the choice of two unpleasant postings that I really did not want, or I could go to the software maintenance team.

Into Software by accident as well!

I discovered a burning passion to do software to avoid going to these other places. And that’s how I ended up there. I had three, fantastic years there and really enjoyed that. Then, I was thinking of going somewhere down south to be in the UK, to be near family, but we went further north. That’s the way things happen in the military.

I got taken on as the rather grandly titled Systems and Software Specialist Officer on the Typhoon Field Team. The Eurofighter Typhoon wasn’t in service at that point. (That didn’t come in until 2003 when I was in my last Air Force job, actually.)  We had a big team of handpicked people who were there to try and make sure that the aircraft was supportable when it came into service.

One of the big things about the new aircraft was it had tons of software on board.  There were five million lines of code on board, which was a lot at the time, and a vast amount of data. It was a data hog; it ate vast amounts of data and it produced vast amounts of data and that all needed to be managed. It was on a scale beyond anything we’d seen before. So it was a big shock to the Air Force.

More Full-time Study

Photo by Mike from Pexels

Then after that, I was very fortunate.  (This is a picture of York, with the minister in the background.) I spent a year full-time doing the safety-critical systems engineering course at York, which was excellent.  It was a privilege to be able to have a year to do that full-time. I’ve watched a lot of people study part-time when they’ve got a job and a family, and it’s really tough. So I was very, very pleased that I got to do that.

After that, I went to do another software job where this time we were in a small team and we were trying to drive software supportability into new projects coming into service – all kinds of stuff, mainly aircraft, but other things as well. That was almost like an internal consultancy job. The only difference was we were free, which you would think would make it easier to sell our services. But the opposite was the case.

Finally, in my last Air Force job, I was part of the engineering authority looking after the Typhoon aircraft as it came into service, which is always a fun time. We had just got the plane into service when one of the boxes that I was responsible for malfunctioned, and the undercarriage refused to come down on the plane, which is not what you want. It did get down safely in the end, but then the whole fleet was grounded and we had to fix the problem. So some more excitement there – not always of the kind that you want, but there we go. That took me up to 2006.

At that point, I transitioned out of the Air Force and I became a consultant.

So, I had always regarded consultants with a bit of suspicion up until then, and now I am one. I started off with a firm called QinetiQ, which is also over here, and I was doing safety mainly with the aviation team. But again, we did all sorts: vehicles, ships, network logistics stuff, all kinds of things. And then in 2012, I joined Frazer-Nash in order to come to Australia.

So we arrived in Australia in November 2012, and we’ve been here in Adelaide almost all that time. And you can’t get rid of us now because we’re citizens, so you’re stuck with us. But it’s been lovely. We love Adelaide and really enjoy, again, the varied work here.

Adelaide CBD, photo by Simon Di Nucci

Part 2 will follow next week!

New to System Safety? Then start here. There’s more about The Safety Artisan here. Subscribe for free regular emails here.


How to Understand Safety Standards

Learn How to Understand Safety Standards with this FREE session from The Safety Artisan.

In this module, Understanding Your Standard, we’re going to ask the question: Am I Doing the Right Thing, and am I Doing it Right? Standards are commonly used for many reasons. We need to understand our chosen system safety engineering standard in order to know: the concepts upon which it is based; what it was designed to do, why, and for whom; which kinds of risk it addresses; what kinds of evidence it produces; and its advantages and disadvantages.

Understand Safety Standards: You’ll Learn to

  • List the hazard analysis tasks that make up a program; and
  • Describe the key attributes of Mil-Std-882E. 
Understanding Your Standard

Topics:  Understand Safety Standards

Aim: Am I Doing the Right Thing, and am I Doing it Right?

  • Standards: What and Why?
  • System Safety Engineering pedigree;
  • Advantages – systematic, comprehensive, etc.; and
  • Disadvantages – cost/schedule, complexity & quantity not quality.

Transcript: Understand Safety Standards

Click here for the Transcript on Understanding Safety Standards

In Module Three, we’re going to understand our Standard. The standard is the thing that we’re going to use to achieve things – the tool. And that’s important because tools designed to do certain things usually perform well. But they don’t always perform well on other things. So we’re going to ask ‘Are we doing the right thing?’ And ‘Are we doing it right?’

What and Why?

So, what are we going to do, and why are we doing it? First of all, the use of standards in safety is very common for lots of reasons. It helps us to have confidence that what we’re doing is good enough. We’ve met a standard of performance in the absolute sense. It helps us to say, ‘We’ve achieved standardization or commonality in what we’re doing’. And we can also use it to help us achieve a compromise. That can be a compromise across different stakeholders or across different organizations. And standardization gives us some of the other benefits as well. If we’re all doing the same thing rather than we’re all doing different things, it makes it easier to train staff. This is one example of how a standard helps.

However, we need to understand this tool that we’re going to use. What it does, what it’s designed to do, and what it is not designed to do. That’s important for any standard or any tool. In safety, it’s particularly important because safety is in many respects intangible. This is because we’re always looking to prevent a future problem from occurring. In the present, it’s a little bit abstract. It’s a bit intangible. So, we need to make sure that in concept what we’re doing makes sense and is coherent. That it works together. If we look at those five bullet points there, we need to understand the concept of each standard. We need to understand the basis of each one.

And they’re not all based on the same concept; thus some of them are contradictory or incompatible. We need to understand the design of the standard: what the standard does, what the aim of the standard is, why it came into existence, and who brought it into existence. To do what, for whom – who’s the ultimate customer here?

And for risk analysis standards, we need to understand what kind of risks it addresses. Because the way you treat a financial risk might be very different from a safety risk. In the world of finance, you might have a portfolio of products, like loans. These products might have some risks associated with them. One or two loans might go bad and you might lose money on those. But as long as the whole portfolio is making money, that might be acceptable to you. You might say, ‘I’m not worried that 10% of my loans have gone south and all gone wrong. I’m still making plenty of profit out of the other 90%’. It doesn’t work that way with safety. You can’t say, ‘It’s OK that I’ve killed a few people over here because all this lot over here are still alive!’. It doesn’t work like that!

Also, what kind of evidence does the standard produce? Because in safety, we are very often working in a legal framework that requires us to do certain things. It requires us to achieve a certain level of safety and prove that we have done so. So, we need certain kinds of evidence. In different jurisdictions and different industries, some evidence is acceptable and some is not. You need to know which is which for your area.

And then finally, let’s think about the pros and cons of the standard: what does it do well, and what does it do less well?

System Safety Pedigree

We’re going to look at a standard called Military Standard 882E. Many decades ago, this standard was created by the US government and military to help them bring complex, cutting-edge military equipment into service. Equipment that was always on the cutting edge, that pushed the limits of what you could achieve in performance.

That’s a lot of complexity – lots of critical weapon systems, and so forth. And they needed something that could cope with all that complexity. It’s a system safety engineering standard. It’s used by engineers, but also by many other specialists. As I said, it’s got a background in military systems, but these days you find these principles used pretty much everywhere. So, the approaches to System Safety that 882 introduced now appear in other standards and in other countries.

It addresses risks to people, equipment, and the environment, as we heard earlier. And because it’s an American system safety standard, it’s very much about identifying requirements: what needs to happen for us to get safety? To do that, it produces lots of requirements. It performs analyses on all those requirements and generates further requirements, and it produces requirements for test evidence. We then need to fulfill those requirements. It’s got several important advantages and disadvantages, which we’re going to discuss in the next few slides.

Comprehensive Analysis

Before we get to that, we need to look at the key feature of this standard. The strengths and weaknesses of this standard come from its comprehensive analysis. And the chart (see the slide) is meant to show how we are looking at the system from lots of different perspectives. (It’s not meant to be some arcane religious symbol!) So, we’re looking at a system from 10 different perspectives, in 10 different ways.

Going around clockwise, we’ve got these ten different hazard analysis tasks. First of all, we start off with preliminary hazard identification, then preliminary hazard analysis. We do some system requirements hazard analysis; that is, we identify the safety requirements that the system is going to meet so that we are safe. We look at subsystem and system hazard analysis. At operating and support hazard analysis – people working with the system. Number seven, we look at health hazard analysis – can the system cause health problems for people? Functional hazard analysis, which is all about what the system does. We’re thinking of software and data-driven functionality: maybe there’s no physical system, but it does stuff; it delivers benefits or risks. System-of-systems hazard analysis – we could have lots of different and/or complex systems interacting. And then finally, the tenth one – environmental hazard analysis.

If we use all these perspectives to examine the system, we get a comprehensive analysis of the system. From this analysis, we should be confident that we have identified everything we need to. All the hazards and all the safety requirements that we need to identify. Then we can confidently deliver an appropriate safe system. We can do this even if the system is extremely complex. The standard is designed to deal with big, complex cutting-edge systems.
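As an aside, the ten perspectives described above map onto the ten 200-series hazard analysis tasks of Mil-Std-882E. Here is a minimal sketch of that task list; the task numbers and titles are my reading of the standard, so treat them as something to verify against your own copy of 882E:

```python
# The ten 200-series hazard analysis tasks of Mil-Std-882E, in the
# order the transcript walks through them. Numbers/titles are my
# reading of the standard - verify against your own copy.
HAZARD_ANALYSIS_TASKS = {
    201: "Preliminary Hazard List",
    202: "Preliminary Hazard Analysis",
    203: "System Requirements Hazard Analysis",
    204: "Subsystem Hazard Analysis",
    205: "System Hazard Analysis",
    206: "Operating and Support Hazard Analysis",
    207: "Health Hazard Analysis",
    208: "Functional Hazard Analysis",
    209: "System-of-Systems Hazard Analysis",
    210: "Environmental Hazard Analysis",
}

for number, title in HAZARD_ANALYSIS_TASKS.items():
    print(f"Task {number}: {title}")
```

A tailored program picks a subset of these; the ‘three essential tasks’ bundle mentioned at the top of this post corresponds to the first three.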

Advantages #1

In fact, as we move on to advantages, that’s the number one advantage of this standard. If we use it and we use all 10 of those tasks, we can cope with the largest and the most demanding programs. I spent much of my career working on the Eurofighter Typhoon. It was a multi-billion-dollar program on which four different nations worked together. We used a derivative of Mil. Standard 882 to look at safety and analyze it. And it coped; it was powerful enough to deal with that gigantic program. I spent 13 years of my life, on and off, on that program, so I’d like to think that I know my stuff when we’re talking about this.

As we’ve already said, it’s a systematic approach to safety – system safety engineering. And we can start very early, with early requirements discovery. We don’t even need a design; we know that we have a need, so we can think about those needs and analyze them.

And it can cover us right through until final disposal. And it covers all kinds of elements that you might find in a system. Remember our definition of ‘system’? It’s something that consists of hardware, software, data, human beings, etc. The standard can cope with all the elements of a system. In fact, it’s designed into the standard. It was specifically designed to look at all those different elements. Then to get different insights from those elements. It’s designed to get that comprehensive coverage. It’s really good at what it does. And it involves, not just engineers, but people from all kinds of other disciplines. Including operators, maintainers, etc, etc.

I came from a maintenance background. I was either directly or indirectly supporting operators. I was responsible for trying to help them get the best out of their system. Again, that’s a very familiar world to me. And rigorous standards like this can help us to think rigorously about what we’re doing. And so get results even in the presence of great complexity, which is not always a given, I must say.

So, we can be confident by applying the standard. We know that we’re going to get a comprehensive and thorough analysis. This assures us that what we’re doing is good.

Advantages #2

So, there’s another set of advantages. I’ve already mentioned that we get assurance. Assurance is ‘justified confidence’. So we can have high confidence that all reasonably foreseeable hazards will be identified and analyzed. And if you’re in a legal jurisdiction where you are required to hit a target, this is going to help you hit that target.

The standard was also designed for use in contracts. It’s designed to be applied to big programs – that is, programs developing complex, high-performance systems. So, there are a lot of risks, and it’s designed to cope with those risks.

Finally, the standard also includes requirements for contracting, for interfaces with other systems, for interfaces with systems engineering. This is very important for a variety of disciplines. It’s important for other engineering and technical disciplines. It’s important for non-technical disciplines and for analysis and recordkeeping. Again, all these things are important, whether it is for legal reasons or not. We need to do recordkeeping. We need to liaise with other people and consult with them. There are legal requirements for that in many countries. This standard is going to help us do all those things.

But, of course, in a standard everything has pros and cons and Mil. Standard 882 is no exception. So, let’s look at some of the disadvantages.

Disadvantages #1

First of all, a full system safety program might be overkill for the system that you want to use, or that you want to analyze.  The Cold War, thank goodness, is over; generally speaking, we’re not in the business of developing cutting-edge high-performance killing machines that cost billions and billions of dollars and are very, very risky. These days, we tend to reduce program risk and cost by using off-the-shelf stuff and modifying it. Whether that be for military systems, infrastructure in the chemical industry, transportation, whatever it might be. Very much these days we have a family of products and we reuse them in different ways. We mix and match to get the results that we want.

And of course, all this comprehensive analysis is not cheap and it’s not quick. It may be that you’ve got a program that is schedule-constrained. Or you want to constrain the cost and you cannot afford the time and money to throw a full 882 program at it. So, that’s a disadvantage.

The second family of problems is that these kinds of safety standards have often been applied prescriptively. The customer would often say, ‘Go away and do this. I’m going to tell you what to do based on what I think reduces my risk’ – or at least covers their backside. So, contractors got used to being told to do certain things by purchasers and customers. The customers didn’t understand the standards that they were applying and insisting upon; they did not understand how to tailor a safety standard to get the result that they wanted. So they asked for dumb things, or things that didn’t add value. And the contractors got used to working in that kind of environment. They got used to being told what to do and doing it, because they wouldn’t get paid if they didn’t. So, you can’t really blame them.

But that’s not great, OK? That can result in poor behaviors. You can waste a lot of time and money doing stuff that doesn’t actually add value. And everybody recognizes that it doesn’t add value. So you end up bringing the whole safety program into disrepute and people treat it cynically. They treat it as a box-ticking exercise. They don’t apply creativity and imagination to it. Much less determination and persistence. And that’s what you need for a good effective system safety program. You need creativity. You need imagination. You need people to be persistent and dedicated to doing a good job. You need that rigor so that you can have the confidence that you’re doing a good job because it’s intangible.

Disadvantages #2

Let’s move on to the second family of disadvantages. And this is the one that I’ve seen the most, actually, in the real world. If you do all 10 tasks – and even if you don’t do all 10 – you can create too many hazards. If you recall the graphic from earlier, we have 10 tasks, and each task looks at the system from a different angle. What you can get is lots and lots of duplication in hazard identification. You can have essentially the same hazards identified over and over again in each task. And that’s a problem in two ways.

First of all, quality suffers. We end up with a fragmented picture of hazards. We end up with lots and lots of hazards in the hazard log, but not only that: we get fragments of hazards rather than the real thing. Remember those tests I mentioned for what a hazard really is? Very often you can get causes masquerading as hazards, or other things that are exacerbating factors – things that make matters worse. They’re not hazards in their own right, but they get recorded as hazards. And that problem results in people being unable to see the big picture of risk, so it undermines what we’re trying to do. And as I say, we get lots of things misidentified and thrown into the pot. This also distracts people: you end up putting effort into managing things that don’t make a difference to safety and don’t need to be managed. Those are the quality problems.

And then there are quantity problems. And from personal experience, having too many hazards is a problem in itself.  I’ve worked on large programs where we were managing 250 hazards or thereabouts. That is challenging even with a sizable, dedicated team. That is a lot of work in trying to manage that number of hazards effectively. And there’s always the danger that it will slide into becoming a box-ticking exercise. Superficial at best.

I’ve also seen projects that have two and a half thousand hazards, or even 4,000 hazards, in the hazard log. Now, once you get up to that level, that is completely unmanageable. People who have thousands of hazards in a hazard log and think they’re managing safety are kidding themselves. They don’t understand what safety is if they think that’s going to work. So, you end up with all these items in your hazard log, which become a massive administrative burden. People end up taking shortcuts and the real hazards are lost. The real issues that you want to focus on are lost in a sea of detail that nobody will ever understand, and you won’t be able to control them.

Unfortunately, Mil. Standard 882 is good at generating these grotesque numbers of hazards. If you don’t know how to use the standard and don’t actively manage this issue, it gets to this stage. It can, and does, go badly wrong. This is particularly true on very big programs, and you really need clarity on big projects.
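To make the duplication problem concrete, here is a minimal, hypothetical sketch (the entry format and the crude matching rule are invented for illustration, not taken from the standard) of flagging hazard-log entries that several analysis tasks have recorded in near-identical words:

```python
from collections import defaultdict

def normalise(text):
    """Crude canonical form: lowercase, alphanumeric characters only."""
    return " ".join("".join(ch for ch in word if ch.isalnum())
                    for word in text.lower().split())

def find_duplicates(hazard_log):
    """Group (task, hazard_id, description) entries whose
    descriptions normalise to the same string."""
    groups = defaultdict(list)
    for task, hazard_id, description in hazard_log:
        groups[normalise(description)].append((task, hazard_id))
    return {desc: ids for desc, ids in groups.items() if len(ids) > 1}

# Illustrative entries only - not from any real hazard log.
log = [
    ("PHA",   "H-001", "Loss of hydraulic pressure."),
    ("SSHA",  "H-042", "loss of hydraulic pressure"),
    ("O&SHA", "H-107", "Maintainer falls from access ladder."),
]
```

Calling `find_duplicates(log)` groups the first two entries together, flagging them for a reviewer to merge. A real hazard log would need fuzzier matching and human judgment, but the principle – actively hunting for duplicates across tasks – is the point.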

Summary of Module

Let’s summarize what we’ve done in this module. The aim was to help us understand whether we’re doing the right thing and whether we’re doing it right. And standards are terrific for helping us to do that. They help us to ensure we’re doing the right thing – that we’re looking at the right things. And they help us to ensure that we’re doing it rigorously and repeatably – all the good quality things that we want. And the Mil. Standard 882E that we’re looking at is a system safety engineering standard, so it’s designed to deal with complexity, high performance, and high risk. And it’s got a great pedigree; it’s been around for a long time.

Now that gives advantages. So, we have a system safety program with this standard that helps us to deal with complexity. That can cope with big programs, with lots of risks. That’s great.

The disadvantages of this standard are that if we don’t know how to tailor or manage it properly, it can cost a lot of money and take a lot of time to give results, which can cause problems for the program. Ultimately, if the analysis doesn’t deliver on time, safety can end up being accidentally ignored. And it can generate complexity – a quantity of data so great that it actually undermines the quality of the data and what we’re trying to achieve, in that we get a fragmented picture in which we can’t see the true risks, and so we can’t manage them effectively. If we get it wrong with this standard, we can get it really wrong. And that brings us to the end of this module.

This is Module 3 of SSRAP

This is Module 3 from the System Safety Risk Assessment Program (SSRAP) Course. Risk Analysis Programs – Design a System Safety Program for any system in any application. You can access the full course here.

You can find more introductory lessons at Start Here.


Introduction to Human Factors

In this 40-minute video, ‘Introduction to Human Factors’, I am very pleased to welcome Peter Benda to The Safety Artisan.

Peter is a colleague and Human Factors specialist who has 23 years’ experience in applying Human Factors to large projects in all kinds of domains. In this session we look at some fundamentals: What does Human Factors engineering aim to achieve? Why do it? And what sort of tools and techniques are useful? As this is The Safety Artisan, we also discuss some real-world examples of how erroneous human actions can contribute to accidents, and how the Human Factors discipline can help to prevent them.

Topics: Introduction to Human Factors

  • Introducing Peter;
  • The Joint Optimization Of Human-Machine Systems;
  • So why do it (HF)?
  • Introduction to Human Factors;
  • Definitions of Human Factors;
  • The Long Arm of Human Factors;
  • What is Human Factors Integration? and
  • More HF sessions to come…

Transcript: Introduction to Human Factors

Click Here for the Transcript

Introduction

Simon:  Hello, everyone, and welcome to the Safety Artisan: Home of Safety Engineering Training. I’m Simon and I’m your host, as always, but today we are going to be joined by a guest: a Human Factors specialist, a colleague, and a friend of mine called Peter Benda. Now, Peter started as one of us, an ordinary engineer, but, unusually perhaps for an engineer, he decided he didn’t like engineering without people in it. He liked the social and human aspects, and so he began to specialize in that area. And today, after twenty-three years in the business, with a first degree and a master’s degree in engineering with a Human Factors speciality, he’s going to join us and share his expertise with us.

So that’s how you got into it then, Peter. For those of us who aren’t really familiar with Human Factors, how would you describe it to a beginner?

Peter:   Well, I would say it’s the joint optimization of human-machine systems. So it’s really focusing on designing systems holistically – perhaps that would be a term that could be used – where we’re looking at optimizing the human element as well as the machine element, and the interaction between the two. So that’s really the key to Human Factors. And, of course, there are many dimensions from there: environmental, organizational, job factors, human and individual characteristics. All of these influence behaviour at work and health and safety. Another way to think about it is the application of scientific information concerning humans to the design of systems – systems for human use, which I think most systems are.

Simon:  Indeed. Otherwise, why would humans build them?

Peter:   That’s right. Generally speaking, sure.

Simon:  So, given that this is a thing that people do then. Perhaps we’re not so good at including the human unless we think about it specifically?

Peter:   I think that’s fairly accurate. I would say that if you look across industries, the industries that are better at integrating Human Factors considerations into the design lifecycle have had to do so because of the accidents that have occurred in the past. You could probably say this about safety engineering as well, right?

Simon:  And this is true, yes.

Peter:   In a sense, you do it because you have to, because the implications of not doing it are quite significant. However, I would say – if you look at some of the evidence, and you see this also across software design and non-safety-critical industries or systems – that taking human considerations into account early in the design process typically results in better system performance. You might have more usable systems, for example. Apple would be an example of a company that puts a lot of focus into human-computer interaction: optimizing the interface between humans and their technologies and ensuring that you can walk up and use them fairly easily. Now, as time goes on, one can argue how well Apple is doing at something like that, but they were certainly very well known for taking that approach.

Simon:  And reaped the benefits accordingly and became, I think, they were the world’s number one company for a while.

Peter:   That’s right. That’s right.

Simon:  So, thinking about the, “So why do it?” What is one of the benefits of doing Human Factors well?

Peter:   Multiple benefits, I would say. Clearly, safety in safety-critical systems, like health and safety; performance, so system performance; efficiency, and so forth. Job satisfaction, too, and that has repercussions that go back into society, broadly speaking. If you have meaningful work, that has other repercussions, and that’s the angle I originally came into all of this from. But, you know, you could be looking at just the safety and efficiency aspects.

Simon:  You mentioned meaningful work: is that what attracted you to it?

Peter:   Absolutely. Absolutely. Yes, like I said, I had a keen interest in the sociology of work and looking at work organization. Then, for my master’s degree, I looked at lean production, which is the Toyota approach to producing vehicles. I looked at multiskilled teams, multiskilling, and job satisfaction, then at stress indicators and so forth, versus mass production systems. So that’s really the angle I came into this from. If you look at mass production lines, where a person is doing the same job over and over, it’s quite repetitive and very narrow, versus the more Japanese-style lean production. There are certainly repercussions, both socially and individually, from a psychological health perspective.

Simon:  So, you get happy workers and more contented workers-

Peter:   –And better quality, yeah.

Simon:  And again, you mentioned Toyota. Another giant company that’s presumably grown partly through applying these principles.

Peter:   Well, they’re famous for quality, aren’t they? Famous for reliable, high-quality cars that go on forever. I mean, when I moved from Canada to Australia – Toyota has a very, very strong history here with the Land Cruiser, and the HiLux, and so forth.

Simon:  All very well-known brands here. Household names.

Peter:   Are known to be bombproof and can outlast any other vehicle. And the lean production system certainly has, I would say, quite a bit of responsibility for the production of these high-quality cars.

Simon:  So, we’ve spoken about how you got into it, and “What is it?” and “Why do it?” – what it is in very general terms, I suppose. But I suspect a lot of people listening will want to know what Human Factors is based on doing it – on how you do it. It’s a long, long time since I did my Human Factors training – just one module in my masters – so could you take me through what Human Factors involves these days, in broad terms?

Peter:   Sure, I actually have a few slides that might be useful –  

Simon:  – Oh terrific! –

Peter:   –maybe I should present that. So, let me see how well I can share this. And of course, sometimes the problem is I’ll make sure that – maybe screen two is the best way to share it. Can you see that OK?

Simon:  Yeah, that’s great.

Introduction to Human Factors

Peter:   Intro to Human Factors. So, Stewart Dickinson, who I work with at Human Risk Solutions, and I prepared some material for some courses we taught to industry. I’ve some other material too, and I’ll just flip to some of the key slides going through “What is Human Factors?”. So, let me try to get this working and I’ll just flip through quickly.

Definitions of Human Factors

Peter:   So, as I’ve mentioned already: broadly speaking, the environmental, organizational, and job factors, and human and individual characteristics, which influence behaviour at work in a way that can affect health and safety. That’s a focus of Human Factors. Or: the application of scientific information concerning humans to the design of objects, systems, and environments for human use. You see a pattern here – fitting the work to the worker. The term ergonomics is used interchangeably with Human Factors; it also depends on the country you learned this in or apply it in.

Simon:  Yes. In the U.K., I would be used to using the term ergonomics to describe something much narrower than Human Factors but in Australia, we seem to use the two terms as though they are the same.

Peter:   It does vary. You can say physical ergonomics, and I think that’s typically what people think of when they think of ergonomics: workstation design. So, sitting at their desk, heights of tables or desks, reach, and so on. And particularly given the COVID situation, with so many people sitting at their desks, they’re probably getting some repetitive strain –

Simon:  –As we are now in our COVID 19 [wo]man caves.

Peter:   That’s right! So that’s certainly an aspect of Human Factors work because that’s looking at the interaction between the human and the desk/workstation system, so to speak, on a very physical level.        

But of course, you have cognitive ergonomics as well, which looks at the perceptual and cognitive aspects of that work. So Human Factors or ergonomics, broadly speaking, would be looking at these multi-dimensional facets of human interaction with systems.

Definitions of Human Factors (2)

Peter:   Some other examples might be: the application of knowledge of human capabilities and limitations to the design, operation, and maintenance of technological systems. And I’ve got a little distilled – or summarized – bit on the right here: Human Factors applies scientific knowledge to the development and management of the interfaces between humans and rail systems. So, this is obviously in the rail context, but broadly speaking you’re talking in terms of technological systems. That covers all of the people issues we need to consider to assure safe and effective systems or organizations.

Again, this is very broad. Engineers often don’t like these broad topics or broad approaches. I’m an engineer, I learned this through engineering which is a bit different than how some people get into Human Factors.

Simon:  Yeah, I’ve met a lot of human factor specialists who come in from a first degree in psychology.

Peter:   That’s right. I’d say that’s fairly common, particularly in Australia and the UK. Although I know that you can take it here in Australia in some of the engineering schools, it’s fairly rare. There’s an aviation Human Factors program, I think, at Swinburne University; they used to teach it through mechanical engineering there as well, and I did a bit of teaching into that. I’m not across all of the universities in Australia, but there are a few. I think the University of the Sunshine Coast has quite a significant group at the moment that came from, or had some connection to, Monash before that. When I’m doing this work, I think about what existing evidence, or existing knowledge base, we have with respect to the human interactions with the system. For example, working with a rail transport operator, they will already have a history of incidents or issues, and we’d be looking to improve performance, perhaps, or reduce the risk associated with the use of certain systems. So we’re really focusing on the evidence that exists, either already in the organization or out there in the public domain, through research papers and studies and accident analyses and so forth. I think, much like safety engineering, there would be quite a few similarities in terms of the evidence base –

Simon:  – Indeed.

Peter:   – Or creating that evidence through analysis. So, using analytical techniques – various Human Factors methods – and that’s where Human Factors comes into its own. It has a suite of methods that are very different from what you would find in other disciplines.

Simon:  Sure, sure. So, can you give us an overview of these methods, Peter?

Peter:   I’m trying to think whether I have a slide for this. Hopefully, I do.

Simon:  Oh, sorry. Have I taken you out of sequence?

Peter:   No, no. Not out of sequence. Let me just flip through, and take a look at –

The Long Arm of Human Factors

Peter:   This is probably a good overview of the span of Human Factors, and then we can talk about the sorts of methods that are used for each of these – let’s call them – dimensions. So, we have what’s called the long arm of Human Factors. It covers a large range of activities, from physical ergonomics – e.g. sitting at a desk, manual handling, workplace design – through to interface design with respect to human-machine interfaces (HMIs, as they’re called) or user interfaces. There are analysis techniques for manual handling – you might be using something like a task analysis combined with the NIOSH lifting equation and so on. For workplace design, you’d be looking at anthropometric data. So, you would have a dataset that’s hopefully representative of the population you’re designing for, and you may have quite specific populations. Human Factors engineering is fairly extensively used, I would say, in military projects – in the military context –

Simon:  – Yes.

Peter:   – And there’s a set of standards – Mil-Std-1472G, for example, from the United States – which is a great example that gives not only manual handling guidelines but also workplace design guidelines. A workplace, in a military sense, can be a vehicle, or on a ship, or on a base and so forth.
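The NIOSH lifting equation mentioned above combines a fixed load constant with a set of task multipliers. The sketch below is a hedged, metric-form rendering only: the frequency (FM) and coupling (CM) multipliers come from look-up tables in the NIOSH documentation, so they are taken as plain inputs here, and the example task values are illustrative assumptions.

```python
# Hedged sketch of the revised NIOSH lifting equation (metric form):
#   RWL = LC * HM * VM * DM * AM * FM * CM, with load constant LC = 23 kg.
# FM and CM come from NIOSH look-up tables, so they are inputs here.

def recommended_weight_limit(h_cm, v_cm, d_cm, a_deg, fm=1.0, cm=1.0):
    """Recommended Weight Limit (kg) for a lifting task."""
    lc = 23.0                         # load constant, kg
    hm = 25.0 / h_cm                  # horizontal multiplier
    vm = 1 - 0.003 * abs(v_cm - 75)   # vertical multiplier
    dm = 0.82 + 4.5 / d_cm            # distance multiplier
    am = 1 - 0.0032 * a_deg           # asymmetry multiplier
    return lc * hm * vm * dm * am * fm * cm

# An ideal lift (H=25 cm, V=75 cm, D=25 cm, no twist) gives the full 23 kg:
print(round(recommended_weight_limit(25, 75, 25, 0), 1))  # 23.0
```

Moving the load further from the body (a larger H) shrinks the horizontal multiplier, so the recommended limit drops – which is the whole point of the equation.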

Interface design – if you’re looking at it from a methods perspective, you might have usability evaluations, for example. You might do workload studies and so forth, looking at how well the interface supports particular tasks or achieving certain goals.

            Human error – there are human error methods that typically leverage off task models. So, you’d have a task model and you would look at, for that particular task, what sorts of errors could occur – and there are structured methods for that.

Simon:  Yes, I remember task analysis – seeing colleagues use it on a project I was working on. It seemed quite powerful for capturing these things.

Peter:   It is, and you have to pragmatically choose the level of analysis, because you could go down to a very granular level of detail. But that may not be useful, depending on the sort of system design you’re doing, the amount of money you have, and how critical the task is. You might have a significantly safety-critical task, and that might need quite a detailed analysis. An example there – you can look up the accident analysis online; I believe it was a Virgin Galactic test flight in the U.S., which I have somewhere in my archive of accident analyses. The FAA had approved the test flights to go ahead, and there was a task where – I hope I don’t get this completely wrong – the two pilots (a pilot and a co-pilot) in this test aeroplane had to take this near-space vehicle to high altitude. They were moving at quite a high speed, and there was a particular task where they had to slow the aeroplane, I guess by reducing the throttle, and then at a certain point, a certain speed, they could deploy or control some wing-based device. The task order was very important, and what happened was that one of the pilots performed the task slightly out of order – did one thing before another – and that led to the plane breaking up. Fortunately, one of the pilots survived; unfortunately, one didn’t.

Simon:  So, very severe results from making a relatively small mistake.

Peter:   So that’s a task order error, which is very easy to make. If the system had been designed to prevent that action being executed at that point, that would have been a safer design. At that level, you might be going down to what gets called keystroke-level analysis and so on –

Simon:  – Where it’s justified, yes.

Peter:   Task analysis is, I think, probably one of the most common tools used. You also have workload analysis – looking at, for example, interface design. I know on some of the projects we were working on together, Simon, workload was a consideration. There are different ways to measure workload. There’s the NASA TLX, which is a subjective workload questionnaire, essentially, that’s done post-task, but it’s been shown to be quite reliable and valid as well. So, that instrument is used, and there are a few others; it depends on the sort of study you’re doing, the amount of time you have, and so forth. So, that’s workload analysis.
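To make the NASA TLX mentioned above concrete: it collects six subscale ratings (0–100), and the simplified “Raw TLX” variant just averages them (the full version weights each subscale by pairwise comparisons). This is a hedged sketch; the ratings below are made-up example data.

```python
# Hedged sketch: "Raw TLX" workload score = mean of the six NASA TLX
# subscale ratings (each 0-100). Example ratings are illustrative only.

RATINGS = {
    "mental_demand": 70,
    "physical_demand": 20,
    "temporal_demand": 55,
    "performance": 40,
    "effort": 65,
    "frustration": 30,
}

def raw_tlx(ratings: dict) -> float:
    """Unweighted (Raw TLX) workload score: mean of the six subscales."""
    return sum(ratings.values()) / len(ratings)

print(raw_tlx(RATINGS))  # ≈ 46.7
```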

Safety culture – I wouldn’t say that’s my forte. I’ve done a bit of work on safety culture, but that’s more organizational, and the methods there tend to be more around culpability models and implementing those into the organizational culture.

Simon:  So, more governance type issues? That type of thing?

Peter:   Yes. Governance and – whoops! Sorry, I didn’t mean to do that. I’m just looking at the systems and procedure design. The ‘e’ is white so it looks like it’s a misspelling there. So it’s annoying me …

Simon:  – No problem!

Peter:   Yes. So, there are models I’ve worked with at organizations – some rail organizations, for example – where they look at governance, but also at appropriate interventions. So, if there’s an incident, what sort of intervention is appropriate? Essentially, you use a model of culpability and human error and then overlay that, or use it as a lens through which to analyse the incident, and then appropriately either train employees or management and so on. Or perhaps it was a form of violation – a willful violation, as it may be –

Simon:  – Of procedure?

Peter:   Yeah, of procedure and so on versus a human error that was encouraged by the system’s design. So, you shouldn’t be punishing, let’s say, a train driver for a SPAD if the –

Simon:  – Sorry, that’s a Signal Passed At Danger, isn’t it?

Peter:   That’s right. Signal Passed At Danger. So, it’s certainly possible that the way the signalling is set up leads to a higher chance of human error. You might have multiple signals at a location and it’s confusing to figure out which one to attend to and you may misread and then you end up SPADing and so on. So, there are, for example, clusters of SPADs that will be analysed and then the appropriate analysis will be done. And you wouldn’t want to be punishing drivers if it seemed to be a systems design issue.

Simon:  Yes. I saw a vivid illustration of that on the news last night, I think. There was an article about an air crash that tragically killed three people a few months ago here in South Australia. The news report was saying it was human error, but when they actually got into reporting what had happened, it was pointed out that the pilot being tested was flying a twin-engine aeroplane, and they were doing an engine-failure-after-take-off drill. The accident report said that the procedure they were using allowed them to do that engine failure drill at too low an altitude. So, if the pilot failed to take the correct action very quickly – bearing in mind this is a pilot being tested because they are undergoing training – there was no time to recover, and therefore the aircraft crashed. So, I thought, ”Well, it’s a little bit unfair just to say it’s human error when they were doing something that was intrinsically inappropriate for a person of that skill level.”

Peter:   That’s an excellent example, and you hear this in the news a lot: human error, human error, human error. Think, for instance, of the recent Boeing problems with the flight control system for the new 737s. Of course, there will be reports; some of the interim reports already talk about some of the Human Factors issues inherent in that, and I would encourage people to look up the publicly available documentation on it –

Simon:  – This is the Boeing 737 Max accidents in Indonesia and in Ethiopia, I think.

Peter:   That’s correct. That’s correct. Yes, absolutely. Pilot error was used as the general explanation, but under further analysis, you start looking at that error, and that so-called error perhaps has other causes – systems design causes, perhaps. These things are still being investigated but have been written about quite extensively. And you can look at, of course, any number of aeroplane accidents. There’s a famous Air France one flying from Brazil to Paris – it might have been Rio de Janeiro to Paris – where the pitot –

Simon:  – Yeah, pitot probes got iced up.

Peter:   – Probes, they iced up, and it was dark, so the pilots didn’t have any ability to gauge what was happening by looking outside. I believe it was dark, or it might have been a storm; there was some difficulty in gauging what was going on outside the aeroplane, and there again, misreads. So, stall alarms going off and so on, I believe. There were some mis-readings on the airspeed coming from the sensors, essentially, and the pilots acted according to that information, but that information was incorrect. So, you could say there was probably a cascade of issues that occurred there, and there’s a fairly good analysis one can look up that looks at the design – I believe it was an Airbus – where we had one pilot providing an input in one direction to the control stick and the other pilot in the other direction. There were a number of things that broke down. And typically, you’ll see this in accidents: you’ll have a cascade as they’re trying to troubleshoot and can’t figure out what’s going on; they’ll start applying various approaches to try and remedy the situation, and people begin to panic and so on.

            And you have training techniques like Crew Resource Management, which certainly has a strong Human Factors element – or comes out of the Human Factors world – and which looks at how to have teams in cockpits, and in other situations, working effectively in emergencies. And that came, of course, out of analysing failures.

Simon:  Yes, and I think CRM, crew resource management, has been adopted not just in the airline industry, but in many other places as well, hasn’t it?

Peter:   Operating theatres, for example. There’s quite a bit of work from the 90s that started with, I think, David Gaba, who I think was at Stanford – this is all from memory – and that then looked at operating theatres. In fact, the Monash Medical Centre in Clayton had a simulation centre for operating theatres, where they were applying these techniques to training operating theatre personnel: surgeons, anaesthetists, nurses and so forth.

Simon:  Well, thanks, Peter. I’m sorry – I think I hijacked your presentation, but –

Peter:   It wasn’t really a presentation anyway; it was more of a guide. We were talking about methods, weren’t we? And it’s easy to go from methods to talking about accidents, because then we talk about the application of some of these methods, or how these methods are applied to prevent accidents from occurring.

Simon:  Cool. Well, thanks very much, Peter. Maybe next time we have a chat, I’ll let you talk through your slides and we’ll have a more in-depth look across the whole breadth of Human Factors.

Peter:   So that’s probably a good little intro at the moment anyway. Perhaps I might pull up one slide on Human Factors integration before we end.

Simon:  Of course.

Peter:   I’ll go back a few slides here.

What is Human Factors Integration?

Peter:   And so what is Human Factors integration? I was thinking about this quite a bit recently because I’m working on some complex projects – not only complex but quite large engineering projects, with lots of people, lots of different groups involved, different contracts and so forth – and the integration issues that occur. They’re not only Human Factors integration issues; there are larger-scale integration issues, engineering integration issues. Generally speaking, this is something I think projects often struggle with. And I was really thinking about the Human Factors angle, and Human Factors integration is about ensuring that all of the HF (Human Factors) issues in a project are considered and controlled throughout the project and deliver the desired performance and safety improvements. So, three functions of Human Factors integration:

  • confirm the intended system performance objectives and criteria;
  • guide and manage the Human Factors aspects of design cycles, so that negative aspects don’t arise and prevent the system reaching its optimum performance level; and
  • identify and evaluate any additional Human Factors safety aspects, now or as found in the safety case.

You’ll find, particularly in these complex projects, issues at the interfaces. You might have quite a large project with sub-projects working on particular components: let’s say one is working more on the civil/structural elements, and maybe space provisioning and so on, while another is working more on control systems. The integration between those becomes quite difficult, because you don’t really have that Human Factors integration function working to integrate those two large components. Typically, it works within those focused project groupings, as you might call them. Does that make sense?

Simon:  Yeah. Yeah, absolutely.

Peter:   I think that’s one of the big challenges I’m seeing at the moment: you have a certain amount of time and money and resources – this would be common to other engineering disciplines – and the integration work often falls by the wayside. That’s where I think a number of the ongoing Human Factors issues are going to be cropping up in some of these large-scale projects for the next 10 to 20 years, both operationally and perhaps in safety as well. Of course, we want to avoid –

Simon:  – Yes. What you’re describing sounds very familiar to me as a safety engineer, and I suspect a lot of engineers of all disciplines who work on large projects are going to recognize it as a familiar problem.

Peter:   Sure. You can think about it like this: you’ve got the civil and space-provisioning aspect of a project, and another group is deciding what goes into, let’s say, a control room or a maintenance room and so on. It may be that things are constrained in such a way that the design of the racks in the room has to be done in a way that makes the work more difficult for maintainers. And it’s hard to optimize these things, because these are complex projects with complex considerations, and a lot of people are involved in them. The nature of engineering work is typically to break things down into little elements, optimize those elements, and bring them all together.

Simon:  –Yes.

Peter:   Human Factors tends to – well, you can do that in Human Factors as well, but I would argue – and this is certainly what attracted me to it – that you tend to have to take a more holistic approach to human behaviour and performance in a system.

Simon:  Absolutely.

Peter:   Which is hard.

Simon:   Yes, but rewarding. And on that note, thanks very much, Peter. That’s been terrific. Very helpful. And I look forward to our next chat.

Peter:   For sure. Me too. Okay, thanks!

Simon:  Cheers!

Outro

Simon:  Well, that was our first chat with Peter on the Safety Artisan and I’m looking forward to many more. So, it just remains for me to say thanks very much for watching and supporting the work of what we’re doing and what we’re trying to achieve. I look forward to seeing you all next time. Okay, goodbye.

End: Introduction to Human Factors

Categories
Start Here System Safety

Safety Concepts Part 2

In this 33-minute session, Safety Concepts Part 2, The Safety Artisan equips you with more Safety Concepts. We look at the basic concepts of safety, risk, and hazard in order to understand how to assess and manage them. Exploring these fundamental topics provides the foundations for all other safety topics, but it doesn’t have to be complex. The basics are simple, but they need to be thoroughly understood and practiced consistently to achieve success. This video explains the issues and discusses how to achieve that success.

This is the three-minute demo of the full (33 minute) Safety Concepts, Part 2 video.

Safety Concepts Part 2: Topics

  • Risk & Harm;
  • Accident & Accident Sequence;
  • (Cause), Hazard, Consequence & Mitigation;
  • Requirements / Essence of System Safety;
  • Hazard Identification & Analysis;
  • Risk Reduction / Estimation;
  • Risk Evaluation & Acceptance;
  • Risk Management & Safety Management; and
  • Safety Case & Report.

Safety Concepts Part 2: Transcript

Click Here for the Transcript

Hi everyone, and welcome to the Safety Artisan, where you will find professional, pragmatic, and impartial advice on safety. I’m Simon, and welcome to the show today, which was recorded on the 23rd of September 2019. Today we’re going to talk about system safety concepts. A couple of days ago I recorded a short presentation (Part 1) on this, which is also on YouTube. Today we are going to cover the same concepts, but in much more depth.

In the short session, we took some time picking apart the definition of ‘safe’. I’m not going to duplicate that here, so please feel free to go have a look. We said that to demonstrate that something was safe, we had to show that risk had been reduced to a level that is acceptable in whatever jurisdiction we’re working in.

In this definition, there are a couple of tests that are appropriate to the U.K., but perhaps not elsewhere. We also must meet safety requirements. And we must define the scope and bound the system that we’re talking about – a physical system, or an intangible system like a computer program. We must define what we’re doing with it, what it’s being used for, and within which operating environment, within which context, it is being used. And if we can do all those things, then we can objectively say – or claim – that the system is safe.

Topics

We’re going to talk about a lot more topics. We’re going to talk about risk and accidents, and the cause-hazard-consequence sequence. We’ll talk about requirements and – spoiler alert – what I consider to be the essence of system safety. And then we’ll get into talking about the process of demonstrating safety: hazard identification and analysis.

Risk reduction and estimation; risk evaluation and acceptance; and then pulling it all together: risk management and safety management. And finally, reporting – making an argument that the system is safe, supporting it with evidence, and summarizing all of that in a written report. This is what we do, albeit in different ways and calling it different things.

Risk

On to the first topic: risk and harm. Our concept of risk is a combination of the likelihood and severity of harm. Generally, we’re talking about harm to people: death, injury, damage to health. Now, we might also choose to consider damage to property and the environment. That’s all good, but I’m going to concentrate on harm to people, because usually that’s what we’re required to do by the law. There are sometimes other laws covering the environment and property, but we’re not going to talk about those here. Just to illustrate this point: risk is a combination of severity and likelihood.

We’ve got a very crude risk table here, with likelihood along the top and severity down the side. We might see, by looking at the table, that if we have a high likelihood and high severity, well, that’s a high risk. Whereas if we have low likelihood and low severity, we might say that’s a low risk. And in between, with a combination of high and low, we might say that’s medium. Now, this is a very crude and simple example – deliberately so.

You will see risk matrices like this in loads of different standards, and you may be required to define your own for a specific system. There are lots of variations on this, but they’re all basically doing the same thing: illustrating how we determine the level of risk from that combination of severity and likelihood. I think a picture is worth a thousand words. Moving on to the accident: we’re talking about (in this standard) an unintended event that causes harm.
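A risk matrix like the crude one described above is, in effect, just a table lookup on the (likelihood, severity) pair. A minimal sketch, assuming illustrative two-level scales that are not taken from any particular standard:

```python
# Hedged sketch: a crude risk matrix as a lookup table. The levels and
# resulting risk words are illustrative assumptions, not from a standard.

RISK_MATRIX = {
    ("high", "high"): "high",
    ("high", "low"): "medium",
    ("low", "high"): "medium",
    ("low", "low"): "low",
}

def risk_level(likelihood: str, severity: str) -> str:
    """Combine likelihood and severity into a risk level via table lookup."""
    return RISK_MATRIX[(likelihood, severity)]

print(risk_level("high", "low"))  # medium
```

Real standards use finer scales (e.g. a 4×5 or 5×5 grid), but the principle is the same: the matrix encodes an agreed mapping from the two inputs to a risk level.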

Accidents, Sequences and Consequences

Not all jurisdictions consider just accidental events; some consider deliberate ones as well. We’ll leave that out – a good example of that is work health and safety in Australia, but no doubt we’ll get to that in another video sometime. And the accident sequence is the progression of events that results in an accident, that leads to harm. Now, we’re going to illustrate the accident sequence in a moment, but before we get there, we need to think about causes. Here we’ve got a hazard: a physical situation or state of a system, often following some initiating event, that may lead to an accident – a thing that may cause harm.

And then, allied with that, we have the idea of consequences: an outcome, or outcomes, resulting from an event. Now, that all sounds a bit woolly, doesn’t it? Let’s illustrate it; hopefully, this will make it a lot clearer. I’ve got a sequence here: we have causes that might lead to a hazard, and the hazard might lead to different consequences – and that’s the accident sequence. Now, in this standard, they didn’t explicitly define causes.

Cause, Hazard and Consequence

They’re just called events. But mostly we will deal with causes and consequences in system safety, and it’s probably just easier to implement it that way. Whether or not you choose to explicitly address every cause – that’s often an optional step. But this is the accident sequence that we’re looking at. These sorts of funnels are meant to illustrate the fact that there may be many causes for one hazard, and one hazard may lead to many consequences. Some of those consequences may be no harm at all.

We may not actually have an accident; we may get away with it. We may have a hazard, and no harm may befall a human. And if we take all of this together, that’s the accident sequence. Now, it’s worth reiterating that just because a hazard exists, it does not necessarily lead to harm. But to get to harm, we must have a hazard; a hazard is both necessary and sufficient to lead to harmful consequences. OK.

Hazards: an Example

And you can think of a hazard as an accident waiting to happen. You can think of it in lots of different ways. Let’s think about an example: the hazard might be that somebody slips while walking. That slip might be caused by many things: it might be a wet surface – let’s say it’s been raining and the pavement is slippery – or it might be icy. It might be a spillage of oil on a surface, or you could imagine something slippery like ball bearings on a surface.

So, there’s something that’s caused the surface to become slippery. A person slips – that’s the hazard. Now the person may catch themselves; they may not fall over. They may suffer no injury at all. Or they might fall and suffer a slight injury; and, very occasionally, they might suffer a severe injury. It depends on many different factors. You can imagine if you slipped while going downstairs, you’re much more likely to be injured.

And younger, healthy, fit people are more likely to get over a fall without being injured, whereas if they’re very elderly and frail, a fall can quite often result in a broken bone. If an elderly person breaks a bone in a fall the chances of them dying within the next 12 months are quite high. They’re about one in three.

So, the level of risk is sensitive to a lot of different factors. To get an accurate picture, an accurate estimate of risk, we’re going to need to factor in all those things. But before we get to that, we’ve already said that a hazard need not lead to harm. In this standard, where a hazard has occurred and could have progressed to an accident but didn’t, we call this an incident – a near miss.

We got away with it; we were lucky – whatever you want to call it. We’ve had an incident, but no one’s been hurt. Hopefully, that incident is being reported, which will help us to prevent an actual accident in future. That’s another very useful concept, one that reminds us that not all hazards result in harm. Sometimes there will be no accident, no harm, simply because we were lucky, or because someone present took some action to prevent harm to themselves or others.
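The cause → hazard → consequence “funnels” described above, together with the slip example, can be sketched as a small data structure. This is a hedged illustration; the names and outcome labels are assumptions for the example, not terms from the standard.

```python
# Hedged sketch of the accident-sequence funnel: many causes can lead to
# one hazard, and one hazard can lead to many consequences, some of which
# involve no harm at all. Names are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Hazard:
    name: str
    causes: list = field(default_factory=list)        # many causes -> one hazard
    consequences: list = field(default_factory=list)  # one hazard -> many outcomes

slip = Hazard(
    name="person slips",
    causes=["wet pavement", "ice", "oil spillage"],
    consequences=["no injury", "slight injury", "severe injury"],
)

# A hazard existing does not mean harm occurs: "no injury" is the incident
# (near miss) case, and only the others are harmful accidents.
harmful = [c for c in slip.consequences if c != "no injury"]
print(harmful)  # ['slight injury', 'severe injury']
```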

Mitigation Strategies (Controls)

But we would really like to deliberately design out or avoid hazards if we can. What we need is a mitigation strategy: a measure, or measures, that, when we put them into practice, reduce risk. Normally, we call these things controls. Now we’ve illustrated this; we’ve added some mitigation strategies to the funnels, and they are the dark blue dashed lines.

They are meant to represent barriers that prevent the accident sequence progressing towards harm. And they have dashed lines because very few controls are perfect – everything’s got holes in it. We might have several of them, but usually no single control will cover all possible causes, and very few controls will deal with all possible consequences. That’s what those barriers are meant to illustrate.

That idea, that picture, will be very useful to us later, when we are thinking about how we’re going to estimate and evaluate risk overall, what risk reduction we have achieved, and how we justify that what we’ve done is good. It’s a very powerful illustration. Now, let’s move on to safety requirements.

Safety Requirements

Now, I guess it’s no great surprise to say that requirements, once met, can contribute directly to the safety of the system. Maybe we’ve got a safety requirement that says all cars will be fitted with seatbelts, or that we will be required to wear a seatbelt. That makes the system safer.

Or the requirement might say that we need to provide evidence of the safety of the system. And the requirement might refer to a process that we’ve got to go through, or a set or kind of evidence that we’ve got to provide. Safety requirements can cover either or both of these.

The Essence of System Safety

Requirements covering the safety of the system, or demonstrating that the system is safe, should give us assurance: adequate confidence, or justified confidence, supported with evidence, by following a process. (We’ll talk more about process.) We meet safety requirements; we get assurance that we’ve done the right thing. And this really brings us to the essence of what system safety is. We’ve got all these requirements – everything is a requirement, really – including the requirement to demonstrate risk reduction.

Those requirements may apply to the system itself – the product – or they may apply to the process that generates the evidence, or to the evidence itself. Putting all those things together in an organized and orderly way really is the essence of system safety: this is where we address safety in a systematic, orderly, organized way. (Those words will keep coming back.) That’s the essence of system safety, as opposed to the day-to-day task of keeping a workplace safe.

Maybe by mopping up spills and providing handrails so people don’t slip over – things like that. We’re talking about a more sophisticated level of safety, because we have a more complex, more challenging problem to deal with. That’s system safety. We will start on the process now, and we begin with hazard identification and analysis: first, we need to identify and list the hazards and the accidents associated with the system.

We’ve got a system, physical or not. What could go wrong? We need to think about all the possibilities. And then, having identified some hazards, we need to start doing some analysis. We follow a process that helps us to delve into the detail of those hazards and accidents, and to define and understand the accident sequences that could result. In fact, in doing the analysis we will very often identify some more hazards that we hadn’t thought of before; it’s not a straight-through process, it tends to be an iterative process.

Risk Reduction

Ultimately, what we’re trying to do is reduce risk. We want a systematic process – which is what we’re describing now – a systematic process of reducing risk. And at some point, we must estimate the risk that we’re left with, before and after all these controls, these mitigations, are applied. That’s risk estimation. Again, there’s that systematic word: we’re going to use all the available information to estimate the level of risk that we’ve got left, recalling that risk is a combination of severity and likelihood.

Now, as we get towards the end of the process, we need to evaluate risk against set criteria. Those criteria vary depending on which country you’re operating in, or which industry you’re in: what regulations apply and what good practice is relevant. All those things can be a factor. In this case, this is a U.K. standard, so we’ve got two tests for evaluating risk. It’s a systematic determination using all the available evidence, and it should be an objective evaluation, as far as we can make it.

Risk Evaluation

We should use certain criteria to decide whether a risk can be accepted or not, and in the U.K. there are two tests for this. As we’ve said before, there is ALARP, the ‘As Low As is Reasonably Practicable’ test, which says: have we put into practice all reasonably practicable controls (to reduce risk – this is a risk-reduction target)? And then there’s an absolute level of risk to consider as well, because even if we’ve taken all practicable measures, the risk remaining might still be so high as to be unacceptable in law.

Now, that test is specific to the U.K., so we don’t have to worry too much about it. The point is, there are objective criteria which we must test or measure ourselves against. The evaluation will pop out a decision as to whether further risk reduction is necessary. If the risk level is still too high, we might conclude that there are still reasonably practicable measures that we could take – then we’ve got to take them.

We have an objective decision-making process to say: have we done enough to reduce risk? And if not, we need to do some more, until we get to the point where we can apply the test again and say: yes, we’ve done enough. Right, that’s rather a long-winded way of explaining it – I apologize – but it is a key issue, and it does trip up a lot of people.
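The two U.K.-style tests described above – an absolute tolerability limit plus the ALARP test – can be sketched as a pair of checks that must both pass. This is a hedged illustration only: the numeric risk scale and threshold are assumptions for the example, not values from any standard or regulation.

```python
# Hedged sketch of the two evaluation tests: (1) is the residual risk below
# an absolute tolerability limit, and (2) have all reasonably practicable
# controls been applied (ALARP)? Scale and threshold are illustrative.

def risk_acceptable(residual_risk: float,
                    tolerable_limit: float,
                    practicable_controls_remaining: int) -> bool:
    """Risk is acceptable only if it passes BOTH tests."""
    below_limit = residual_risk <= tolerable_limit   # absolute test
    alarp = practicable_controls_remaining == 0      # ALARP test
    return below_limit and alarp

# Reasonably practicable controls remain, so we must apply them first:
print(risk_acceptable(0.3, 0.5, practicable_controls_remaining=2))  # False
print(risk_acceptable(0.3, 0.5, practicable_controls_remaining=0))  # True
```

Note that the two tests are independent: a risk can sit below the absolute limit and still fail evaluation because further reasonably practicable controls exist.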

Risk Acceptance

Now, once we’ve concluded that we’ve done enough to reduce risk and no further risk reduction is necessary, somebody should be in a position to accept that risk.  Again, it’s a systematic process, by which relevant stakeholders agree that risks may be accepted. In other words, somebody with the right authority has said yes, we’re going to go ahead with the system and put it into practice, implement it. The resulting risks to people are acceptable, providing we apply the controls.

And we accept that responsibility. The people who are signing off on those risks are exposing themselves and/or other people to risk. Usually, they are employees, but sometimes members of the public or customers as well. If you're going to put customers in an airliner, you're saying: yes, there is a level of risk to passengers, but the regulator, or whoever, has deemed that risk to be acceptable. It's a formal process to get those risks accepted and say yes, we can proceed. But again, that varies greatly between different countries and different industries, depending on what regulations, laws and practices apply. (We'll talk about different applications in another section.)

Risk Management

Putting all this together, we get what we call risk management. Again, there's that wonderful word 'systematic': a systematic application of policies, procedures and practices to these tasks. We have hazard identification, hazard analysis, risk estimation, risk evaluation, risk reduction and risk acceptance. It's helpful to demonstrate that we've got a process here, where we go through these things in order. Now, this is a simplified picture, because it rather implies that you just go through the process once.

With a complex system, you will rarely go through the process just once. We may identify further hazards when we get into hazard analysis and estimating risk, or in the process of trying to do those things, even as late as applying controls and getting to risk acceptance. We may discover that we need to do additional work. We may try to apply controls and discover that the controls we thought were going to be effective are not.

Then our evaluation of the level of risk and its acceptability is wrong, because it was based on the premise that the controls would be effective, and we've discovered that they're not, so we must go back and redo some work. Maybe, as we go through, we even discover hazards that we hadn't anticipated before. This can and does happen; it's not necessarily a straight-through process. We can iterate through this process, perhaps several times, while we are moving forward.
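The iteration described above might be caricatured in code like this. It's a toy model under assumed names and numbers; in reality, each step is an engineering activity, not a function call:

```python
# Toy model of the iterative loop described above: apply the next planned
# control, re-estimate the risk, and go round again if it is still too high.
# The scores, the threshold and the control effects are hypothetical.

TOLERABLE = 4  # hypothetical evaluation criterion

def manage_risk(initial_score, control_effects):
    """control_effects: the *actual* risk reduction each planned control
    delivers, which may be less than we assumed when we planned it."""
    score, passes = initial_score, 0
    remaining = list(control_effects)
    while score > TOLERABLE:
        if not remaining:
            raise RuntimeError("risk still unacceptable and no controls left")
        score -= remaining.pop(0)  # apply the next control and re-estimate
        passes += 1
    return score, passes  # acceptable: ready for formal risk acceptance
```

For example, `manage_risk(10, [3, 2, 2])` takes three passes, because the second control delivers less reduction than hoped and forces another iteration.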

Safety Management

OK, safety management. We've gone to a higher level than risk, really, because we're thinking about requirements as well as risk. We're going to apply organization and management principles to achieve safety with high confidence. For the first time, we've introduced this idea of confidence in what we're doing. Well, I say the first time, but this is assurance, isn't it? Assurance is having justified, or appropriate, confidence, because we've got the evidence. And that might be product evidence too; we might have tested the product to show that it's safe.

We might have analysed it. We might have shown that we followed the process, which gives us confidence that our evidence is good, and that we've done all the right things and identified all the risks. That's safety management. We need to put that in a safety management system: a defined organization structure, with defined processes, procedures and methods. That gives us direction and control of all the activities that we need to put together, in combination, to effectively meet safety requirements and safety policy.

And our safety tests, whatever they might be. More and more now, we're thinking about top-level organization and planning to achieve the outcomes we need, with a complex system, a complex operating environment and a complex application.

Safety Planning

Now, I'll just mention planning. We need a safety management plan that defines the strategy: how we're going to get there, how we're going to address safety. We need to document that safety management system for a specific project. Planning is very important for effective safety; safety is very vulnerable to poor planning. If a project is badly planned, or not planned at all, it becomes very difficult to do safety effectively, because we are dependent on following a rigorous process to give us confidence that our results are correct. If you've got a project that is a bit haphazard, that's not going to help you achieve your objectives.

Planning is important. Now, the part of that safety plan that deals with timescales, milestones and other date-related information we might refer to as a safety programme. This being a UK definition, note that British English has two spellings of 'program': the double-m-e version, 'programme', applies to that time-based, or milestone-based, progression.

Whereas in the US and in Australia, for example, we don't have those two words; we just have the one word, 'program', which covers everything: computer programs, and a programme of work that may or may not be determined by timescales or milestones. The point is that certain things may have to happen at certain points in time, or before certain milestones. We may need to demonstrate safety before we are allowed to proceed to tests and trials, or before we are allowed to put our system into service.

Demonstrating Safety

We've got to demonstrate that safety has been achieved before we expose people to risk. That's very simple. Now, finally, we're almost at the end. We need to provide a demonstration, maybe to a regulator, maybe to customers, that we have achieved safety. This standard uses the concept of a safety case. The safety case is, basically, a portfolio full of evidence: we've got a structured argument to put it all together, and we've got a body of evidence that supports the argument.

It provides a compelling, comprehensible (or understandable) and valid case that a system is safe for a given application or use, in a given operating environment. Really, that definition of what a safety case is harks back to that meaning of safety; we've got something that really hits the nail on the head. And we might put all of that together and summarise it in a safety case report, which summarises those arguments and evidence, and documents progress against the safety programme.
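The 'structured argument plus supporting evidence' idea can be sketched as a simple data structure. This is purely illustrative; the class and field names are mine, not from the standard:

```python
# Minimal sketch of a safety case as a structured argument: a claim is
# supported either by direct evidence or by supported sub-claims.
# Class and field names are illustrative, not from any standard.

from dataclasses import dataclass, field

@dataclass
class Claim:
    statement: str                                   # e.g. "System X is safe in environment Y"
    evidence: list = field(default_factory=list)     # test reports, analyses, ...
    sub_claims: list = field(default_factory=list)   # the argument's decomposition

    def is_supported(self):
        """A claim holds if it has direct evidence, or decomposes into
        sub-claims that all hold."""
        if self.evidence:
            return True
        return bool(self.sub_claims) and all(c.is_supported() for c in self.sub_claims)
```

A safety case report would then summarise the top-level claim, its decomposition, and the state of the supporting evidence.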

Remember, I said our planning was important. We started off saying that we need to do this, that and the other in order to achieve safety. Hopefully, in the end, in the safety case report, we'll be able to state that we've done exactly that: we did all those things, we followed the process rigorously, we've got good results, and we've got a robust safety argument with evidence to support it. At the end, it's all written up in a report.

Documenting Safety

Now, that isn't always going to be called a safety case report; it might be called a safety assessment report or a design justification report. There are lots of names for these things, but they all tend to do the same kind of thing: they pull together the argument as to why the system is safe and the evidence to support the argument, and they document progress against a plan, or against some set of process requirements from a standard, a regulator, or just good practice in an industry, to say: yes, we've done what we were expected to do.

The result is usually what justifies the system getting past that milestone, where it goes into service and can be used. People can then be exposed to those risks, but safely and under control.

Everyone’s a winner, as they say!

Copyright – Creative Commons Licence

Okay. I've used a lot of information from a UK government website. I've done that in accordance with the terms of its Creative Commons license, and you can see more about that here. We have complied with that, as we are required to, including telling you that the information we've supplied is provided under the terms of this license.

Safety Concepts Part 2: More Resources

For more resources and lessons on system safety and other safety topics, I invite you to visit the safetyartisan.com website. Thanks very much for watching. I hope you found that useful.

We've covered a lot of information there, but hopefully in a structured way. We've repeated the key concepts, and you can see that, in the standard, those concepts are consistently defined and reinforce each other, in order to give us that systematic, disciplined approach to safety that we need.

Anyway, that's enough from me. I hope you enjoyed watching and found it useful. Please send me some feedback on what you thought of this video, and also what you would like to see covered in the future.

Thank you for visiting The Safety Artisan. I look forward to talking to you again soon. Goodbye.

Safety Concepts Part 1 defines the meaning of ‘Safe’, and it is free. Return to the Start Here Page.

Categories
Start Here System Safety

System Safety Principles

In this 45-minute video, I discuss System Safety Principles, as set out by the US Federal Aviation Administration (FAA) in their System Safety Handbook. Although this was published in 2000, the principles still hold good (mostly) and are worth discussing. I comment on those topics where modern practice has moved on, and on those jurisdictions where the US approach does not sit well.

This is the ten-minute preview of the full, 45-minute video.

System Safety Principles: Topics

  • Foundational statement
  • Planning
  • Management Authority
  • Safety Precedence
  • Safety Requirements
  • System Analyses Assumptions & Criteria
  • Emphasis & Results
  • MA Responsibilities
  • Software hazard analysis
  • An Effective System Safety Program

System Safety Principles: Transcript

Click here for the Transcript

Hello, and welcome to The Safety Artisan, where you will find professional, pragmatic and impartial educational products. I'm Simon, and it's the 3rd of November 2019. Tonight, I'm going to be looking at a short introduction to System Safety Principles.

Introduction

On to system safety principles. In the full video, we look at all the principles from the U.S. Federal Aviation Administration's System Safety Handbook, but in this little four- or five-minute video (whatever it turns out to be) we'll take a quick look, just to let you know what it's about.

Topics for this Session

These are the subjects in the full session. First, a foundational statement; we talk about planning; we talk about the management authority (which is the body responsible for bringing into existence, in this case, some kind of aircraft or air traffic control system, something that the FAA would be the regulator for in the US). We talk about safety precedence, in other words, which is the most effective safety control to use; safety requirements; system analyses, which are highlighted because that's the sample I'm going to talk about tonight; assumptions and safety criteria; emphasis and results, which is really about how much work you put in, where, and why; management authority responsibilities; a little aside on a specialist area, software hazard analysis; and finally, what you need for an effective System Safety Program.

Now, it's worth mentioning that this is not an uncritical look at the FAA handbook. It is 19 years old now; the principles are still good, but some of it is a bit long in the tooth. There are some areas, particularly on software, where things have moved on. And there are some areas where the FAA approach to system safety is very much predicated on an American way of doing things.

Systems Analysis

So, without further ado, let's talk about system analysis. There are two points that the Handbook makes. First of all, that these analyses are basic tools for systematically developing design specifications. Let's unpack that statement. The analyses are tools; they're just tools. You've still got to manage safety. You've still got to estimate risk and make decisions; that's absolutely key. The system analyses are tools to help you do that. They won't make decisions for you, and they won't exercise authority or manage things for you. They're just tools.

Secondly, the whole point is to apply them systematically. So, coverage is important here: making sure that we've covered the entire system, and also doing things in a thorough and orderly fashion. That's the systematic bit. And then, finally, it's about developing design specifications. Now, this is where the American emphasis comes in. But before we talk about that, it's fundamental to note that we really need to work out what our safety requirements are. What are we trying to achieve here with safety, and why? Those are really important questions, because if you don't know what you're trying to achieve, then it will be very difficult to get there and to demonstrate that you've got there, which is kind of the point of safety. Putting effort into getting the requirements right is very important, because without that first step all your other work could be invalid. And in my experience of 20-plus years in the business, if you don't have a really precise handle on what you're trying to achieve, then you're probably going to waste a lot of time and money.

So, on to the second bullet point. The handbook says that the ultimate measure of safety is not the scope of analysis, but satisfying requirements. The first part is very good: we're not doing analysis for its own sake. The measure of safety is not that we've analyzed something to death, or that we've expended vast amounts of dollars on doing this work, but that we've worked out the requirements and the analysis has helped us to meet them. That is the key point.

This is where it can go slightly pear-shaped, in that this emphasis on requirements (almost to the exclusion of anything else) is a very U.S.-centric way of doing things. Very much in the US, the emphasis is: you meet the spec, you certify that you've met the spec, and therefore we're safe. But what if the spec is wrong? Or what if it's just plain inappropriate for a new use of an existing system, or whatever it might be?

In other jurisdictions, notably the U.K. (and, as you can tell from my accent, that's where I'm from; I've got a lot of experience doing safety work in the U.K., but also in Australia, where I now live and work), it's not about meeting requirements. Well, it is, but let me explain. In the UK and Australia, English law works on the idea of intent. So, we aim to make something safe: the question is not whether it has necessarily met requirements, which doesn't really matter so much, but whether the risk is actually reduced to an acceptable level. There are tests for deciding what is acceptable. Have you complied with the law? The law outside the US can take a very different approach to "it's all about the specification".

Of course, those legal requirements, and that requirement to reduce risk to an acceptable level, are, in themselves, requirements. In an Australian or British legal jurisdiction, you need to think about those legal requirements as well; they must be part of your requirements set. So, just having a specification for a technical piece of kit ignores the requirements of the law, which include not only design requirements but also that the thing is actually safe in service and can be safely introduced, used, disposed of, and so on. If you don't take those things into account, you may not meet all your obligations under that system of law. So, there's an important point here about understanding and using American standards, and an American approach to system safety, out of their assumed context. That's true of all standards and all approaches, but it's a point I bring out in the main video quite forcefully, because it's very important to understand.

Copyright Statement

So, that's the one subject I'm going to talk about in this short video. I'd just like to mention that all quotations are from the FAA System Safety Handbook, which is copyright free, but the content of this video presentation, including the added value from my 20-plus years of experience, is copyright of The Safety Artisan.

For More…

Wherever you're seeing this video, be it on social media or elsewhere, you can see the full version and all our other videos at The Safety Artisan.

End

That's the end of the show. It just remains for me to say thanks very much for giving me your time, and I look forward to talking to you again soon. Bye-bye.

Back to the Start Here Page.