Categories
Blog Mil-Std-882E Risk Assessment Safety Analysis

Understanding Your Risk Assessment Standard

To understand your risk assessment standard, we need to know a few things. The standard is the tool that we’re going to use to achieve our goals. That matters because tools designed to do certain things usually perform well at those things, but they don’t always perform well at others. So we will ask ‘Are we doing the right thing?’ and ‘Are we doing it right?’

Transcript

What and Why?

So, what will we do and why are we doing it? First, the use of safety standards is very common for many reasons. It helps us to have confidence that what we’re doing is good enough. We’ve met a standard of performance in the absolute sense. It helps us to say, ‘We’ve achieved standardization or commonality in what we’re doing’.

We can also use it to help us achieve a compromise. That can be a compromise across different stakeholders or different organizations. Standardization gives us other benefits as well. For example, if we’re all doing the same thing rather than all doing different things, it is easier to train staff.

However, we need to understand this tool that we’re going to use. What it does, what it’s designed to do, and what it is not designed to do. That’s important for any standard or any tool. In safety, it’s particularly important because safety is in many respects an intangible. This is because we’re always looking to prevent a future problem from occurring. In the present, it’s a little bit abstract. It’s a bit intangible. So, we need to make sure that in concept what we’re doing makes sense and it’s coherent. That it works together. If we look at those five bullet points there, we need to understand the concept of each standard. We need to understand the basis of each one.

They’re not all based on the same concept. Thus, some of them are contradictory or incompatible. We need to understand the design of the standard. What the standard does, what the aim of the standard is, and why it came into existence. And who brought it into existence. To do what for who – who’s the ultimate customer here?

For risk analysis standards, we need to understand what kind of risks the standard addresses, because the way you treat a financial risk might be very different from a safety risk. In the world of finance, you might have a portfolio of products, like loans. These products might have some risks associated with them. One or two loans might go bad and you might lose money on those. But as long as the whole portfolio is making money, that might be acceptable to you. You might say, ‘I’m not worried that 10% of my loans have gone south. I’m still making plenty of profit out of the other 90%’. It doesn’t work that way with safety. You can’t say ‘It’s OK that I’ve killed a few people over here because all this lot over here are still alive!’. It doesn’t work like that!
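To make that contrast concrete, here is a minimal Python sketch. The loan returns, the hazard names, the risk scores, and the tolerable-risk threshold are all made-up assumptions for illustration; none of them come from a standard.

```python
# Illustrative only: every number and name below is hypothetical.

def portfolio_acceptable(loan_returns):
    """Financial view: judge the portfolio as a whole by its net return."""
    return sum(loan_returns) > 0


def safety_acceptable(hazard_risks, tolerable_risk):
    """Safety view: every hazard must be tolerable on its own.

    A good average cannot offset a single intolerable risk.
    """
    return all(risk <= tolerable_risk for risk in hazard_risks.values())


# Nine loans each return +10 and one loses 50: the portfolio still makes money.
loans = [10] * 9 + [-50]
print(portfolio_acceptable(loans))  # True

# Nine low-risk hazards do not excuse one high-risk hazard.
hazards = {f"hazard_{i}": 1 for i in range(9)}
hazards["hazard_9"] = 50
print(safety_acceptable(hazards, tolerable_risk=5))  # False
```

The design point is the difference between `sum` and `all`: the financial check aggregates, while the safety check looks at each risk individually.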

Also, what kind of evidence does the standard produce? Because in safety, we are very often working in a legal framework that requires us to do certain things. It requires us to achieve a certain level of safety and prove that we have done so. So, we need certain kinds of evidence. In different jurisdictions and different industries, some evidence is acceptable and some is not. You need to know which is which for your area. And then finally, let’s think about the pros and cons of the standard: what does it do well, and what does it do not so well?

System Safety Pedigree

We’re going to look at a standard called Military Standard 882E. This standard was first developed several decades ago. It was created by the US government and military to help them bring into service complex military equipment that was always on the cutting edge, pushing the limits of what you can achieve in performance.

That’s a lot of complexity: lots of critical weapon systems, and so forth. So they needed something that could cope with all that complexity. It’s a system safety engineering standard. It’s used by engineers, but also by many other specialists. As I said, it’s got a background in military systems. These days you find these principles used pretty much everywhere. So, all the approaches to System Safety that 882 introduced are in other standards. They are also used in other countries.

It addresses risks to people, equipment, and the environment, as we heard earlier. And because it’s an American standard, it’s about system safety. It’s very much about identifying requirements. What do we need to happen to get safety? To do that, it produces lots of requirements. It performs analyses of all those requirements and generates further requirements. And it produces requirements for test evidence. We then need to fulfill these requirements. It’s got several important advantages and disadvantages. We’re going to discuss these in the next few slides…

This is Module 3 of SSRAP

‘Understanding Your Risk Assessment Standard’ is Module 3 of the System Safety Risk Assessment Program (SSRAP) Course. Risk Analysis Programs – Design a System Safety Program for any system in any application.

The full course comprises 15 lessons and 1.5 hours of video content, plus resources. It’s on pre-sale at HALF PRICE until September 1st, 2024. Check out all the free preview videos here and order using the coupon “Pre-order-Half-Price-SSRAP”. But don’t leave it too long because there are only 100 half-price courses available!

Meet the Author

Learn safety engineering with me, an industry professional with 25 years of experience. I have:

• Worked on aircraft, ships, submarines, ATMS, trains, and software;
• Worked on programs from tiny to some of the biggest (Eurofighter, Future Submarine);
• Worked in the UK and Australia, on US and European programs;
• Taught safety to hundreds of people in the classroom, and thousands online; and
• Presented on safety topics at several international conferences.

Categories
Blog Safety Analysis

Hazard and Risk Basics

What are the Hazard and Risk basics? So, what is this risk analysis stuff all about? What is ‘risk’? How do you define or describe it? How do you measure it? When? Why? Who…?

In this free session, I explain the basic terms and show how they link together, and how we can break them down to perform risk analysis. I understand hazards and risks because I’ve been analyzing them for a long time. Moreover, I’ve done this for aircraft, ships, submarines, sensors, command-and-control systems, and lots of software!

Everyone does it slightly differently, but my 25+ years of diverse experience lets me focus on the basics. That allows me to explain it in simple terms. I’ve unpacked the jargon and focused on what’s important.

    Recap: Risk Basics

    Topics: Hazard and Risk Basics

    • Risk & Mishap;
    • Probability & Severity;
    • Hazard & Causal Factor;
    • Mishap (accident) sequence; and
    • Hazards: Tests & Example

    Transcript: Hazard and Risk Basics

    Let’s get started with Module One. We’re going to recap some Risk basics to make sure that we have a common understanding of risk. And that’s important because risk analysis is something that we do every day: every time you cross the road, buy something expensive, or decide whether to travel somewhere or look something up online instead.

    You’re making risk analysis decisions all the time without even realizing it. But we need something a little bit more formal than the instinctive thinking about risk that we do all the time. And to help us do that, we need a couple of definitions to get us started.

    What is Risk?

    First of all, what is Risk? It’s a combination of two things. First, the severity of a mishap or accident. Second, the probability that that mishap will occur. So it’s a combination of severity and probability. We will see that illustrated in the next slide.

    We’ll begin by talking about ‘mishap’. Well, what is a mishap? A mishap is an event – or a series of events – resulting in unintentional harm. This harm could be death, injury, occupational illness, damage to or loss of equipment or property, or damage to the environment.

    The particular standard we’re looking at today covers a range of different harms. That’s why we’re focused on safety. And the term ‘mishap’ will also include negative environmental impacts from planned events. So, even if the cause is a deliberate event, we will include that as a mishap.

    Probability and Severity

    I said that the definition of risk was a combination of probability and severity. Here we’ve got a little illustration of that…
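As a rough illustration of how severity and probability combine, here is a minimal Python sketch of a risk matrix lookup. The category names, level letters, and matrix cells reflect my reading of the severity, probability, and risk assessment tables in Mil-Std-882E; treat them as assumptions and check them against the standard before relying on them. The function name `risk_level` is hypothetical.

```python
# Severity categories and probability levels (assumed from Mil-Std-882E;
# verify against the standard itself).
SEVERITY = {1: "Catastrophic", 2: "Critical", 3: "Marginal", 4: "Negligible"}
PROBABILITY = {
    "A": "Frequent", "B": "Probable", "C": "Occasional",
    "D": "Remote", "E": "Improbable", "F": "Eliminated",
}

# Rows are probability levels; columns are severity categories 1 to 4.
MATRIX = {
    "A": ["High", "High", "Serious", "Medium"],
    "B": ["High", "High", "Serious", "Medium"],
    "C": ["High", "Serious", "Medium", "Low"],
    "D": ["Serious", "Medium", "Medium", "Low"],
    "E": ["Medium", "Medium", "Medium", "Low"],
}


def risk_level(severity, probability):
    """Combine a severity category (1-4) and a probability level (A-F)."""
    if severity not in SEVERITY:
        raise ValueError(f"unknown severity category: {severity!r}")
    if probability not in PROBABILITY:
        raise ValueError(f"unknown probability level: {probability!r}")
    if probability == "F":
        # The hazard has been eliminated, so severity no longer matters.
        return "Eliminated"
    return MATRIX[probability][severity - 1]


print(risk_level(1, "D"))  # Catastrophic but Remote
print(risk_level(4, "A"))  # Negligible but Frequent
```

Note that neither input alone decides the result: a catastrophic but remote mishap and a negligible but frequent one land in different cells of the matrix.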

    This is Module 1 of SSRAP

    This is Module 1 from the System Safety Risk Assessment Program (SSRAP) Course. Risk Analysis Programs – Design a System Safety Program for any system in any application.



    Categories
    Mil-Std-882E Safety Analysis

    System of Systems Hazard Analysis

    In this full-length (38-minute) session, The Safety Artisan looks at System of Systems Hazard Analysis, or SoSHA, which is Task 209 in Mil-Std-882E. SoSHA analyses collections of systems, which are often put together to create a new capability, which is enabled by human brokering between the different systems. We explore the aim, description, and contracting requirements of this Task, and an extended example to illustrate SoSHA. (We refer to other lessons for special techniques for Human Factors analysis.)

    This is the seven-minute demo version of the full 38-minute video.

    System of Systems Hazard Analysis: Topics

    • System of Systems (SoS) HA Purpose;
    • Task Description (2 slides);
    • Documentation (2 slides);
    • Contracting (2 slides);
    • Example (7 slides); and
    • Summary.

    Transcript: System of Systems Hazard Analysis

    Introduction

    Hello everyone and welcome to the Safety Artisan. I’m Simon and today we’re going to be talking about System of Systems Hazard Analysis – a bit of a mouthful that. What does it actually mean? Well, we shall see.

    System of Systems Hazard Analysis

    So, for System of Systems Hazard Analysis, we’re using Task 209, taken from Military Standard 882E, as the description of what to do. But to be honest, it doesn’t really matter whether you’re doing a military system or a civil system, whatever it might be – if you’ve got a system of systems, then this will help you to do it.

    Topics for this Session

    So, we look at the purpose of system of systems hazard analysis. By the way, if you’re wondering what a system of systems is, what I’m talking about is when we take different things that we’ve developed elsewhere, e.g. platforms, electronic systems, whatever it might be, and we put them together. Usually, it must be said, with humans gluing the system together somewhere to make it all tick and fit together.

    Then we want this collection of systems to do something new, to give us some new capability, which we didn’t have before. So, that’s what I’m talking about when I say system of systems. I’ll show you an example – it’s the best way.

    We’ve got a couple of slides on task description, a couple of slides on documentation, and a couple of slides on contracting. Task 209 has a very short task description, and therefore I’ve decided to go through an example. So, we’ve got seven slides of an example, based on a system of systems safety case and safety case report that I wrote. Hopefully, that will illustrate the task far better than just reading out the description. It will also show some issues that can emerge with systems of systems, and I’ll summarize those at the end.

    SOSHA Purpose

    So, let’s get on. I’m going to call it the SOSHA for short: System of Systems Hazard Analysis. The purpose of the SOSHA, Task 209, is to perform and document the analysis of the system of systems and identify unique system of systems hazards: things we don’t get from each system in isolation. This task is going to produce special requirements to deal with these hazards, which would not exist until we put the things together and start using them for something new that we’ve not done before…
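One way to picture "unique system of systems hazards" is as the hazards that appear in no individual system's hazard log. Here is a minimal Python sketch of that idea. The system names, the hazards, and the function `unique_sos_hazards` are hypothetical and only there for illustration; Task 209 does not prescribe this method.

```python
def unique_sos_hazards(system_hazard_logs, sos_hazards):
    """Return SoS hazards that appear in no individual system's hazard log."""
    known = set()
    for hazards in system_hazard_logs.values():
        known.update(hazards)
    return sorted(set(sos_hazards) - known)


# Hazards already recorded for each system in isolation (made-up examples).
logs = {
    "platform": {"fire", "collision"},
    "sensor": {"radiation exposure"},
}

# Hazards found when analysing the combined system of systems.
sos = {"fire", "collision", "operator misreads relayed sensor data"}

print(unique_sos_hazards(logs, sos))
# ['operator misreads relayed sensor data']
```

In this toy example the only new hazard sits at the human interface between the systems, which is where the transcript suggests many system of systems issues emerge.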

    see the full transcript here.

    End: System of Systems Hazard Analysis

    So, that is the end of the presentation and it just remains for me to say thanks very much for watching and listening. It’s been good to spend some time with you and I look forward to talking to you next time about environmental analysis, which is Task 210 in the military standard … until then, goodbye.


    Categories
    Mil-Std-882E Safety Analysis

    Health Hazard Analysis

    In this full-length (55-minute) session, The Safety Artisan looks at Health Hazard Analysis, or HHA, which is Task 207 in Mil-Std-882E. I explore the aim, description, and contracting requirements of this complex Task. It covers: physical, chemical & biological hazards; Hazardous Materials (HAZMAT); ergonomics, aka Human Factors; the Operational Environment; and ionizing and non-ionizing radiation. I will outline how to implement Task 207 in compliance with Australian WHS. (See also other lessons for specific tools and techniques, such as Human Factors analysis methods.)

    This is the seven-minute-long demo. The full version is a 55-minute-long whopper!

    Health Hazard Analysis: Topics

    • Task 207 Purpose;
    • Task Description;
    • ‘A Health Hazard is…’;
    • ‘HHA Shall provide Information…’;
    • HAZMAT;
    • Ergonomics;
    • Operating Environment;
    • Radiation; and
    • Commentary.

    Health Hazard Analysis: Transcript

    Introduction

    Hello, everyone, and welcome to the Safety Artisan. I’m Simon, your host, and today we are talking about health hazard analysis.

    Task 207: Health Hazard Analysis

    This is Task 207 in the Mil. Standard 882E approach, which is targeted at defense systems, but you will see it used elsewhere. The principles that we’re going to talk about today are widely applicable. So, you could use this standard for other things if you wish.

    Topics for this Session

    We’ve got a big session today so I’m going to plough straight on. We’re going to cover the purpose of the task; and the description; the task helpfully defines what a health hazard is; and says what health hazard analysis, or HHA, shall provide in terms of information. We talk about three specialist subjects – hazardous materials or hazmat, ergonomics, and operating environment. Also, radiation is covered, as another specialist area. Then we’ll have some commentary from myself.

    Now, the requirements of the standard for this task are so extensive that, for the first time, I won’t be quoting all of them word for word. I’ve actually had to chop out some material, but I’ll explain that when we come to it. We can work with that, but it is quite a demanding task, as we’ll see.

    Task Purpose

    Let’s look at the task purpose. We are to perform and document a health hazard analysis to identify human health hazards; to evaluate, as it says, materials and processes using materials, etc., that might cause harm to people; and to propose measures to eliminate the hazards or reduce the associated risks. In many respects, it’s a standard 882-type approach. We’re going to do all the usual things. However, as we shall see, we’re going to do quite a lot more on this one.

    Task Description #1

    So, task description. We need to evaluate the potential effects resulting from exposure to hazards, and this is something I will come back to again and again. It’s very easy in this area, particularly with hazardous materials, to get hung up on every tiny amount of potentially hazardous material that is in the system or in a particular environment, and I’ve seen this done to death so many times. I’ve seen it overdone in the UK when COSHH, the Control of Substances Hazardous to Health regulations, came in in the military. We went bonkers about this. We did risk assessments up the yin-yang for stuff that we just did not need to worry about. Stuff that was in every office up and down the land. So, we need to be sensible about doing this, and I’ll keep coming back to that.
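One hedged way to keep the effort proportionate is to screen materials by how close the estimated exposure comes to an exposure limit, and to spend analysis time on the ones that matter. The Python sketch below is only an illustration of that idea. The material names, the exposure and limit values, and the 10% screening fraction are all invented assumptions, not figures from Task 207 or from any regulation.

```python
def screen_exposures(exposures, limits, screening_fraction=0.1):
    """Return (material, ratio) pairs at or above the screening fraction.

    The ratio is estimated exposure divided by the exposure limit, in the
    same units. Results are sorted with the highest ratio first.
    """
    flagged = []
    for material, exposure in exposures.items():
        ratio = exposure / limits[material]
        if ratio >= screening_fraction:
            flagged.append((material, round(ratio, 2)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical estimated exposures and limits (same units per material).
exposures = {"solvent_x": 40.0, "toner_dust": 0.01, "fuel_vapour": 12.0}
limits = {"solvent_x": 50.0, "toner_dust": 5.0, "fuel_vapour": 100.0}

print(screen_exposures(exposures, limits))
# [('solvent_x', 0.8), ('fuel_vapour', 0.12)]
```

The office toner, at a tiny fraction of its limit, drops out of the detailed analysis, while the solvent goes to the top of the list. Anything screened out would still need to be recorded with a justification.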

    So, we need to do as it says: identification, assessment, characterization, control, and communication of hazards in the workplace environment. We need to follow a systems approach, considering “What’s the total impact of all these potential stressors on the human operator or maintainer?” Again, I come from a maintenance background. The operator often gets lots of attention because, if the operator stuffs up, you very often end up with a very nasty accident where lots of people get hurt. So, that’s a legitimate focus for a human operator of a system.

    But also, in a lot of organizations, the executive management tend to be operators because that’s how the organization evolves. So, sometimes you can have an emphasis on operations, while maintenance, support, and other things get ignored because they’re not sexy enough to the senior management. That’s a bad reason for not looking at stuff. We need to think about the big picture, not just the people who are in control…

    get the full transcript here.

    End: Health Hazard Analysis

    So, that is the end of the session. Thank you very much for listening. And all that remains for me to say is thanks very much for supporting the work of the Safety Artisan and tuning into this video. And I wish you every success in your work now and in the future. Goodbye.


    Categories
    Blog Safety Analysis

    Preliminary Hazard Identification & Analysis Guide: Free

    Get the Preliminary Hazard Identification & Analysis Guide for free! It’s a 50-page .pdf download, collated from reliable sources.

    Contents:

    • Introduction
    • Aim
    • Description
    • Method
    • Guidance
    • Inspect the Workplace
    • How to find hazards
    • Review available information
    • Consult Your Workers
    • When to Consult with Workers
    • Hazard Checklists
    • Functional Safety Analysis
    • FMEA/FMECA
    • SWIFT
    • HAZOP

    Preliminary Hazard Identification & Analysis Guide – Introduction

    Hazard Identification has been defined as: “The process of identifying and listing the hazards and accidents associated with a system.”

    Hazard Analysis has been defined as: “The process of describing in detail the hazards and accidents associated with a system and defining accident sequences.”

    Preliminary Hazard Identification and Analysis (PHIA) is intended to help you determine the scope of the safety activities and requirements. It identifies the main hazards likely to arise from the capability and functionality being provided. It is carried out as early as possible in the project life cycle, providing an important early input to setting Safety requirements and refining the Project Safety Plan.

    PHIA seeks to answer, at an early stage of the project, the question: “What Hazards and Accidents might affect this system and how could they happen?”

    Aim

    The aim of the PHIA is to identify, as early as possible, the main Hazards and Accidents that may arise during the life of the system. It provides input to:

    1. Scoping the subsequent Safety activities required in any Safety Plan. A successful PHIA will help to gauge the effort that is likely to be required to produce an effective Safety Case, proportionate to risks.
    2. Selecting or eliminating options for subsequent assessment.
    3. Setting the initial Safety requirements and criteria.
    4. Subsequent Hazard Analyses.
    5. Initiating the Hazard Log.
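To show what initiating a Hazard Log from PHIA findings might look like, here is a minimal Python sketch. The `HazardLogEntry` fields, the `status` values, and the example hazards are hypothetical; real hazard logs usually hold far more (causes, controls, owners, risk classification), and the Guide does not prescribe this structure.

```python
from dataclasses import dataclass, field


@dataclass
class HazardLogEntry:
    """One preliminary entry: a hazard and the accidents it might lead to."""
    hazard: str
    accidents: list = field(default_factory=list)
    status: str = "open"  # hypothetical status field


def initiate_hazard_log(findings):
    """Build the initial log from a {hazard: [accidents]} mapping."""
    return [
        HazardLogEntry(hazard, list(accidents))
        for hazard, accidents in sorted(findings.items())
    ]


# Made-up preliminary findings, as might come out of a PHIA workshop.
findings = {
    "unguarded rotating shaft": ["entanglement injury"],
    "fuel leak near hot surface": ["fire", "burn injury"],
}

log = initiate_hazard_log(findings)
for entry in log:
    print(entry.hazard, "->", ", ".join(entry.accidents), f"({entry.status})")
```

Each entry pairs a hazard with its possible accidents, which mirrors the PHIA question: what Hazards and Accidents might affect this system and how could they happen?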

    Did You Know?

    You can also get the Guide with the PHIA Triple Lesson Bundle.


    Categories
    Mil-Std-882E Safety Analysis

    Operating & Support Hazard Analysis

    In this full-length session, I look at Operating & Support Hazard Analysis, or O&SHA, which is Task 206 in Mil-Std-882E. I explore Task 206’s aim, description, scope, and contracting requirements.

    There’s value-adding commentary, which explains O&SHA: how to use it with other tasks; how to apply it effectively on different products; and some of the pitfalls to avoid. This is based on my 25 years in system safety and my background in operations and maintenance.

    I also refer to other lessons for specific tools and techniques, such as Human Factors analysis methods.

    This is the seven-minute-long demo. The full version is about 35 minutes long.

    Operating & Support Hazard Analysis: Topics

    • Task 206 Purpose:
      • To identify and assess hazards introduced by O&S activities and procedures;
      • To evaluate the adequacy of O&S procedures, facilities, processes, and equipment used to mitigate risks associated with identified hazards.
    • Task Description (six slides);
    • Reporting (two slides);
    • Contracting (two slides); and
    • Commentary (four slides).

    Operating & Support Hazard Analysis: Transcript

    Introduction

    Hello everyone and welcome to the Safety Artisan; home of safety engineering training. I’m Simon and today we’re going to be carrying on with our series on Mil. Standard 882E system safety engineering.

    Operating & Support Hazard Analysis

    Today, we’re going to be moving on to the subject of operating and support hazard analysis. This is, as it says, Task 206 under the standard. Operating and support hazard analysis can be abbreviated to O&S, OSHA, or O&SHA. Unfortunately, it will confuse people if I call it OSHA, so let’s call it O&S.

    Topics for this Session

    The purpose of O&S hazard analysis is to identify and assess hazards introduced by those activities and procedures and to evaluate the adequacy of O&S procedures, processes, equipment, facilities, etc., to mitigate risks that have already been identified. A twofold task, but a very big one. And as we’ll see, we’ve got lots of slides today on task description, reporting, contracting, and commentary. As always, I present the full text of the task as is, which is copyright free, but I’m only going to talk about the things that are important. So, we’re not going to go through every little clause of the standard; that would be pointless.

    O&S Hazard Analysis (T206)

    Let’s get started with the purpose. As we’ve already said, it’s to identify and assess those hazards which are introduced by operational and support activities and procedures, and to evaluate the adequacy of those procedures. So, we’re looking at operating the system, whatever it may be. And of course, this is a military standard, so we assume a military system, but not all military systems are weapon systems by any means. Not all are physical systems.

    There may be inventory management systems, management information systems, all kinds of stuff. So, does operating those systems, or supporting them, maintaining them, resupplying them, disposing of them, etc., create or introduce any hazards? And how do we mitigate them? That’s the purpose of the task.

    Task Description (T206) #1

    Let’s move on to the task description. Again, we’re assuming a contractor is performing the analysis, but that’s not necessarily the case. The task description says that this typically begins during engineering and manufacturing development, or EMD. So, we’re assuming an American-style lifecycle for a big system, and EMD comes after concept and requirements development. So, we are beginning to move into the very expensive stage of development for a system, where we begin to commit serious money.

    It’s suggesting that O&SHA can wait until then, which is fine in general unless you’ve identified any particularly novel hazards that will need to be dealt with earlier on. As it says, it should build on design hazard analyses, but we’ll also talk later on about the case when there are no design hazard analyses. And the O&SHA shall identify requirements or alternatives for eliminating hazards, mitigating risks, etc. This is one of those tasks where the human is very important, in fact dominant, to be honest: both as a source of hazards and as the potential victim of the associated risks. There’s a lot of human-centric stuff going on here.

    Task Description (T206) #2

    As always, we’re going to think about the system configurations. We’re going to think about what we’re going to do with the system and the environment that we’re going to do it in. So, a familiar triad and I know I keep banging on about this, but this really is fundamental to bounding and therefore evaluating safety. We’ve got to know what the system is, what we’re doing with it, and the environment in which we’re doing it. Let’s move on…
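The triad of system configuration, what we do with the system, and the environment can be treated as the boundary of a safety assessment. As a minimal sketch of that idea in Python, the example below checks whether a proposed use falls inside what has already been assessed. The `Scope` type, the `uncovered` function, and all the example values are hypothetical; they are not part of Task 206.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """The triad that bounds a safety assessment."""
    configuration: str  # what the system is
    activity: str       # what we are doing with it
    environment: str    # where we are doing it


def uncovered(proposed, assessed):
    """Return proposed scopes that no existing assessment covers."""
    return [scope for scope in proposed if scope not in assessed]


# Scopes that the existing hazard analysis has evaluated (made-up values).
assessed = {
    Scope("baseline build", "routine operation", "temperate, on land"),
    Scope("baseline build", "scheduled maintenance", "workshop"),
}

# Uses that someone now wants to make of the system.
proposed = [
    Scope("baseline build", "routine operation", "temperate, on land"),
    Scope("baseline build", "routine operation", "tropical, at sea"),
]

gaps = uncovered(proposed, assessed)
for scope in gaps:
    print("Not yet assessed:", scope)
```

Changing any one element of the triad, here only the environment, takes the use outside the assessed boundary, so the existing safety evidence cannot simply be assumed to apply.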

    Click here to see the full transcript.

    End: Operating & Support Hazard Analysis

    So, that is the end of the lesson and it just remains for me to say thank you very much for your time and for listening. And I look forward to seeing you again soon. Cheers.


    Categories
    Mil-Std-882E Safety Analysis System Safety

    System Requirements Hazard Analysis

    In this 45-minute session, I’m looking at System Requirements Hazard Analysis, or SRHA, which is Task 203 in the Mil-Std-882E standard. I will explore Task 203’s aim, description, scope, and contracting requirements. SRHA is an important and complex task, which must be done on several levels to succeed. This video explains the issues and discusses how to perform SRHA well.

    This is the seven-minute demo video; the full version is 40 minutes long.

    Topics: System Requirements Hazard Analysis

    • Task 203 Purpose;
    • Task Description:
      • Determine Requirements;
      • Incorporate Requirements; and
      • Assess the compliance of the System.
    • Contracting;
    • Section 4.2 (of the standard); and
    • Commentary.

    Transcript

    Introduction

    Hello and welcome to the Safety Artisan, where you will find professional, pragmatic, and impartial advice on all things system safety and related.

    System Requirements Hazard Analysis

    Today, we’re talking about system requirements hazard analysis, which is Task 203. This is part of our series on Mil. Standard 882E, a very widely used system safety engineering standard. Its influence is found in many places, not just in military procurement programs.

    Topics for this Session

    We’re looking at this task, which is very important, possibly the most important task of all, as we’ll see. I’m talking about the purpose of the task, which is word-for-word from the task description itself.

    We’re talking about in the task description, the three aims of this task, which is to determine or work out requirements, incorporate them, and then assess the compliance of the system with those requirements, because, of course, it may not be a simple read-across. We’ve got six slides on that. That’s most of the task.

    Then we’ve just got one slide on contracting, which if you’ve seen any of the others in this series, will seem very familiar. We’ve got a bit of a chat about Section 4.2 from the standard and some commentary, and the reason for that will become clear. Let’s crack on!

    System Requirements Hazard Analysis

    Task 203.1, the purpose of Task 203 is to perform and document a System Requirements Hazard Analysis or SRHA. And as we’ve already said, the purpose of this is to determine the design requirements. We’re going to focus on design rather than buying stuff off the shelf – we’ll talk about the implications of that a little bit later.

    Design requirements to eliminate or reduce hazards and risks; to incorporate those requirements into, as it says, the documentation, but what it should say is incorporate risk reduction measures into the system itself and then document it.

    Finally, to assess compliance of the system with these requirements. Then it says the SRHA addresses all life-cycle phases, so you’re not meant to think about only certain phases of the program. What are the requirements through life for the system? And in all modes: whether it’s in operation, in maintenance or refit, being repaired or disposed of, whatever it might be.

    Task Description #1

    The first of six slides is the task description. I’m using more than one colour because there are quite a lot of important points packed tightly together in this description.

    We’re assuming that the contractor performs and documents this SRHA. The customer needs to do a lot of work here before it ever gets near a contractor. More on that later. We need to determine system design requirements to eliminate hazards or reduce associated risks.

    We do this in two ways: by identifying applicable policies, regulations, standards, etc. (more on that later), and by analyzing identified hazards. So, we need requirements to perform the analysis as well as requirements that simply state ‘We want a system to do this and not to do that’. We need to put in some requirements to say ‘Here’s what we want analyzed, and maybe to what degree’. Saying why is always helpful.

    Task Description #2

    Breaking those two requirements down.

    Part a. We identify applicable requirements by reviewing military and industry standards and specs, and historical documentation of similar systems or, perhaps, of a system that we’re replacing. It’s assumed that the US Department of Defense is the ultimate customer. So, we review the ultimate customer’s requirements, including whatever they’ve said about standard ways of mitigating certain common risks.

    The system performance spec, that’s your functional performance spec or whatever you want to call it. Other system design requirements and documents – a bit of a catchall there. And applicable federal, military, state, and local regulations.

    This is a US standard. The US is a federated state, much like Australia and lots of modern states, even the UK. There are variations in law across England, Wales, Scotland and Northern Ireland. The differences are not great, but they do exist.

    And in the US and Australia, those differences are greater. And it says applicable executive orders and international agreements. Executive orders are not law, but they are what the executive arm of the U.S. government has issued. There are a lot of words in there – have a look at the different statements that are in white, blue, and yellow.

    Basically, from international agreements right down to whatever requirements may be applicable, they all need to be looked at and accounted for. So, there’s a huge amount of work there for someone to do. I’ll come back to who that someone should be later.

    End: System Requirements Hazard Analysis

    You can find a free pdf of the System Safety Engineering Standard, Mil-Std-882E, here.

    Meet the Author

    Learn safety engineering with me, an industry professional with 25 years of experience, I have:

    •Worked on aircraft, ships, submarines, ATMS, trains, and software;

    •Tiny programs to some of the biggest (Eurofighter, Future Submarine);

    •In the UK and Australia, on US and European programs;

    •Taught safety to hundreds of people in the classroom, and thousands online;

    •Presented on safety topics at several international conferences.


    Failure Mode Effects Analysis

    TL;DR This article on Failure Mode Effects Analysis explains this powerful and commonly used family of techniques. You can access this webinar (and all the others) here.

    I have used FMEA and related techniques on many programs and they can produce powerful results quickly and cheaply. Recently, I’ve seen some criticism of FMEA on social media. However, I’m convinced that this is only clickbait. The secret of success is to understand what a technique is good for – and not – and to apply it well. It’s as simple as that!

    This article covers:

    • A description of the technique, including its purpose;
    • When it might be used;
    • Advantages, disadvantages and limitations;
    • Sources of additional information;
    • A simple example of an FMEA/FMECA; and
    • Additional comments.

    I’ve added some ‘top tips’ of my own based on my personal experience in the industry.

    Top Tip

    In this article, I have used material from a UK Ministry of Defence guide, reproduced under the terms of the UK’s Open Government Licence.

    A Description of the Technique, Including Its Purpose

    Failure modes and effects analysis (FMEA) was one of the first systematic techniques for failure analysis. It was developed in the United States military (Military Procedure MIL-P-1629, titled ‘Procedures for Performing a Failure Modes, Effects and Criticality Analysis’, November 9, 1949) as a reliability evaluation technique to determine the effect of system and equipment failures. Failures were classified according to their impact on mission success and personnel, equipment, and safety. In the 1960s it was used by the aerospace industry and NASA during the Apollo program. More and more industries – notably the automotive industry – have seen the benefits to be gained by using FMEAs to complement their design processes.

    This qualitative technique helps identify failure potential in a design or process i.e. to foresee failure before it actually happens. This is done by defining the system that is under consideration to ensure system boundaries are established and then by following a procedure, which helps to identify design features or process operations that could fail. The procedure requires the following essential questions to be asked:

    • How can each component fail?
    • What might cause these modes of failure?
    • What could the effects be if these failures did occur?
    • How serious are these failure modes?
    • How is each failure mode detected?
    • What are the safeguards in place to protect against accidents resulting from the failure mode?

    As always with safety analyses, the more precisely you can answer these questions (above), the better the results you will get.

    Top Tip

    As an aid in structuring the analysis and ensuring a systematic approach, results are recorded in a tabular format. Several different forms are in use, and the form design can be tailor-made to suit the particular requirements of a study. Examples of forms can be found in several standards (links below).

    Make the form support the flow of the process, left-to-right, then top-down!

    Top Tip
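    As a minimal sketch of such a form, the six questions above can become the columns of a worksheet row, read left to right. The `FmeaRow` class, its field names and the valve entry are illustrative assumptions, not taken from any standard's form.

```python
import csv
import io
from dataclasses import dataclass, fields, asdict


@dataclass
class FmeaRow:
    """One worksheet row; columns follow the order of the six questions."""
    component: str
    failure_mode: str   # How can the component fail?
    cause: str          # What might cause this mode of failure?
    effect: str         # What could the effects be?
    severity: str       # How serious is the failure mode?
    detection: str      # How is the failure mode detected?
    safeguards: str     # What protects against a resulting accident?


def to_csv(rows):
    """Render rows as CSV text, one failure mode per line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(FmeaRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Hypothetical example entry, for illustration only.
rows = [
    FmeaRow(
        component="Inlet valve",
        failure_mode="Fails closed",
        cause="Actuator loses power",
        effect="No flow to tank",
        severity="Degraded",
        detection="Low-flow alarm",
        safeguards="Manual bypass valve",
    )
]
table = to_csv(rows)
print(table)
```

    Keeping one failure mode per row means the same structure can later be sorted, filtered or exported to a spreadsheet.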

    The FMEA analysis can be extended if necessary by characterizing the likelihood, severity, and resulting levels of risk of failures. FMEAs that incorporate this criticality analysis (CA) are known as FMECAs. A FMECA is an analytical quantitative technique, which ranks failure modes according to their probability and consequences (i.e. the resulting effect of the failure mode on the system, mission, and personnel). It is referred to as a “bottom-up approach” as it starts by identifying the potential failure modes of a component and analyzing their effects on the whole system. It can be quite complex depending on how the user drives the technique.
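    The ranking step described above can be sketched in a few lines. This is only an illustration: the 1 to 5 scales and the use of a simple product as the risk index are assumptions, and a real FMECA would use the scales and criticality method defined in the applicable standard.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    likelihood: int  # assumed scale: 1 (remote) to 5 (frequent)
    severity: int    # assumed scale: 1 (negligible) to 5 (catastrophic)

    @property
    def risk(self) -> int:
        # Assumed risk index: simple product of the two scores.
        return self.likelihood * self.severity


def rank(modes):
    """Return failure modes sorted from highest to lowest risk index."""
    return sorted(modes, key=lambda m: m.risk, reverse=True)


# Hypothetical failure modes, for illustration only.
modes = [
    FailureMode("Pump seal leak", likelihood=4, severity=2),
    FailureMode("Valve fails closed", likelihood=2, severity=5),
    FailureMode("Sensor drift", likelihood=3, severity=1),
]
ranked = rank(modes)
for m in ranked:
    print(f"{m.name}: risk index {m.risk}")
```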

    It is important to note that the FMECA does not provide a model by which system reliability can be quantified. Hence, if the objective is to estimate the probability of events, a technique that results in a logic model of the failure mechanisms must be employed, typically a fault tree and/or an event tree.

    Reliability Block Diagrams, or for repairable systems, Markov Chains can also be used.

    Top Tip
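    To illustrate the kind of logic model mentioned above, here is a minimal fault-tree sketch. It assumes the basic events are statistically independent, and the event names and probabilities are hypothetical.

```python
from math import prod


def and_gate(probabilities):
    """Output event occurs only if every input event occurs."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if at least one input event occurs."""
    return 1 - prod(1 - p for p in probabilities)


# Hypothetical per-demand failure probabilities.
pump_a = 0.01
pump_b = 0.01
power_supply = 0.001

# Top event: loss of flow = (both pumps fail) OR (power supply fails).
both_pumps_fail = and_gate([pump_a, pump_b])
loss_of_flow = or_gate([both_pumps_fail, power_supply])
print(f"P(loss of flow) = {loss_of_flow:.6f}")
```

    Unlike the FMEA table, this model combines failures, which is what allows the probability of the top event to be estimated.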

    A FMEA or FMECA can be conducted on either a component or a functional level. A component-level FMEA/FMECA only covers hardware aspects, but a functional FMEA/FMECA can cover all aspects of a system. For either approach, the general principle remains the same.

    When it Might be Used

    FMEA is applicable for any well-defined system but is primarily used for reviews of mechanical and electrical systems. It can be used in many situations, for example, to assess the design of a product in terms of what could go wrong in manufacturing and in-service as a result of the weakness in the design. It can also be used to analyze failures in the manufacturing process itself and during service. It is effective for collecting information needed to troubleshoot system problems and improving maintenance and reliability of plant and equipment (defining and optimizing) as it focuses directly and individually on equipment failure modes.

    It’s fair to say that you need a design on which to perform a FMEA. Pre-design, you could use Functional Failure Analysis (FFA) instead.

    Top Tip

    The FMECA technique is best suited for detailed analysis of system hardware, and should preferably be carried out by the designer in parallel with system development. This will not only speed up the analysis itself, but also force the design team to think systematically about the failure characteristics of the system. The primary use of the FMECA is in verifying that single component failures cannot cause a catastrophic system failure.
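    The single-failure check described above can be sketched as a simple screen over the worksheet. The tuple layout, the severity label and the entries are hypothetical, and the screen only looks at whether any safeguard is recorded, not at how effective it is.

```python
# Each entry: (component, failure mode, worst system-level effect, safeguards)
# Hypothetical worksheet extract, for illustration only.
worksheet = [
    ("Pump A", "Stops", "Degraded", ["Standby pump B"]),
    ("Relief valve", "Stuck shut", "Catastrophic", []),
    ("Level sensor", "Reads low", "Catastrophic", ["Independent high-level trip"]),
]


def single_point_failures(rows, worst="Catastrophic"):
    """Return rows where one failure alone leads to the worst outcome
    and no safeguard is recorded against it."""
    return [row for row in rows if row[2] == worst and not row[3]]


flagged = single_point_failures(worksheet)
for component, mode, effect, _ in flagged:
    print(f"Single-point failure: {component} ({mode}) -> {effect}")
```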

    There are a number of areas today in which the use of FMECA has become mandatory to demonstrate system reliability. Examples of such requirements are in the classification of Dynamically Positioned (DP) vessels and in a number of US military applications for which MIL-STD documents apply.

    Advantages, Disadvantages, and Limitations

    Advantages

    • It is widely-used and well-understood, and easy to understand and interpret
    • It can be performed by a single analyst, or more if required
    • Qualitative data about the causes and effects can be incorporated into the analysis
    • It is systematic and comprehensive, and should identify hazards with an electrical or mechanical basis
    • The level of detail incorporated can be varied to suit the analysis
    • It identifies safety-critical equipment where a single failure would be critical for the system
    • Even though the technique can be quite time consuming it can lead to a thorough understanding of the system being considered

    Disadvantages

    • The technique adopts a bottom-up approach and if conducting a component level FMEA or FMECA this can be boring and repetitive
    • The benefit gained is dependent upon the experience of the analyst or the group.
    • It requires a hierarchical system drawing as the basis for the analysis, which the analyst usually has to develop before the FMEA process can start
    • It is optimised for mechanical and electrical equipment, and does not apply easily to Human Factor Integration, procedures or process equipment
    • It is difficult for the technique to cover multiple failures as equipment failures are generally analysed one by one therefore important combinations of equipment failures may be overlooked
    • Most accidents have a significant human or external influence contribution and these are not a usual failure mode with FMEA
    • More than one FMEA may be required for a system with multiple modes of operation
    • Due to its wide use there can be a temptation to read across data from ARM or ILS projects where, for example, the fault-tree technique has been used. As a consequence, the safety perspective can be lost, as human error has been excluded and the focus has been solely on determining faults and not on more far-reaching safety issues
    • Perhaps the worst drawback of the technique is that all component failures are examined and documented, including those that do not have any significant consequences.
    • For large systems, especially those with a fair degree of redundancy built into them, the amount of unnecessary documentation is a major disadvantage. Hence, the FMECA should primarily be used by designers of reasonably simple systems. It should however be noted that the concept of the FMECA form can be quite useful in other contexts, e.g. when reviewing an operation rather than a hardware system. Then the use of a form similar to the FMECA can provide a useful way of documenting the analysis. Suitable columns in the form could for example include: operation, deviation, consequence, correcting or reversing action, etc.

    ARM = Availability, Reliability, Maintainability
    ILS = Integrated Logistic Support (or logistics engineering)

    Top Tip
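    On the point in the list above about multiple failures, a quick count shows why FMEA usually takes failures one at a time. This sketch assumes a hypothetical system of 200 components with one failure mode each; real systems will have more modes per component.

```python
from math import comb


def combination_count(components: int, simultaneous: int) -> int:
    """Number of distinct sets of `simultaneous` failed components."""
    return comb(components, simultaneous)


n = 200  # hypothetical number of components
singles = combination_count(n, 1)
pairs = combination_count(n, 2)
triples = combination_count(n, 3)

print(f"Single failures: {singles}")
print(f"Pairs of failures: {pairs}")
print(f"Triples of failures: {triples}")
```

    The table grows from hundreds of rows to tens of thousands as soon as pairs are considered, which is one reason combinations are usually left to techniques such as fault tree analysis.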

    Sources of Additional Information, such as Standards, Textbooks and Websites

    BS 5760: Part 5 Reliability of Systems, Equipment and Components: Part 5 Guide to Failure Modes, Effects and Criticality Analysis.

    HSE Website – Marine Risk Assessment, Offshore Technology Report 2001/063

    IEC 60812:2018 Failure modes and effects analysis (FMEA and FMECA)

    As always, Understand your Standard (what it was designed to do) to get the best out of it!

    Top Tip

    A Simple Example of an FMEA/FMECA

    An example extract from an FMEA of a ballast system is shown below. This can be found in the HSE Marine Risk Assessment Report. The column headings are based on the US Military Standard Mil-Std 1629A, but with modifications to suit the particular application. For example, the failure mode and cause columns are combined. The criticality of each failure is ranked as minor, incipient, degraded, or critical.

    An example of a FMEA Output Table

    To properly understand these results you need to know how a Sea Chest works (see context here). Otherwise the example just shows what kind of output a FMEA can produce.

    Top Tip

    Additional comments

    Failure Modes and Effects and Criticality Analysis (FMECA) is an analytical QRA technique, used by ARM and ILS systems engineers, most commonly and effectively at the late design, test and manufacture stage of a project. It requires the breakdown of the system into individual components and the identification of possible failure modes or malfunctions of each component (such as too much flow through a valve). Referred to as a bottom-up approach, it starts by identifying the potential failure modes of a component and analyzing their potential effects on the whole system. Numerical levels can be assigned to the likelihood of the failure and the severity or consequence of the failure.

    Note: It is important to recognize that FMEA/FMECA Standards have different approaches to criticality. Failure mode severity classes 1 – 5 for Standards MIL-STD-1629A and ARP926A go from Class 1 being the most severe (e.g. loss of life) to Class 5 being the least severe (i.e. no effect), whereas BS 5760 deals with criticality in the opposite direction where Class 5 is the most severe.

    Note that FMECA for ARM/ILS looks at availability or mission criticality, not safety criticality.  A FMECA for safety will have a different focus.

    Top Tip
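    The note above on severity classes running in opposite directions is easy to trip over when data moves between projects. A minimal sketch of a conversion follows. It assumes the two five-class scales are exact mirrors of each other, which should be checked against the class definitions in the standards themselves; the function name is hypothetical.

```python
def flip_severity_class(severity_class: int, classes: int = 5) -> int:
    """Mirror a class number between a scale where Class 1 is most severe
    and a scale where the highest class number is most severe."""
    if not 1 <= severity_class <= classes:
        raise ValueError(f"class must be between 1 and {classes}")
    return classes + 1 - severity_class


# Class 1 (most severe on one scale) becomes Class 5 on the mirrored scale.
for original in range(1, 6):
    print(f"Class {original} -> Class {flip_severity_class(original)}")
```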

    Software:

    • Isograph;
    • Reliability Work Bench;
    • Reliasoft;
    • Microsoft Excel.

    These are not recommendations!

    FMEA/FMECA tables for complex systems can run to hundreds of pages, so good tool support is essential.

    Top Tip

    Failure Mode Effects Analysis: Have You Used this Technique?



    Five Ways to Identify Hazards

    In my webinar ‘Five Ways to Identify Hazards’ I look at a mix of techniques. We need these diverse techniques to assure us (give justified confidence) that we have identified the full range of hazards associated with a system.

    To do this I draw on my 25 years of experience (see ‘Meet the Author’, below) and relevant standards. Here’s the introduction to the webinar.

    Five Ways to Identify Hazards: Video Introduction

    Webinar: ‘Five Ways to Identify Hazards’

    Four Things to Remember

    For hazard identification, we need to be aware of four things.

    What we’re doing is we are imagining what could go wrong. And I want to emphasize, first of all, imagination. We need to be open to what could happen. That’s the mindset that we need, and we’re looking at what could go wrong, not what will go wrong. Think about possibilities, not certainties.

    The second thing is that it’s very easy to dive down a rabbit hole, get into mega detail about one particular thing and waste lots of time doing that. That’s not what we need to be doing. We need a broad approach. We need to go wide and think about as many different possible hazards as we can. Don’t dive deep. The deep analysis will come later.

    Another aspect of that point is we’re talking about hazard identification. We’re just here to identify hazards. We’re not here to try to assess them yet.

    Yet another mistake that people make is to try and jump straight to fixing the hazard. Many of us watching will be engineers. We love fixing problems. We like to solve problems, but we’re not here to solve the problem yet. We’re only here to identify it. So we’re going to avoid the temptation to jump in and try and come up with a solution. That’s not what we’re doing with hazard identification.

    So those are four things to bear in mind.

    Five Ways to Identify Hazards

    Let’s move on. So I’ve said that this was entitled five ways to identify hazards.

    There are, of course, many ways to identify hazards, but I just thought I’d pick on these five because there was a nice broad range of things and things that I can show you how to do straight away.

    Those are the five things that we’ve got and we’ll have a slide on each one of those. First, we can ask the workers or end users or their representatives. Secondly, we can inspect the workplace, we can look around for hazards. And maybe we’ve got a real workplace that we can look at or maybe we’ve just got a representation, we can do both.

    We can use a hazard identification checklist, we can survey historical data. So all the squiggly lines at the bottom of the screen, there’s an example of some historical data and we can conduct a number of analyses on that.

    But the analysis I picked on (Number 5) is Functional Failure Analysis and we’ll see why in just a moment. So those are the five things that we will cover in the next hour. We’ll also have time for a Question and Answer session and then a worked example of how to do a simple Functional Failure Analysis…

    There’s More!

    This is just one of many webinars in my Safety Engineering Academy. You can see summaries of them all in this blog post.

    Meet the Author

    Learn safety engineering with me, an industry professional with 25 years of experience, I have:

    •Worked on aircraft, ships, submarines, ATMS, trains, and software;

    •Tiny programs to some of the biggest (Eurofighter, Future Submarine);

    •In the UK and Australia, on US and European programs;

    •Taught safety to hundreds of people in the classroom, and thousands online;

    •Presented on safety topics at several international conferences.


    Exploring Causal Analysis: Techniques and Insights

    In this post, ‘Exploring Causal Analysis: Techniques and Insights’, I provide a quick summary of my recent webinar. You can see a short video introduction below, or access the full webinar at my Safety Engineering Academy.

    Introduction:

    Causal analysis is a vital aspect of system safety engineering, offering insights into the root causes of issues and guiding effective problem-solving strategies. In this webinar, we delve into various causal analysis techniques and discuss their practical applications in diverse domains.

    Section 1: Introduction to Causal Analysis

    Causal analysis involves understanding the sequence of events leading to an outcome and identifying the underlying factors contributing to it. We explore the fundamentals of causal analysis and its significance in safety engineering.

    Section 2: Popular Causal Analysis Techniques

    We examine eight popular causal analysis methods, including Pareto charts, Failure Mode and Effects Analysis (FMEA), 5 Whys, Ishikawa diagrams, Fault Tree Analysis (FTA), 8D reporting, DMAIC, and Scatter Diagrams. Each technique is analyzed for its strengths, limitations, and practical utility.

    Section 3: Deeper Dive into Selected Techniques

    We take a closer look at selected causal analysis techniques, exploring their application in real-world scenarios. Examples include using Pareto charts to identify dominant causes, leveraging FMEA for failure mode analysis, and utilizing Fault Tree Analysis for assessing complex system failures.
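    As a small illustration of using a Pareto analysis to find dominant causes, the sketch below counts recorded causes and keeps the most frequent ones until they account for a chosen share of incidents. The incident log and the 80% threshold are hypothetical.

```python
from collections import Counter


def pareto(causes, threshold=0.8):
    """Return (cause, count, cumulative share) rows, most frequent first,
    stopping once the cumulative share reaches the threshold."""
    counts = Counter(causes)
    total = sum(counts.values())
    rows, running = [], 0
    for cause, count in counts.most_common():
        running += count
        rows.append((cause, count, running / total))
        if running / total >= threshold:
            break
    return rows


# Hypothetical incident log: one recorded cause per incident.
log = (["Seal wear"] * 9 + ["Operator error"] * 7 +
       ["Corrosion"] * 2 + ["Power loss"] * 1 + ["Software fault"] * 1)

dominant = pareto(log)
for cause, count, share in dominant:
    print(f"{cause}: {count} incidents, cumulative {share:.0%}")
```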

    Section 4: Insights and Reflections

    Drawing from years of experience in system safety engineering across diverse domains and international contexts, we share insights and reflections on the effectiveness of different causal analysis techniques. We emphasize the importance of choosing the right technique based on the specific objectives and available data.

    Section 5: Resources and Next Steps

    We provide attendees with valuable resources for further exploration of causal analysis techniques, including links to webinars, online courses, and templates. Additionally, we offer a glimpse into ongoing research and developments in the field of safety engineering.

    Conclusion:

    Causal analysis is a dynamic and evolving field that plays a crucial role in ensuring system reliability and safety. By employing a range of techniques and approaches, safety practitioners can gain deeper insights into the root causes of issues and implement effective risk management strategies.

    Bonus: Q&A Sessions

    Here are the two Q&A Sessions from the Webinar:

    Causal Analysis – Q&A Session 1

    Exploring Causal Analysis: Techniques and Insights – Get more here
