Categories
Mil-Std-882E Safety Analysis

System of Systems Hazard Analysis

In this full-length (38-minute) session, The Safety Artisan looks at System of Systems Hazard Analysis, or SoSHA, which is Task 209 in Mil-Std-882E. SoSHA analyses collections of systems, which are often put together to create a new capability, which is enabled by human brokering between the different systems. We explore the aim, description, and contracting requirements of this Task, and an extended example to illustrate SoSHA. (We refer to other lessons for special techniques for Human Factors analysis.)

This is the seven-minute demo version of the full 38-minute video.

System of Systems Hazard Analysis: Topics

  • System of Systems (SoS) HA Purpose;
  • Task Description (2 slides);
  • Documentation (2 slides);
  • Contracting (2 slides);
  • Example (7 slides); and
  • Summary.

Transcript: System of Systems Hazard Analysis

Introduction

Hello everyone and welcome to the Safety Artisan. I’m Simon and today we’re going to be talking about System of Systems Hazard Analysis – a bit of a mouthful that. What does it actually mean? Well, we shall see.

System of Systems Hazard Analysis

So, for Systems of Systems Hazard Analysis, we’re using task 209 as the description of what to do taken from a military standard, 882E. But to be honest, it doesn’t really matter whether you’re doing a military system or a civil system, whatever it might be – if you’ve got a system of systems, then this will help you to do it.

Topics for this Session

So, we look at the purpose of system of systems. By the way, if you’re wondering what that is, what I’m talking about is when we take different things that we’ve developed elsewhere, e.g. platforms, electronic systems, whatever it might be, and we put them together. Usually, it must be said, with humans gluing the system together somewhere to make it all tick and fit together.

Then we want this collection of systems to do something new, to give us some new capability, which we didn’t have before. So, that’s what I’m talking about when I say system of systems. I’ll show you an example – it’s the best way.

We’ve got a couple of slides on task description, a couple of slides on documentation, and a couple of slides on contracting. Task 209 has a very short task description, and therefore I’ve decided to go through an example. So, we’ve got seven slides of an example of a system of systems, safety case, and safety case report that I wrote. Hopefully, that will illustrate far better than just reading out the description. And that will also give us some issues that can emerge with systems of systems and I’ll summarize those at the end.

SOSHA Purpose

So, let’s get on. I’m going to call it the SOSHA for short; Systems of Systems Hazard Analysis. The purpose of the SOSHA, task 209, is to document or perform and document the analysis of the system of systems and identify unique system of systems hazards. So, things we don’t get from each system in isolation. This task is going to produce special requirements to deal with these hazards, which otherwise would not exist. Until we put the things together and start using them for something new – We’ve not done this before…
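As a rough illustration of what "unique system of systems hazards" means, the sketch below treats each constituent system's hazard analysis as a set and keeps only the hazards that appear once the systems are combined. The system names, hazard labels, and the `unique_sos_hazards` helper are all hypothetical; none of them come from Task 209 itself.

```python
# A minimal sketch, assuming hazards can be compared by a simple label.
# All system names and hazard labels below are hypothetical examples.

def unique_sos_hazards(sos_hazards, system_hazards):
    """Return SoS hazards not already covered by any single system's analysis."""
    already_known = set().union(*system_hazards.values())
    return sorted(set(sos_hazards) - already_known)


# Hazards identified for each system in isolation (hypothetical).
system_hazards = {
    "radar": {"rf_exposure", "electric_shock"},
    "launcher": {"inadvertent_launch", "electric_shock"},
}

# Hazards identified when analysing the combined capability (hypothetical).
sos_hazards = {
    "rf_exposure",
    "inadvertent_launch",
    "stale_track_data_relayed_by_operator",
}

print(unique_sos_hazards(sos_hazards, system_hazards))
```

Only the hazard that arises from a human brokering data between the two systems survives the filter, which is the kind of item the SoSHA is looking for.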

See the full transcript here.

End: System of Systems Hazard Analysis

So, that is the end of the presentation and it just remains for me to say thanks very much for watching and listening. It’s been good to spend some time with you and I look forward to talking to you next time about environmental analysis, which is Task 210 in the military standard … until then, goodbye.

Meet the Author

Learn safety engineering with me, an industry professional with 25 years of experience. I have:

•Worked on aircraft, ships, submarines, ATMS, trains, and software;

•Tiny programs to some of the biggest (Eurofighter, Future Submarine);

•In the UK and Australia, on US and European programs;

•Taught safety to hundreds of people in the classroom, and thousands online;

•Presented on safety topics at several international conferences.

Categories
Mil-Std-882E Safety Analysis

Health Hazard Analysis

In this full-length (55-minute) session, The Safety Artisan looks at Health Hazard Analysis, or HHA, which is Task 207 in Mil-Std-882E. I explore the aim, description, and contracting requirements of this complex Task. It covers: physical, chemical & biological hazards; Hazardous Materials (HAZMAT); ergonomics, aka Human Factors; the Operational Environment; and ionizing and non-ionizing radiation. I will outline how to implement Task 207 in compliance with Australian WHS. (See also other lessons for specific tools and techniques, such as Human Factors analysis methods.)

This is the seven-minute-long demo. The full version is a 55-minute-long whopper!

Health Hazard Analysis: Topics

  • Task 207 Purpose;
  • Task Description;
  • ‘A Health Hazard is…’;
  • ‘HHA Shall provide Information…’;
  • HAZMAT;
  • Ergonomics;
  • Operating Environment;
  • Radiation; and
  • Commentary.

Health Hazard Analysis: Transcript

Introduction

Hello, everyone, and welcome to the Safety Artisan. I’m Simon, your host, and today we are talking about health hazard analysis.

Task 207: Health Hazard Analysis

This is Task 207 in the Mil. standard, 882E approach, which is targeted for defense systems, but you will see it used elsewhere. The principles that we’re going to talk about today are widely applicable. So, you could use this standard for other things if you wish.

Topics for this Session

We’ve got a big session today so I’m going to plough straight on. We’re going to cover the purpose of the task; and the description; the task helpfully defines what a health hazard is; and says what health hazard analysis, or HHA, shall provide in terms of information. We talk about three specialist subjects – hazardous materials or hazmat, ergonomics, and operating environment. Also, radiation is covered, as another specialist area. Then we’ll have some commentary from myself.

Now the requirements of the standard of this task are so extensive that for the first time, I won’t be quoting all of them, word for word. I’ve actually had to chop out some material, but I’ll explain that when we come to it. We can work with that but it is quite a demanding task, as we’ll see.

Task Purpose

Let’s look at the task purpose. We are to perform and document a health hazard analysis to identify human health hazards and to evaluate, as it says, materials and processes using materials, etc., that might cause harm to people, and to propose measures to eliminate the hazards or reduce the associated risks. In many respects, it’s a standard 882-type approach. We’re going to do all the usual things. However, as we shall see, we’re going to do quite a lot more on this one.

Task Description #1

So, task description. We need to evaluate the potential effects resulting from exposure to hazards, and this is something I will come back to again and again. It’s very easy dealing in this area, particularly with hazardous materials, to get hung up on every little tiny amount of potentially hazardous material that is in the system or in a particular environment and I’ve seen this done to death so many times. I’ve seen it overdone in the UK when COSHH, the Control of Substances Hazardous to Health regulations, came in in the military. We went bonkers about this. We did risk assessments up the yin-yang for stuff that we just did not need to worry about. Stuff that was in every office up and down the land. So, we need to be sensible about doing this, and I’ll keep coming back to that.

So, we need to do as it says: identification, assessment, characterization, control, and communication of hazards in the workplace or environment. We need to follow a systems approach, considering “What’s the total impact of all these potential stressors on the human operator or maintainer?” Again, I come from a maintenance background. The operator often gets lots of attention because, if the operator stuffs up, you very often end up with a very nasty accident where lots of people get hurt. So, that’s a legitimate focus for a human operator of a system.

But also, in a lot of organizations, the executive management tend to be operators because that’s how the organization evolves. So, sometimes you can have an emphasis on operations, while maintenance and support and other things get ignored because they’re not sexy enough to the senior management. That’s a bad reason for not looking at stuff. We need to think about the big picture, not just the people who are in control…

Get the full transcript here.

End: Health Hazard Analysis

So, that is the end of the session. Thank you very much for listening. And all that remains for me to say is thanks very much for supporting the work of the Safety Artisan and tuning into this video. And I wish you every success in your work now and in the future. Goodbye.

Categories
Blog Safety Analysis

Preliminary Hazard Identification & Analysis Guide: Free

Get the Preliminary Hazard Identification & Analysis Guide for free! It’s a 50-page .pdf download, collated from reliable sources.

Contents:

  • Introduction
  • Aim
  • Description
  • Method
  • Guidance
  • Inspect the Workplace
  • How to find hazards
  • Review available information
  • Consult Your Workers
  • When to Consult with Workers
  • Hazard Checklists
  • Functional Safety Analysis
  • FMEA/FMECA
  • SWIFT
  • HAZOP

Preliminary Hazard Identification & Analysis Guide – Introduction

Hazard Identification has been defined as: “The process of identifying and listing the hazards and accidents associated with a system.”

Hazard Analysis has been defined as: “The process of describing in detail the hazards and accidents associated with a system and defining accident sequences.”

Preliminary Hazard Identification and Analysis (PHIA) is intended to help you determine the scope of the safety activities and requirements. It identifies the main hazards likely to arise from the capability and functionality being provided. It is carried out as early as possible in the project life cycle, providing an important early input to setting Safety requirements and refining the Project Safety Plan.

PHIA seeks to answer, at an early stage of the project, the question: “What Hazards and Accidents might affect this system and how could they happen?”

Aim

The aim of the PHIA is to identify, as early as possible, the main Hazards and Accidents that may arise during the life of the system. It provides input to:

  1. Scoping the subsequent Safety activities required in any Safety Plan. A successful PHIA will help to gauge the effort that is likely to be required to produce an effective Safety Case, proportionate to the risks.
  2. Selecting or eliminating options for subsequent assessment.
  3. Setting the initial Safety requirements and criteria.
  4. Subsequent Hazard Analyses.
  5. Initiating the Hazard Log.
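The last item in the list above, starting a Hazard Log from the PHIA results, can be sketched in a few lines. This is only an illustration: the `HazardLogEntry` fields, the `H-001` numbering scheme, and the example findings are assumptions, not a format taken from the Guide.

```python
from dataclasses import dataclass, field


@dataclass
class HazardLogEntry:
    """One row of a hypothetical, minimal Hazard Log."""
    hazard_id: str
    hazard: str
    accident: str                      # the accident the hazard could lead to
    causes: list = field(default_factory=list)
    status: str = "open"               # every new entry starts open


def initiate_hazard_log(phia_findings):
    """Turn (hazard, accident) pairs from a PHIA into numbered log entries."""
    return [
        HazardLogEntry(hazard_id=f"H-{n:03d}", hazard=hazard, accident=accident)
        for n, (hazard, accident) in enumerate(phia_findings, start=1)
    ]


# Hypothetical PHIA findings, for illustration only.
findings = [
    ("Uncommanded braking at speed", "Rear-end collision"),
    ("Fuel leak near hot surface", "Fire"),
]

for entry in initiate_hazard_log(findings):
    print(entry.hazard_id, entry.hazard, "->", entry.accident, f"[{entry.status}]")
```

Later hazard analyses would add causes and controls to these entries and change their status as the work progresses.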

Did You Know?

You can also get the Guide with the PHIA Triple Lesson Bundle.


Categories
Blog Tools & Techniques

Safety and Risk Audit

So, what I’m talking about today is safety and risk audit, that is about process, Q&A, and some personal experience. Also something called layered process audits, which I ran into while researching this webinar. I thought that sounded interesting – and it is! Those are today’s topics for the webinar.

Audit Process

I’m talking about the safety audit process based on the UK Acquisition Safety and Environmental Management System, or ASEMS. This was developed by experts for the UK MOD, and I remember it being introduced when I used to work there.

It’s a very good system, it’s very thorough and complete. (It is effectively copyright-free, so I can share it with you, and you can access, use it, and modify it perfectly legally.)

First, we should recognize the Project Oriented Safety Management System (POSMS). It is project-oriented. So the idea is we’ve got a program, or a project, where we’re buying something – a piece of equipment or a service. We’re contracting for something. It’s a project with a beginning, a middle, and an end.

In POSMS, they refer to auditing as a ‘system audit’…

Personal Experience of Audit

Now, I’ve mentioned some personal experiences so far. But I’ve got a few specifics that I want to bring to your attention. I’m doing so on the basis of 25 years in the business of being a safety engineer (see ‘Meet the Author‘, below).

So I will talk very briefly, about safety audit, what is it really? I mean, we talked about process, the mechanics of it, but what are we trying to achieve?

When and why do we use audits? What practices should we be following? And what should we not be doing? That last one is important because it’s easy to do it wrong. Who can be an auditor?

Also, there’s a brief word about the three different terms that get commonly confused. There are Independent Safety Auditors, Independent Safety Assessors, and Independent Safety Advisors. They are all ‘ISA’s and that sometimes gets confusing. What are the differences?…

Get the Webinar

See the whole webinar at the Safety Engineering Academy. (You can get discounts on membership by subscribing to my free emails.)

Course Curriculum

There are LOTS of goodies in this one.

  1. Videos & Slides:
  2. Safety Audit Templates:
    • aap01a-f-01 Audit Schedule
    • aap01a-f-02 Audit Details Team Composition and Competence Record
    • aap01a-g-01 Audit Competency Interim Guidance
    • aap01b-f-01 Audit Plan
    • aap01b-f-02 Audit Proforma
    • aap01c-f-01 Record of Audit Meeting
    • aap01d-f-01 Audit Report Template
    • aap02-f-01 Monitoring Schedule
    • aap02-f-02 Monitoring Data – Assessment Record
    • aap03-f-01 Management Review Form
    • aap04-f-01 Non-Conformance and Corrective Action Form

There are five videos with an hour of content (51 videos with 8.5 hours of webinar content in total). See it all at The Safety Engineering Academy here. More content is added every month.

Meet the Author

Learn safety engineering with me, an industry professional with 25 years of experience. I have:

•Worked on aircraft, ships, submarines, ATMS, trains, and software;

•Tiny programs to some of the biggest (Eurofighter, Future Submarine);

•In the UK and Australia, on US and European programs;

•Taught safety to hundreds of people in the classroom, and thousands online;

•Presented on safety topics at several international conferences.

Categories
Mil-Std-882E Safety Analysis

Operating & Support Hazard Analysis

In this full-length session, I look at Operating & Support Hazard Analysis, or O&SHA, which is Task 206 in Mil-Std-882E. I explore Task 206’s aim, description, scope, and contracting requirements.

There’s value-adding commentary, which explains O&SHA: how to use it with other tasks; how to apply it effectively on different products; and some of the pitfalls to avoid. This is based on my 25 years in system safety and my background in operations and maintenance.

I also refer to other lessons for specific tools and techniques, such as Human Factors analysis methods.

This is the seven-minute-long demo. The full version is about 35 minutes long.

Operating & Support Hazard Analysis: Topics

  • Task 206 Purpose:
    • To identify and assess hazards introduced by O&S activities and procedures;
    • To evaluate the adequacy of O&S procedures, facilities, processes, and equipment used to mitigate risks associated with identified hazards.
  • Task Description (six slides);
  • Reporting (two slides);
  • Contracting (two slides); and
  • Commentary (four slides).

Operating & Support Hazard Analysis: Transcript

Introduction

Hello everyone and welcome to the Safety Artisan; home of safety engineering training. I’m Simon and today we’re going to be carrying on with our series on Mil. Standard 882E system safety engineering.

Operating & Support Hazard Analysis

Today, we’re going to be moving on to the subject of operating and support hazard analysis. This is, as it says, task 206 under the standard. Operating and support hazard analysis, I’ll just call it O&S or OSHA (also O&SHA) for short. Unfortunately, that will confuse people if I call it OSHA. Let’s call it O&S.

Topics for this Session

The purpose of O&S hazard analysis is to identify and assess hazards introduced by those activities and procedures and to evaluate the adequacy of O&S procedures, processes, equipment, facilities, etc., to mitigate risks that have already been identified. A twofold task but a very big task. And as we’ll see, we’ve got lots of slides today on task description, reporting, contracting, and commentary. As always, I present the full text of the task as is, which is copyright free, but I’m only going to talk about the things that are important. So, we’re not going to go through every little clause of the standard; that would be pointless.

O&S Hazard Analysis (T206)

Let’s get started with the purpose. As we’ve already said, it’s to identify and assess those hazards which are introduced by operational and support activities and procedures and evaluate their adequacy. So, we’re looking at operating the system, whatever it may be. And of course, this is a military standard, so we assume a military system, but not all military systems are weapon systems by any means. Not all are physical systems.

There may be inventory management systems, management information systems, all kinds of stuff. So, does operating those systems and supporting them, maintaining them or resupplying them, disposing of them, etc., create or introduce any hazards? And how do we mitigate? That’s the purpose of the task.

Task Description (T206) #1

Let’s move on to the task description. Again, we’re assuming a contractor is performing the analysis, but that’s not necessarily the case. For this task, the standard actually says this typically begins during engineering and manufacturing development, or EMD. So, we’re assuming an American-style lifecycle for a big system and EMD comes after concept and requirements development. So, we are beginning to move into the very expensive stage of development for a system where we begin to commit serious money.

It’s suggesting that O&SHA can wait until then, which is fine in general unless you’ve identified any particularly novel hazards that will need to be dealt with earlier on. As it says, it should build on design hazard analyses, but we’ll also talk later on about the case when there are no design hazard analyses. And the O&SHA shall identify requirements or alternatives for eliminating hazards, mitigating risks, etc. This is one of those tasks where the human is very important – in fact, dominant to be honest. Both as a source of hazards and the potential victim of the associated risks. A lot of human-centric stuff going on here.

Task Description (T206) #2

As always, we’re going to think about the system configurations. We’re going to think about what we’re going to do with the system and the environment that we’re going to do it in. So, a familiar triad and I know I keep banging on about this, but this really is fundamental to bounding and therefore evaluating safety. We’ve got to know what the system is, what we’re doing with it, and the environment in which we’re doing it. Let’s move on…
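One way to picture the triad described above is as a grid of analysis cases: every system configuration, combined with every activity, in every environment. The sketch below simply enumerates that grid. The `analysis_cases` helper and the example values are hypothetical; they are an illustration of bounding the analysis, not a method prescribed by Task 206.

```python
from itertools import product


def analysis_cases(configurations, activities, environments):
    """List every configuration / activity / environment combination to consider."""
    return [
        {"configuration": c, "activity": a, "environment": e}
        for c, a, e in product(configurations, activities, environments)
    ]


# Hypothetical values, for illustration only.
cases = analysis_cases(
    configurations=["in service", "in maintenance"],
    activities=["refuelling", "component replacement"],
    environments=["hangar", "flight line at night"],
)

print(len(cases), "cases to review")
for case in cases[:3]:
    print(case)
```

In practice many combinations will not be credible and can be screened out, but writing the grid down first makes it harder to overlook one.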

Click here to see the full transcript.

End: Operating & Support Hazard Analysis

So, that is the end of the lesson and it just remains for me to say thank you very much for your time and for listening. And I look forward to seeing you again soon. Cheers.

Categories
Mil-Std-882E Safety Analysis System Safety

System Requirements Hazard Analysis

In this 45-minute session, I’m looking at System Requirements Hazard Analysis, or SRHA, which is Task 203 in the Mil-Std-882E standard. I will explore Task 203’s aim, description, scope, and contracting requirements. SRHA is an important and complex task, which must be done on several levels to succeed. This video explains the issues and discusses how to perform SRHA well.

This is the seven-minute demo video; the full version is 40 minutes long.

Topics: System Requirements Hazard Analysis

  • Task 203 Purpose;
  • Task Description:
    • Determine Requirements;
    • Incorporate Requirements; and
    • Assess the compliance of the System.
  • Contracting;
  • Section 4.2 (of the standard); and
  • Commentary.

Transcript

Introduction

Hello and welcome to the Safety Artisan, where you will find professional, pragmatic and impartial advice on all things system safety and related.

System Requirements Hazard Analysis

Today, we’re talking about system requirements hazard analysis. This is part of our series on Mil. Standard 882E, and this one is Task 203. The standard is a very widely used system safety engineering standard. Its influence is found in many places, not just in military procurement programs.

Topics for this Session

We’re looking at this task, which is very important, possibly the most important task of all, as we’ll see. I’m talking about the purpose of the task, which is word-for-word from the task description itself.

We’re talking about in the task description, the three aims of this task, which is to determine or work out requirements, incorporate them, and then assess the compliance of the system with those requirements, because, of course, it may not be a simple read-across. We’ve got six slides on that. That’s most of the task.

Then we’ve just got one slide on contracting, which if you’ve seen any of the others in this series, will seem very familiar. We’ve got a bit of a chat about Section 4.2 from the standard and some commentary, and the reason for that will become clear. Let’s crack on!

System Requirements Hazard Analysis

Task 203.1, the purpose of Task 203 is to perform and document a System Requirements Hazard Analysis or SRHA. And as we’ve already said, the purpose of this is to determine the design requirements. We’re going to focus on design rather than buying stuff off the shelf – we’ll talk about the implications of that a little bit later.

Design requirements to eliminate or reduce hazards and risks, and to incorporate those requirements, as it says, into the documentation. But what it should say is incorporate risk reduction measures into the system itself and then document it.

Finally, to assess compliance of the system with these requirements. Then it says the SRHA addresses all life-cycle phases, so you’re not just meant to think about certain phases of the program. What are the requirements through life for the system? And in all modes. Whether it’s in operation, whether it’s in maintenance or refit, whether it’s being repaired or disposed of, whatever it might be.
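The three aims described above (determine requirements, incorporate them, assess compliance) lend themselves to a simple traceability check. The sketch below is one possible way to spot gaps; the `SafetyRequirement` fields, the `srha_gaps` helper, and the example requirements are all hypothetical and are not taken from the standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SafetyRequirement:
    """A hypothetical record tracing one requirement through the three SRHA aims."""
    req_id: str
    text: str
    source: str                                  # where the requirement was determined from
    incorporated_in: Optional[str] = None        # design item or document that carries it
    compliance_evidence: Optional[str] = None    # how compliance was assessed


def srha_gaps(requirements):
    """Report requirements that are not yet incorporated or not yet assessed."""
    return {
        "not_incorporated": [r.req_id for r in requirements if r.incorporated_in is None],
        "not_assessed": [
            r.req_id
            for r in requirements
            if r.incorporated_in is not None and r.compliance_evidence is None
        ],
    }


# Hypothetical requirements, for illustration only.
requirements = [
    SafetyRequirement("SR-1", "Guard all rotating parts", "industry standard",
                      incorporated_in="System spec 4.3", compliance_evidence="Inspection report"),
    SafetyRequirement("SR-2", "Limit cabin noise exposure", "regulation",
                      incorporated_in="System spec 5.1"),
    SafetyRequirement("SR-3", "Interlock on access panel", "hazard analysis"),
]

print(srha_gaps(requirements))
```

A fuller version would also record the life-cycle phases and modes each requirement applies to, since the task covers all of them.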

Task Description #1

The first of six slides is the task description. I’m using more than one colour because there are quite a lot of important points packed quite tightly together in this description.

We’re assuming that the contractor performs and documents this SRHA. The customer needs to do a lot of work here before it ever gets near a contractor. More on that later. We need to determine system design requirements to eliminate hazards or reduce associated risks.

Two things here. By identifying applicable policies, regulations, standards, etc. More on that later. And analyzing identified hazards. So, there are requirements to perform the analysis as well as to simply state ‘We want a system to do this and not to do that’. So, we need to put in some requirements to say ‘Here’s what we want analyzed, and maybe to what degree’. And saying why is always helpful.

Task Description #2

Breaking those two requirements down.

Part a. We identify applicable requirements by reviewing our military and industry standards and specs, and historical documentation of systems that are similar or with a system that we’re replacing, perhaps. It’s assumed that the US Department of Defense is the customer, the ultimate customer. So, the ultimate customer’s requirements, including whatever they’ve said about standard ways of mitigating certain common risks.

The system performance spec, that’s your functional performance spec or whatever you want to call it. Other system design requirements and documents – a bit of a catchall there. And applicable federal, military, state, and local regulations.

This is a US standard. It’s a federated state, much like Australia and lots of modern states, even the UK. There are variations in law across England, Wales, Scotland and Ireland. They’re not great, but they do exist.

And in the US and Australia, those differences are greater. And it says applicable executive orders. Executive orders, they’re not law, but they are what the executive arm of the U.S. government has issued, and international agreements. There are a lot of words in there – have a look at the different statements that are in white, blue, and yellow.

Basically, from international agreements right down to whatever requirements may be applicable, they all need to be looked at and accounted for. So, there’s a huge amount of work there for someone to do. I’ll come back to who that someone should be later.

End: System Requirements Hazard Analysis

You can find a free pdf of the System Safety Engineering Standard, Mil-Std-882E, here.

Categories
Blog Functional Safety System Safety

Identify and Analyze Functional Hazards

So, how do we identify and analyze functional hazards? I’ve seen a lot of projects and programs. We’re great at doing the physical hazards, but not so good at the functional hazards.

Introduction: Identify and Analyze Functional Hazards

So, when I talk about physical and functional hazards, the physical stuff, I think we’re probably all very familiar with them. They’re all to do with energy and toxicity.

Physical Hazards

So with energy, it might be fire, it might be electric shock. Potential energy, the potential energy of someone at height, or something falling. The impact of the kinetic energy. And then of course, in terms of toxicity, we’ve got hazardous chemicals, which we have to deal with. And then we’ve got biological hazards, plus smoke and toxic gases, often from fires. Or chemical reactions.

So those are your physical hazards. As I said, we tend to be good at dealing with those. We’re used to dealing with that stuff. And most projects I’ve been on have been pretty good at identifying and analyzing that stuff. Not so for functional hazards.

Functional Hazards

I’ve been on lots of projects still today where functional hazards are just ignored completely or they’re only dealt with partially. So let’s explain what I mean about functional hazards. What we’re talking about is where a system is required to do something to perform some function. For example, cars move. They start, they move and they stop, hopefully.

Loss of Function

But what happens when those functions go wrong? What happens when we don’t get the function when we need it? The brakes fail on your car, for example. And so that’s a fairly obvious one. When functional hazards are looked at, it’s usually the functional failures that get attention.

But if that is the obvious failure mode, the less obvious failure modes tend to be more dangerous, and there are two of them.

Other Functional Failure Modes

So what happens if things work when they shouldn’t? What if you’re driving along on a road or the motorway, perhaps at high speed, and your brakes slam on for no apparent reason? Perhaps there is somebody behind you. Do you have a collision or do you lose control on the road and crash?

What if the function works, but it works incorrectly? For example, you turn the temperature down but instead, it goes up. Or you steer to the left, but instead, your vehicle goes to the right.

What if a display shows the wrong information? If you’re in a plane, maybe you’ve got an altimeter that tells you how high you are. It would be dangerous if the altimeter told you that you were level or climbing, but you were descending towards the ground. Yeah, we’ve had lots of that kind of accident.

So there’s an overview of what I mean by physical and functional hazards.
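The three kinds of functional failure described above (loss of function, function when not required, and incorrect function) can be applied systematically to each function of a system. The sketch below builds a simple worksheet that way. The `FAILURE_MODES` wording, the `functional_failure_worksheet` helper, and the car functions are illustrative assumptions, not a format taken from the webinar.

```python
# The three failure modes discussed above, phrased as prompts (wording is an assumption).
FAILURE_MODES = (
    "loss of function (does not work when required)",
    "unintended function (works when it should not)",
    "incorrect function (works, but wrongly)",
)


def functional_failure_worksheet(functions):
    """Pair every system function with every failure mode to prompt the analysis."""
    return [(function, mode) for function in functions for mode in FAILURE_MODES]


# Hypothetical functions of a car, following the example in the text.
rows = functional_failure_worksheet(["braking", "steering", "speed display"])

for function, mode in rows:
    print(f"{function}: {mode} -> what could happen?")
```

Each row is a question for the analyst to answer; for example, "braking: unintended function" corresponds to the brakes slamming on at motorway speed.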

The Webinar: Identify and Analyze Functional Hazards

See the whole webinar at the Safety Engineering Academy. (You can get discounts on membership by subscribing to my free emails.)

Course Curriculum

  1. Introduction
  2. Preliminary Hazard Identification (PHI)
  3. Functional Failure Analysis
  4. Functional Hazard Analysis (FHA)

There are 11 lessons with two-and-a-half hours of video content, plus other resources. See the Foundations of System Safety here.

Categories
Blog System Safety

Foundations of System Safety

So today, we’re talking about the Foundations of System Safety assessment. And as it says, it’s a free webinar from The Safety Artisan, and it’s one of a series.

Webinar Highlights

So, before we go on, I’ll just introduce myself.

Why should you bother to listen to me?

Well, in 25 years of experience in system safety, I’ve worked on a lot of different stuff: aircraft, fast jets, big aircraft, helicopters, reconnaissance, and EW platforms; surface ships and submarines; air traffic management systems; a little bit on trains and road vehicles; and lots and lots of software.

And I worked on some nice little programs, which is great. That’s always good fun. And some enormous programs, not all of which succeeded.

So you get a range of perspectives from me on that, and you get to learn from other people’s mistakes. Bismarck said that was a good idea because we don’t have time to make all the mistakes in one lifetime.

I worked in the UK for many years and now 10 years in Australia. And I’ve worked on introducing a lot of US and European programs to those countries.

It’s a wide range of experiences. I’ve had the privilege of teaching safety to hundreds of people in the classroom, and thousands online. And I’ve also been lucky enough to present on safety topics at several international conferences.

However, the proof of the pudding is in the eating, as they say. So let’s get on with it. So, the webinar topic is the Foundations of System Safety. So, what are they, and how do we set them up for a successful project? That’s what we want.

The Webinar: Foundations of System Safety

See the whole webinar at the Safety Engineering Academy. (You can get discounts on membership by subscribing to my free emails.)

Course Curriculum

  1. Introduction
  2. Preliminary Hazard Identification (Task 201)
  3. Preliminary Hazard Analysis (Task 202)
  4. System Requirements Hazard Analysis (Task 203)
  5. Safety Analysis Techniques Overview

There are 18 lessons with four hours of video content, plus other resources. See the Foundations of System Safety here.

Categories
Blog Safety Analysis

Failure Mode Effects Analysis

TL;DR This article on Failure Mode Effects Analysis explains this powerful and commonly used family of techniques. You can access this webinar (and all the others) here.

I have used FMEA and related techniques on many programs and they can produce powerful results quickly and cheaply. Recently, I’ve seen some criticism of FMEA on social media. However, I’m convinced that this is only clickbait. The secret of success is to understand what a technique is good for – and not – and to apply it well. It’s as simple as that!

This article covers:

  • A description of the technique, including its purpose;
  • When it might be used;
  • Advantages, disadvantages and limitations;
  • Sources of additional information;
  • A simple example of an FMEA/FMECA; and
  • Additional comments.

I’ve added some ‘top tips’ of my own based on my personal experience in the industry.

Top Tip

In this article, I have used material from a UK Ministry of Defence guide, reproduced under the terms of the UK’s Open Government Licence.

A Description of the Technique, Including Its Purpose

Failure modes and effects analysis (FMEA) was one of the first systematic techniques for failure analysis. It was developed in the United States military (Military Procedure MIL-P-1629, titled ‘Procedures for Performing a Failure Modes, Effects and Criticality Analysis’, November 9, 1949) as a reliability evaluation technique to determine the effect of system and equipment failures. Failures were classified according to their impact on mission success and on personnel and equipment safety. In the 1960s it was used by the aerospace industry and NASA during the Apollo program. More and more industries – notably the automotive industry – have seen the benefits to be gained by using FMEAs to complement their design processes.

This qualitative technique helps identify failure potential in a design or process i.e. to foresee failure before it actually happens. This is done by defining the system that is under consideration to ensure system boundaries are established and then by following a procedure, which helps to identify design features or process operations that could fail. The procedure requires the following essential questions to be asked:

  • How can each component fail?
  • What might cause these modes of failure?
  • What could the effects be if these failures did occur?
  • How serious are these failure modes?
  • How is each failure mode detected?
  • What are the safeguards in place to protect against accidents resulting from the failure mode?

As always with safety analyses, the more precisely you can answer these questions (above), the better the results you will get.

Top Tip

As an aid in structuring the analysis and ensuring a systematic approach, results are recorded in a tabular format. Several different forms are in use, and the form design can be tailor-made to suit the particular requirements of a study. Examples of forms can be found in several standards (links below).

Make the form support the flow of the process, left-to-right, then top-down!

Top Tip
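To make the tabular format concrete, here is a minimal sketch of one worksheet row in Python. The column names and the example entries are assumptions for illustration, not taken from any standard; the field order simply follows the six questions listed earlier, so that the form reads left-to-right.

```python
from dataclasses import dataclass, fields


@dataclass
class FmeaRow:
    """One FMEA worksheet row (hypothetical column set)."""
    component: str
    failure_mode: str  # How can the component fail?
    cause: str         # What might cause this mode of failure?
    effect: str        # What could the effects be if it occurred?
    severity: str      # How serious is the failure mode?
    detection: str     # How is the failure mode detected?
    safeguards: str    # What protects against a resulting accident?


def column_headings() -> list[str]:
    """Derive the table headings from the field order, left to right."""
    return [f.name.replace("_", " ").title() for f in fields(FmeaRow)]


# Illustrative entry only; not taken from a real analysis.
row = FmeaRow(
    component="Cooling pump",
    failure_mode="Fails to start",
    cause="Motor winding open circuit",
    effect="Loss of cooling flow",
    severity="Critical",
    detection="Low-flow alarm",
    safeguards="Standby pump",
)

print(" | ".join(column_headings()))
print(" | ".join(getattr(row, f.name) for f in fields(FmeaRow)))
```

Keeping the headings tied to the field order means that if you tailor the form for a particular study, the table layout follows automatically.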

The FMEA analysis can be extended if necessary by characterizing the likelihood, severity, and resulting levels of risk of failures. FMEAs that incorporate this criticality analysis (CA) are known as FMECAs. A FMECA is an analytical quantitative technique, which ranks failure modes according to their probability and consequences (i.e. the resulting effect of the failure mode on the system, mission, and personnel). It is referred to as a “bottom-up approach” as it starts by identifying the potential failure modes of a component and analyzing their effects on the whole system. It can be quite complex depending on how the user drives the technique.
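The ranking step of a criticality analysis can be sketched in a few lines of Python. This is an illustration under stated assumptions: the 1 to 5 scales, the example failure modes, and the use of a simple likelihood times severity product are all hypothetical. Many standards use a risk matrix instead, so treat the scoring rule as a placeholder.

```python
def risk_score(likelihood: int, severity: int) -> int:
    """Hypothetical score: higher likelihood and higher severity both score higher."""
    return likelihood * severity


# (failure mode, likelihood 1-5, severity 1-5); illustrative values only.
failure_modes = [
    ("Valve sticks closed", 2, 4),
    ("Sensor drifts", 4, 2),
    ("Pump seal leaks", 3, 3),
    ("Indicator lamp fails", 4, 1),
]

# Rank from highest to lowest score; ties keep their original order.
ranked = sorted(
    failure_modes,
    key=lambda fm: risk_score(fm[1], fm[2]),
    reverse=True,
)

for name, likelihood, severity in ranked:
    print(f"{risk_score(likelihood, severity):>2}  {name}")
```

Note that two quite different failure modes can end up with the same score, which is one reason to record the likelihood and severity separately and not just the product.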

It is important to note that the FMECA does not provide a model by which system reliability can be quantified. Hence, if the objective is to estimate the probability of events, a technique that results in a logic model of the failure mechanisms must be employed, typically a fault tree and/or an event tree.

Reliability Block Diagrams, or for repairable systems, Markov Chains can also be used.

Top Tip
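To show what a logic model adds, here is a minimal sketch of fault tree gate arithmetic in Python. It assumes that the basic events are statistically independent, and the probabilities are made-up numbers for illustration. Real fault tree tools also deal with common-cause failures and repeated events, which this sketch ignores.

```python
import math


def and_gate(probabilities: list[float]) -> float:
    """All inputs must occur: multiply the probabilities (independence assumed)."""
    return math.prod(probabilities)


def or_gate(probabilities: list[float]) -> float:
    """Any input may occur: one minus the probability that none occurs."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


# Hypothetical tree: cooling is lost if BOTH redundant pumps fail,
# OR if the shared controller fails.
p_both_pumps = and_gate([0.01, 0.01])
p_loss_of_cooling = or_gate([p_both_pumps, 0.001])

print(f"P(both pumps fail)  = {p_both_pumps:.6f}")
print(f"P(loss of cooling)  = {p_loss_of_cooling:.7f}")
```

In this made-up case the single shared controller dominates the result, which is the kind of insight a table of individual failure modes cannot give you on its own.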

An FMEA or FMECA can be conducted on either a component or a functional level. A component FMEA/FMECA only covers hardware aspects but a functional FMEA/FMECA can cover all aspects of a system. For either approach, the general principle remains the same.

When it Might be Used

FMEA is applicable for any well-defined system but is primarily used for reviews of mechanical and electrical systems. It can be used in many situations, for example, to assess the design of a product in terms of what could go wrong in manufacturing and in-service as a result of weaknesses in the design. It can also be used to analyze failures in the manufacturing process itself and during service. It is effective for collecting the information needed to troubleshoot system problems and to define and optimize the maintenance and reliability of plant and equipment, as it focuses directly and individually on equipment failure modes.

It’s fair to say that you need a design on which to perform an FMEA. Pre-design you could use Functional Failure Analysis (FFA) instead.

Top Tip

The FMECA technique is best suited for detailed analysis of system hardware, and should preferably be carried out by the designer in parallel with system development. This will not only speed up the analysis itself, but also force the design team to think systematically about the failure characteristics of the system. The primary use of the FMECA is in verifying that single component failures cannot cause a catastrophic system failure.
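The single-failure check can be illustrated with a small Python sketch. It rests on strong assumptions: the system is described as a set of functions, the loss of any one function is treated as catastrophic, and redundant components are assumed to fail independently. The function and component names are hypothetical.

```python
# Hypothetical architecture: each function is delivered by one or more components.
functions = {
    "hydraulic pressure": ["pump A", "pump B"],    # redundant pair
    "pressure relief": ["relief valve"],           # no redundancy
    "flow indication": ["sensor 1", "sensor 2"],   # redundant pair
}


def single_point_failures(architecture: dict[str, list[str]]) -> list[str]:
    """Return components whose failure alone would lose a whole function."""
    return sorted(
        component
        for components in architecture.values()
        if len(components) == 1
        for component in components
    )


print(single_point_failures(functions))
```

A real FMECA would reach the same conclusion row by row, by tracing each component failure mode to its system-level effect; the sketch only shows the question being asked.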

There are a number of areas today in which the use of FMECA has become mandatory to demonstrate system reliability. Examples of such requirements are in the classification of Dynamically Positioned (DP) vessels and in a number of US military applications for which MIL-STD documents apply.

Advantages, Disadvantages, and Limitations

Advantages

  • It is widely used, well understood, and easy to interpret
  • It can be performed by a single analyst, or more if required
  • Qualitative data about the causes and effects can be incorporated into the analysis
  • It is systematic and comprehensive, and should identify hazards with an electrical or mechanical basis
  • The level of detail incorporated can be varied to suit the analysis
  • It identifies safety-critical equipment where a single failure would be critical for the system
  • Even though the technique can be quite time-consuming, it can lead to a thorough understanding of the system being considered

Disadvantages

  • The technique adopts a bottom-up approach and, if conducting a component-level FMEA or FMECA, this can be boring and repetitive
  • The benefit gained is dependent upon the experience of the analyst or the group.
  • It requires a hierarchical system drawing as the basis for the analysis, which the analyst usually has to develop before the FMEA process can start
  • It is optimised for mechanical and electrical equipment, and does not apply easily to Human Factor Integration, procedures or process equipment
  • It is difficult for the technique to cover multiple failures: equipment failures are generally analysed one by one, so important combinations of equipment failures may be overlooked
  • Most accidents have a significant human or external influence contribution and these are not usually treated as failure modes in FMEA
  • More than one FMEA may be required for a system with multiple modes of operation
  • Due to its wide use, there can be a temptation to read across data from ARM or ILS projects where, for example, the fault-tree technique has been used. As a consequence, the safety perspective can be lost, as human error has been excluded and the focus has been solely on determining faults and not on more far-reaching safety issues
  • Perhaps the worst drawback of the technique is that all component failures are examined and documented, including those that do not have any significant consequences.
  • For large systems, especially those with a fair degree of redundancy built into them, the amount of unnecessary documentation is a major disadvantage. Hence, the FMECA should primarily be used by designers of reasonably simple systems. It should however be noted that the concept of the FMECA form can be quite useful in other contexts, e.g. when reviewing an operation rather than a hardware system. Then the use of a form similar to the FMECA can provide a useful way of documenting the analysis. Suitable columns in the form could for example include; operation, deviation, consequence, correcting or reversing action, etc.

ARM = Availability, Reliability, Maintainability
ILS = Integrated Logistic Support (or logistics engineering)

Top Tip
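The difficulty of covering multiple failures, noted in the disadvantages above, comes down to simple counting. The sketch below uses a made-up system of 50 components, each with a single failure mode, to show how quickly the number of combinations grows.

```python
import math

n_components = 50  # hypothetical system size, one failure mode per component

single_failures = n_components
double_failures = math.comb(n_components, 2)  # unordered pairs
triple_failures = math.comb(n_components, 3)  # unordered triples

print(f"Single failures to analyse: {single_failures}")
print(f"Pairs of failures:          {double_failures}")
print(f"Triples of failures:        {triple_failures}")
```

This is why one-by-one analysis is practical but exhaustive analysis of combinations is not, and why combinations are usually left to techniques that start from the unwanted event and work backwards.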

Sources of Additional Information, such as Standards, Textbooks and Websites

BS 5760: Part 5 Reliability of Systems, Equipment and Components: Part 5 Guide to Failure Modes, Effects and Criticality Analysis.

HSE Website – Marine Risk Assessment, Offshore Technology Report 2001/063

IEC 60812:2018 Failure modes and effects analysis (FMEA and FMECA)

As always, Understand your Standard (what it was designed to do) to get the best out of it!

Top Tip

A Simple Example of an FMEA/FMECA

An example extract from an FMEA of a ballast system is shown below. This can be found in the HSE Marine Risk Assessment Report. The column headings are based on the US Military Standard Mil-Std 1629A, but with modifications to suit the particular application. For example, the failure mode and cause columns are combined. The criticality of each failure is ranked as minor, incipient, degraded, or critical.

An example of a FMEA Output Table

To properly understand these results you need to know how a Sea Chest works (see context here). Otherwise the example just shows what kind of output a FMEA can produce.

Top Tip

Additional comments

Failure Modes, Effects and Criticality Analysis (FMECA) is an analytical QRA technique, used by ARM and ILS systems engineers, most commonly and effectively at the late design, test and manufacture stage of a project. It requires the breakdown of the system into individual components and the identification of possible failure modes or malfunctions of each component (such as too much flow through a valve). Referred to as a bottom-up approach, it starts by identifying the potential failure modes of a component and analyzing their potential effects on the whole system. Numerical levels can be assigned to the likelihood of the failure and the severity or consequence of the failure.

Note: It is important to recognize that FMEA/FMECA Standards have different approaches to criticality. Failure mode severity classes 1 – 5 for Standards Mil-Std 1629A and ARP926A go from Class 1 being the most severe (e.g. loss of life) to Class 5 being the least severe (i.e. no effect), whereas BS 5760 deals with criticality in the opposite direction, where Class 5 is the most severe.
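If you have to compare results recorded under the two conventions, the direction of the scale must be made explicit. The sketch below assumes, purely for illustration, that both scales have five classes and that they line up one-to-one in reverse order. Check the actual class definitions in each standard before relying on any such mapping.

```python
def flip_severity_class(severity_class: int, n_classes: int = 5) -> int:
    """Convert between a '1 is worst' scale and a '1 is least severe' scale.

    Assumes both scales have n_classes levels that correspond in reverse order.
    """
    if not 1 <= severity_class <= n_classes:
        raise ValueError(f"severity class must be between 1 and {n_classes}")
    return n_classes + 1 - severity_class


# Class 1 (most severe on one scale) maps to Class 5 on the reversed scale.
print(flip_severity_class(1))
print(flip_severity_class(4))
```

Applying the function twice returns the original class, which is a quick sanity check that the mapping is a pure reversal.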

Note that FMECA for ARM/ILS looks at availability or mission criticality, not safety criticality. A FMECA for safety will have a different focus.

Top Tip

Software:

  • Isograph;
  • Reliability Work Bench;
  • Reliasoft;
  • Microsoft Excel.

These are not recommendations!

FMEA/FMECA tables for complex systems can run to hundreds of pages, so good tool support is essential.

Top Tip

Failure Mode Effects Analysis: Have You Used this Technique?

Categories
Blog Safety Analysis Tools & Techniques

Five Ways to Identify Hazards

In my webinar ‘Five Ways to Identify Hazards’ I look at a mix of techniques. We need these diverse techniques to assure us (give justified confidence) that we have identified the full range of hazards associated with a system.

To do this I draw on my 25 years of experience (see ‘Meet the Author’, below) and relevant standards. Here’s the introduction to the webinar.

Five Ways to Identify Hazards: Video Introduction

Webinar: ‘Five Ways to Identify Hazards’

Four Things to Remember

For hazard identification, we need to be aware of four things.

What we’re doing is we are imagining what could go wrong. And I want to emphasize, first of all, imagination. We need to be open to what could happen. That’s the mindset that we need, and we’re looking at what could go wrong, not what will go wrong. Think about possibilities, not certainties.

The second thing is that it’s very easy to dive down a rabbit hole and get into mega detail about one particular thing and spend lots of time, waste lots of time doing that. That’s not what we need to be doing. We need a broad approach. We need to go wide and think about as many different possible hazards as we can. Don’t dive deep; the deep analysis will come later.

Another aspect of that point is we’re talking about hazard identification. We’re just here to identify hazards. We’re not here to try to assess them yet.

Yet another mistake that people make is to try and jump straight to fixing the hazard. Many of us watching will be engineers. We love fixing problems. We like to solve problems, but we’re not here to solve the problem yet. We’re only here to identify it. So we’re going to avoid the temptation to jump in and try and come up with a solution. That’s not what we’re doing with hazard identification.

So those are four things to bear in mind.

Five Ways to Identify Hazards

Let’s move on. So I’ve said that this was entitled five ways to identify hazards.

There are, of course, many ways to identify hazards, but I just thought I’d pick on these five because there was a nice broad range of things and things that I can show you how to do straight away.

Those are the five things that we’ve got and we’ll have a slide on each one of those. First, we can ask the workers or end users or their representatives. Secondly, we can inspect the workplace, we can look around for hazards. And maybe we’ve got a real workplace that we can look at or maybe we’ve just got a representation, we can do both.

We can use a hazard identification checklist, we can survey historical data. So all the squiggly lines at the bottom of the screen, there’s an example of some historical data and we can conduct a number of analyses on that.

But the analysis I picked on (Number 5) is Functional Failure Analysis and we’ll see why in just a moment. So those are the five things that we will cover in the next hour. We’ll also have time for a Question and Answer session and then a worked example of how to do a simple Functional Failure Analysis…

There’s More!

This is just one of many webinars in my Safety Engineering Academy. You can see summaries of them all in this blog post.

Meet the Author

Learn safety engineering with me, an industry professional with 25 years of experience. I have:

• Worked on aircraft, ships, submarines, ATMS, trains, and software;

• Worked on programs ranging from tiny to some of the biggest (Eurofighter, Future Submarine);

• Worked in the UK and Australia, on US and European programs;

• Taught safety to hundreds of people in the classroom, and thousands online;

• Presented on safety topics at several international conferences.