Categories
Mil-Std-882E Safety Analysis System Safety

System Requirements Hazard Analysis

In this 45-minute session, I’m looking at System Requirements Hazard Analysis, or SRHA, which is Task 203 in the Mil-Std-882E standard. I will explore Task 203’s aim, description, scope, and contracting requirements.  SRHA is an important and complex task, which needs to be done on several levels to be successful.  This video explains the issues and discusses how to perform SRHA well.

This is the seven-minute demo video, the full version is 40 minutes’ long.

Topics: System Requirements Hazard Analysis

  • Task 202 Purpose;
  • Task Description:
    • Determine Requirements;
    • Incorporate Requirements; and
    • Assess the compliance of the System.
  • Contracting;
  • Section 4.2 (of the standard); and
  • Commentary.
Transcript: System Requirements Hazard Analysis

Introduction

Hello and welcome to the Safety Artisan, where you will find professional, pragmatic and impartial advice on all things system, safety and related.

System Requirements Hazard Analysis

And so today, which is the 1st of March 2020, we’re going to be talking about – let me just find it for you – we’ll be talking about system requirements, hazard analysis. And this is part of our series on Mil. Standard 882E (882 Echo) and this one a task 203. Task 203 in the Mil. standard. And it’s a very widely used system safety engineering standard and its influence is found in many places, not just on military procurement programs.

Topics for this Session

We’re going to look at this task, which is very important, possibly the most important task of all, as we’ll see. so in to talk about the purpose of the task, which is word for word from the task description itself. We’re going to talk about in the task description, the three aims of this task, which is to determine or work out requirements, incorporate them, and then assess the compliance of the system with those requirements, because, of course, it may not be a simple read-across. We’ve got six slides on that. That’s most of the task. Then we’ve just got one slide on contracting, which if you’ve seen any of the others in this series, will seem very familiar. We’ve got a little bit of a chat about Section 4.2 from the standard and some commentary, and the reason for that will become clear. So, let’s crack on.

System Requirements Hazard Analysis

Task 203.1, the purpose of Task 203 is to perform and document a System Requirements Hazard Analysis or SRHA. And as we’ve already said, the purpose of this is to determine the design requirements. We’re going to focus on design rather than buying stuff off the shelf – we’ll talk about the implications of that a little bit later. Design requirements to eliminate or reduce hazards and risks, incorporate those requirements, into a says, into the documentation, but what it should say is incorporate risk reduction measures into the system itself and then document it. And then finally, to assess compliance of the system with these requirements. Then it says the SRHA address addresses all life-cycle phases, so not just meant for you to think about certain phases of the program. What are the requirements through life for the system? And in all modes. Whether it’s in operation, whether it’s in maintenance or refit, whether it’s being repaired or disposed of, whatever it might be.

Task Description #1

First of six slides on the task description. I’m using more than one colour because there’s some quite a lot of important points packed quite tightly together in this description. We’re assuming that the contractor performs and documents this SRHA. The customer needs to do a lot of work here before ever gets near a contractor. More on that later. We need to determine system design requirements to eliminate hazards or reduce associated risks.

Two things here. By identifying applicable policies, regulations and standards etc. More on that later. And analysing identified hazards. So, requirements to perform the analysis as well as to simply just state ‘We want a system to do this and not to do that’. So, we need to put some requirements to say ‘Here’s what we want analysed maybe to what degree? And why.’ is always helpful.

Task Description #2

Breaking those breaking those two requirements down.

Part a. We’re going to identify applicable requirements by reviewing our military and industry standards and specs, historical documentation of systems that are similar or with a system that we’re replacing, perhaps. Look at, it’s assumed that the US Department of Defense is the customer, ultimate customer. So, the ultimate customer’s requirements, including whatever they’ve said about standard ways of mitigating certain common risks. System performance spec, that’s your functional performance spec or whatever you want to call it. Other system design requirements and documents- Bit of a catchall there. And applicable federal, military, state and local regulations. This is a US standard. It’s a federated system, much like Australia or indeed lots of modern states, even the UK. There are variations in law across England, Wales, Scotland and Ireland. They’re not great, but they do exist. And in the US and Australia, those differences are greater. And it says applicable executive orders. Executive orders, they’re not law, but they are what the executive arm of the U.S. government has issued, and international agreements. A lot of words in there- have a look at the different statements that are in that in white, blue and yellow. Basically, from international agreements right down to whatever requirements may be applicable, they all need to be looked at and taken account of. So, there’s a huge amount of work there for someone to do. I’ll come back to who that someone should be later.

End

Well, that is the end of the presentation. And it just remains for me to say thanks again for watching and do lookout for the next sessions in the series on 882 echo (882E). There are quite a few to go. We’re going to go through all the tasks and the general and specific requirements of the standard and the appendices. We will also talk about more advanced topics, about how we manage and apply all this stuff.

So, from The Safety Artisan, thanks very much and goodbye.

End: System Requirements Hazard Analysis

You can find a free pdf of the System Safety Engineering Standard, Mil-Std-882E, here.

Categories
Blog System Safety

System Safety Principles

In this 45-minute video, I discuss System Safety Principles, as set out by the US Federal Aviation Authority in their System Safety Handbook. Although this was published in 2000, the principles still hold good (mostly) and are worth discussing. I comment on those topics where the modern practice has moved on, and those jurisdictions where the US approach does not sit well.

This is the ten-minute preview of the full, 45-minute video.

System Safety Principles: Topics

  • Foundational statement
  • Planning
  • Management Authority
  • Safety Precedence
  • Safety Requirements
  • System Analyses Assumptions & Criteria
  • Emphasis & Results
  • MA Responsibilities
  • Software hazard analysis
  • An Effective System Safety Program

System Safety Principles: Transcript

Click here for the Transcript

Hello and welcome to The Safety Artisan where you will find professional pragmatic and impartial educational products. I’m Simon and it’s the 3rd of November 2019. Tonight I’m going to be looking at a short introduction to System Safety Principles.

Introduction

On to system safety principles; in the full video we look at all principles from the U.S. Federal Aviation Authority’s System Safety Handbook but in this little four- or five-minute video – whatever it turns out to be – we’ll take a quick look just to let you know what it’s about.

Topics for this Session

These are the subjects in the full session. Really a fundamental statement; we talk about planning; talk about the management authority (which is the body that is responsible for bringing into existence -in this case- some kind of aircraft or air traffic control system, something like that, something that the FAA would be the regulator for in the US). We talk about safety precedents. In other words, what’s the most effective safety control to use. Safety requirements; system analyses – which are highlighted because that’s just the sample I’m going to talk about, tonight; assumptions and safety criteria; emphasis and results – which is really about how much work you put in where and why; management authority responsibilities; a little aside of a specialist area – software hazard analysis; And finally, what you need for an effective System Safety Program.

Now, it’s worth mentioning that this is not an uncritical look at the FAA handbook. It is 19 years old now so the principles are still good, but some of it’s a bit long in the tooth. And there are some areas where, particularly on software, things have moved on. And there are some areas where the FAA approach to system safety is very much predicated on an American approach to how these things are done.  

Systems Analysis

So, without further ado, let’s talk about system analysis. There are two points that the Handbook makes. First of all, that these analyses are basic tools for systematically developing design specifications. Let’s unpack that statement. So, the analyses are tools- they’re just tools. You’ve still got to manage safety. You’ve still got to estimate risk and make decisions- that’s absolutely key. The system analyses are tools to help you do that. They won’t make decisions for you. They won’t exercise authority for you or manage things for you. They’re just tools.

Secondly, the whole point is to apply them systematically. So, coverage is important here- making sure that we’ve covered the entire system. And also doing things in a thorough and orderly fashion. That’s the systematic bit about it. And then finally, it’s about developing design specifications. Now, this is where the American emphasis comes in. But before we talk about that, it’s fundamental to note that really we need to work out what our safety requirements are. What are we trying to achieve here with safety? And why? And those are really important concepts because if you don’t know what you’re trying to achieve then it will be very difficult to get there and to demonstrate that you’ve got there- which is kind of the point of safety. And putting effort into getting the requirements right is very important because without doing that first step all your other work could be invalid. And in my experience of 20 plus years in the business, if you don’t have a really precise handle on what you’re trying to achieve then you’re going to waste a lot of time and money, probably.

So, onto the second bullet point. Now the handbook says that the ultimate measure of safety is not the scope of analysis but in satisfying requirements. So, the first part – very good. We’re not doing analysis for the sake of it. That’s not the measure of safety – that we’ve analyzed something to death or that we’ve expended vast amounts of dollars on doing this work but that we’ve worked out the requirements and the analysis has helped us to meet them. That is the key point.

This is where it can go slightly pear-shaped in that this emphasis on requirements (almost to the exclusion of anything else) is a very U.S.-centric way of doing things. So, very much in the US, the emphasis is you meet the spec, you certify that you’ve met spec and therefore we’re safe. But of course what if the spec is wrong? Or what if it’s just plain inappropriate for a new use of an existing system or whatever it might be?

In other jurisdictions, notably the U.K. (and as you can tell from my accent that’s where I’m from,  I’ve got a lot of experience doing safety work in the U.K. but also Australia where I now live and work) it’s not about meeting requirements. Well, it is but let me explain. In the UK and Australia, English law works on the idea of intent. So, we aim to make something safe: not whether it has that it’s necessarily met requirements or not, that doesn’t really matter so much, but is the risk actually reduced to an acceptable level? There are tests for deciding what is acceptable. Have you complied with the law? The law outside the US can take a very different approach to “it’s all about the specification”.

Of course, those legal requirements and that requirement to reduce risk to an acceptable level, are, in themselves, requirements. But in Australian or British legal jurisdiction, you need to think about those legal requirements as well. They must be part of your requirements set. So, just having a specification for a technical piece of cake that ignores the requirements of the law, which include not only design requirements but the thing is actually safe in service and can be safely introduced, used, disposed of, etc. If you don’t take those things into account you may not meet all your obligations under that system of law. So, there’s an important point to understanding and using American standards and an American approach to system safety out of the assumed context. And that’s true of all standards and all approaches but it’s a point I bring out in the main video quite forcefully because it’s very important to understand.

Copyright Statement

So, that’s the one subject I’m going to talk about in this short video. I’d just like to mention that all quotations are from the FAA system safety handbook which is copyright free but the content of this video presentation, including the added value from my 20 plus years of experience, is copyright of the Safety Artisan.

For More…

And wherever you’re seeing this video, be it on social media or whatever, you can see the full version of the video and all other videos at The Safety Artisan.

End

That’s the end of the show. It just remains to me to say thanks very much for giving me your time and I look forward to talking to you again soon. Bye-bye.

Back to the Start Here Page.

Categories
Blog System Safety

Safety Concepts Part 2

In this 33-minute session, Safety Concepts Part 2, The Safety Artisan equips you with more Safety Concepts. We look at the basic concepts of safety, risk, and hazard in order to understand how to assess and manage them. Exploring these fundamental topics provides the foundations for all other safety topics, but it doesn’t have to be complex. The basics are simple, but they need to be thoroughly understood and practiced consistently to achieve success. This video explains the issues and discusses how to achieve that success.

This is the full (33 minute) Safety Concepts, Part 2 video.

Safety Concepts Part 2: Topics

  • Risk & Harm;
  • Accident & Accident Sequence;
  • (Cause), Hazard, Consequence & Mitigation;
  • Requirements / Essence of System Safety;
  • Hazard Identification & Analysis;
  • Risk Reduction / Estimation;
  • Risk Evaluation & Acceptance;
  • Risk Management & Safety Management; and
  • Safety Case & Report.

Safety Concepts Part 2: Transcript

Click Here for the Transcript

Hi everyone, and welcome to the safety artisan where you will find professional, pragmatic, and impartial advice on safety. I’m Simon, and welcome to the show today, which is recorded on the 23rd of September 2019. Today we’re going to talk about system safety concepts. A couple of days ago I recorded a short presentation (Part 1) on this, which is also on YouTube.  Today we are going to talk about the same concepts but in much more depth.

In the short session, we took some time picking apart the definition of ‘safe’. I’m not going to duplicate that here, so please feel free to go have a look. We said that to demonstrate that something was safe, we had to show that risk had been reduced to a level that is acceptable in whatever jurisdiction we’re working in.

And in this definition, there are a couple of tests that are appropriate that the U.K., but perhaps not elsewhere. We also must meet safety requirements. And we must define the Scope and bound the system that we’re talking about a Physical system or an intangible system like a computer program. We must define what we’re doing with it and what it’s being used for. And within which operating environment within which context is being used.  And if we could do all those things, then we can objectively say – or claim – that the system is safe.

Topics

We’re going to talk about a lot more Topics. We’re going to talk about risk accidents. The cause has a consequence sequence. They talk about requirements and. Spoiler alert. What I consider to be the essence of system safety. And then we’ll get into talking about the process. Of demonstrating safety, hazard identification, and analysis.

Risk Reduction and estimation. Risk Evaluation. And acceptance. And then pulling it all together. Risk management safety management. And finally, reporting, making an argument that the system is safe supporting with evidence. And summarizing all of that in a written report. This is what we do, albeit in different ways and calling it different things.

Risk

Onto the first topic. Risk and harm.  Our concept of risk. It’s a combination of the likelihood and severity of harm. Generally, we’re talking about harm. To people. Death. Injury. Damage to help. Now we might also choose to consider any damage to property in the environment. That’s all good. But I’m going to concentrate on harm to people. Because usually, that’s what we’re required to do. By the law. And there are other laws covering the environment and property sometimes. That. We’re not going to talk.  just to illustrate this point. This risk is a combination of Severity and likelihood.

We’ve got a very crude. Risk table here. With a likelihood along the top. And severity. Downside. And we might. See that by looking at the table if we have a high likelihood and high severity. Well, that’s a high risk. Whereas if we have Low Likelihood and low severity. We might say that’s a low risk. And then. In between, a combination of high and low we might say that’s medium. Now, this is a very crude and simple example. Deliberately.

You will see risk matrices like this. In. Loads of different standards. And you may be required to define your own for a specific system, there are lots of variations on this but they’re all basically. Doing this thing and we’re illustrating. How do we determine the level of risk. By that combination of severity. And likely, I think a picture is worth a thousand words. Moving online to the accident. We’re talking about (in this standard) an unintended event that causes harm.

Accidents, Sequences and Consequences

Not all jurisdictions just consider accidental events, some consider deliberate harm as well. We’ll leave that out. A good example of that is work health and safety in Australia but no doubt we’ll get to that in another video sometime. And the accident sequences the progression of events. That results in an accident that leads to an. Now we’re going to illustrate the accident sequence in a moment but before we get there. We need to think about cousins.  here we’ve got a hazardous physical situation or state of a system. Often following some initiating event that may lead to an accident, a thing that may cause harm.

And then allied with that we have the idea of consequences. Of outcomes or an outcome. Resulting from. An. Event. Now that all sounds a bit woolly doesn’t it, let’s illustrate that. Hopefully, this will make it a lot clearer. Now. I’ve got a sequence here. We have. Causes. That might lead to a hazard. And the hazard might lead to different consequences. And that’s the accident. See. Now in this standard, they didn’t explicitly define causes.

Cause, Hazard, and Consequence

They’re just called events. But most mostly we will deal with causes and consequences in system safety. And it’s probably just easier to implement it. Whether or not you choose to explicitly address every cause. That’s often an optional step. But this is the accident Sequence that we’re looking at. These sorts of funnels are meant to illustrate the fact that they may be many causes for one hazard. And one has it may lead to many consequences on some of those consequences. Maybe. No harm at all.

We may not actually have an accident. We may get away with it. We may have a. Hazard. And. Know no harm may befall a human. And if we take all of this together that’s the accident sequence. Now it’s worth reiterating that just because a hazard exists, it does not necessarily lead to harm. But to get to harm, we must have a hazard; a hazard is both necessary and sufficient. To lead to harmful consequences. OK.

Hazards: an Example

And you can think of a hazard as an accident waiting to happen. You can think of it in lots of different ways, let’s think about an example, the hazard might be. Somebody slips. Okay well while walking and all. That slip might be caused by many things it might be a wet surface. Let’s say it’s been raining, and the pavement is slippery, or it might be icy. It might be a spillage of oil on a surface, or you’d imagine something slippery like ball bearings on a surface.

So, there’s something that’s caused the surface to become slippery. A person slips – that’s the hazard. Now the person may catch themselves; they may not fall over. They may suffer no injury at all. Or they might fall and suffer a slight injury; and, very occasionally, they might suffer a severe injury. It depends on many different factors. You can imagine if you slipped while going downstairs, you’re much more likely to be injured.

And younger, healthy, fit people are more likely to get over a fall without being injured, whereas if they’re very elderly and frail, a fall can quite often result in a broken bone. If an elderly person breaks a bone in a fall the chances of them dying within the next 12 months are quite high. They’re about one in three.

So, the level of risk is sensitive to a lot of different factors. To get an accurate picture, an accurate estimate of risk, we’re going to need to factor in all those things. But before we get to that, we’ve already said that hazards need not lead to harm. In this standard, we call it an incident, where a hazard has occurred; it could have progressed to an accident but didn’t, we call this an incident. A near miss.

We got away with it. We were lucky. Whatever you want to call it. We’ve had an incident but no he’s been hurt. Hopefully, that incident is being reported, which will help us to prevent an actual accident in the future.  That’s another very useful concept that reminds us that not all hazards result in harm. Sometimes there will be no accident. There will be no harm simply because we were lucky, or because someone present took some action to prevent harm to themselves or others.

Mitigation Strategies (Controls)

But we would really like to deliberately design out or avoid Hazards if we can. What we need is a mitigation strategy, we need a measure or measures that, when we put them into practice, reduce that risk. Normally, we call these things controls. Again, now we’ve illustrated this; we’ve added to the funnels. We’ve added some mitigation strategies and they are the dark blue dashed lines.

And they are meant to represent Barriers that prevent the accident sequence from progressing towards harm. And they have dashed lines because very few controls are perfect, you know everything’s got holes in it. And we might have several of them. But usually, no control will cover all possible causes, and very few controls will deal with all possible consequences.  That’s what those barriers are meant to illustrate.

That idea that picture will be very useful to us later. When we are thinking about how we’re going to estimate and evaluate risk overall and what risk reduction we have achieved. And how we talk about justifying what we’ve done is good. That’s a very powerful illustration. Well, let’s move on to safety requirements.

Safety Requirements

Now. I guess it’s no great surprise to say that requirements, once met, can contribute directly to the safety of the system. Maybe we’ve got a safety requirement that says all cars will be fitted with seatbelts. Let’s say we’ll be required to wear a seatbelt.  That makes the system safer.

Or the requirement might be saying we need to provide evidence of the safety of the system. And, the requirement might refer to a process that we’ve got to go through or a set kind of evidence that we’ve got to provide. Safety requirements can cover either or both of these.

The Essence of System Safety

Requirements. Covering. Safety of the system or demonstrating that the system is safe. Should give us assurance, which is adequate confidence or justified confidence. Supported with evidence by following a process. And we’ll talk more about the process. We meet safety requirements. We get assurance that we’ve done the right thing. And this really brings us to the essence of what system safety is, we’ve got all these requirements – everything is a requirement really – including the requirement. To demonstrate risk reduction.

And those requirements may apply to the system itself, the product. Or they may provide, or they may apply to the process that generates the evidence or the evidence. Putting all those things together in an organized and orderly way really is the essence of system safety, this is where we are addressing safety in a systematic way, in an orderly way. In an organized way. (Those words will keep coming back). That’s the essence of system safety, as opposed to the day-to-day task of keeping a workplace safe.

Maybe by mopping up spills and providing handrails, so people don’t slip over. Things like that. We’re talking about a more sophisticated level of safety. Because we have a more complex problem a more challenging problem to deal with. That’s system safety. We will start on the process now, and we begin with hazard identification and analysis; first, we need to identify and list the hazards, the Hazards and accidents associated with the system.

We’ve got a system, physical or not. What could go wrong? We need to think about all the possibilities. And then having identified some hazards we need to start doing some analysis, we follow a process. That helps us to delve into the detail of those hazards and accidents. And to define and understand the accident sequences that could result. In fact, in doing the analysis we will very often identify some more hazards that we hadn’t thought of before, it’s not a straight-through process it tends to be an iterative process.

Risk Reduction

And ultimately what we’re trying to do is reduce risk, we want a systematic process, which is what we’re describing now. A systematic process of reducing risk. And at some point, we must estimate the risk that we’re left with. Before and after all these controls, these mitigations, are applied. That’s risk estimation.  Again, there’s that systematic word, we’re going to use all the available information to estimate the level of risk that we’ve got left. Recalling that risk is a combination of severity and likelihood.

Now as we get towards the end of the process, we need to evaluate risk against set criteria. And those criteria vary depending on which country you’re operating in or which industry we’re in: what regulations apply and what good practice is relevant. All those things can be a factor. Now, in this case, this is a U.K. standard, so we’ve got two tests for evaluating risk. It’s a systematic determination using all the available evidence. And it should be an objective evaluation as far as we can make it.

Risk Evaluation

We should use certain criteria on whether a risk can be accepted or not. And in the U.K. there are two tests for this. As we’ve said before, there is ALARP, the ‘As Low As is Reasonably Practicable’ test, which says: Have we put into practice all reasonably practicable controls? (To reduce risk, this is a risk reduction target). And then there’s an absolute level of risk to consider as well. Because even if we’ve taken all practical measures, the risk remaining might still be so high as to be unacceptable to the law.

Now that test is specific to the U.K, so we don’t have to worry too much about it. The point is there are objective criteria, which we must test ourselves or measure ourselves against. An evaluation that will pop out the decision, as to whether a further risk reduction is necessary if the risk level is still too high. We might conclude that are still reasonably practicable measures that we could take. Then we’ve got to do it.

We have an objective decision-making process to say: have we done enough to reduce risk? And if not, we need to do some more until we get to the point where we can apply the test again and say yes, we’ve done enough. Right, that’s rather a long-winded way of explaining that. I apologize, but it is a key issue and it does trip up a lot of people.

Risk Acceptance

Now, once we’ve concluded that we’ve done enough to reduce risk and no further risk reduction is necessary, somebody should be in a position to accept that risk.  Again, it’s a systematic process, by which relevant stakeholders agree that risks may be accepted. In other words, somebody with the right authority has said yes, we’re going to go ahead with the system and put it into practice, implement it. The resulting risks to people are acceptable, providing we apply the controls.

And we accept that responsibility.  Those people who are signing off on those risks are exposing themselves and/or other people to risk. Usually, they are employees, but sometimes members of the public as well, or customers. If you’re going to put customers in an airliner you’re saying yes there is a level of risk to passengers, but that the regulator, or whoever, has deemed [the risk] to be acceptable. It’s a formal process to get those risks accepted and say yes, we can proceed. But again, that varies greatly between different countries, between different industries. Depending on what regulations and laws and practices apply. (We’ll talk about different applications in another section.)

Risk Management

Now putting all this together we call this risk management.  Again, that wonderful systematic word: a systematic application of policies, procedures, and practices to these tasks. We have hazard identification, analysis, risk estimation, risk evaluation, risk reduction & risk acceptance. It’s helpful to demonstrate that we’ve got a process here, where we go through these things in order. Now, this is a simplified picture because it kind of implies that you just go through the process once.

With a complex system, you go through the process at least once. We may identify further hazards when we get into Hazard Analysis and estimating risk. In the process of trying to do those things, even as late as applying controls and getting to risk acceptance. We may discover that we need to do additional work. We may try and apply controls and discover the controls that we thought were going to be effective are not effective.

Our evaluation of the level of risk and its acceptability is wrong because it was based on the premise that controls would be effective, and we’ve discovered that they’re not, so we must go back and redo some work. Maybe as we go through, we even discover Hazards that we hadn’t anticipated before. This can and does happen, it’s not necessarily a straight-through process. We can iterate through this process. Perhaps several times, while we are moving forward.

Safety Management

OK, Safety Management. We’ve gone to a higher level really than risk because we’re thinking about requirements as well as risk. We’re going to apply organization, we’re going to apply management principles to achieve safety with high confidence. For the first time, we’ve introduced this idea of confidence in what we’re doing. Well, I say the first time, this is insurance isn’t it? Assurance, having justified confidence, or appropriate confidence because we’ve got the evidence. And that might be product evidence too we might have tested the product to show that it’s safe.

We might have analyzed it. We might have said well we’ve shown that we follow the process that gives us confidence that our evidence is good. And we’ve done all the right things and identified all the risks.  That’s safety management. We need to put that in a safety management system, we’ve got a defined organizational structure, and we have defined processes, procedures, and methods. That gives us direction and control of all the activities that we need to put together in combination to effectively meet safety requirements and safety policy.

And our safety tests, whatever they might be. More and more now we’re thinking about top-level organization and planning to achieve the outcomes we need. With a complex system, a complex operating environment, and a complex application.

Safety Planning

Now I’ll just mention planning. Okay, we need a safety management plan that defines the strategy: how we’re going to get there, how are we going to address safety. We need to document that safety management system for a specific project. Planning is very important for effective safety. Safety is very vulnerable to poor planning. If a project is badly planned or not planned at all, it becomes very difficult to Do safety effectively, because we are dependent on the process, on following a rigorous process to give us confidence that all results are correct.  If you’ve got a project that is a bit haphazard, that’s not going to help you achieve the objectives.

Planning is important. Now the bit of that safety plan that deals with timescales, milestones, and other date-related information. We might refer to it as a safety program. Now being a UK Definition, British English has two spellings of program. The double-m-e-version of programme. Applies to that time-based progression, or milestone-based progression.

Whereas in the US and in Australia, for example, we don’t have those two words we just have the one word, ‘program’. Which Covers everything: computer programs, a program of work that might have nothing to do with or might not be determined by timescales or milestones. Or one that is. But the point is that certain things may have to happen at certain points in time or before certain milestones. We may need to demonstrate safety before we are allowed to proceed to tests and trials or before we are allowed to put our system into service.

Demonstrating Safety

We’ve got to demonstrate that Safety has been achieved before we expose people to risk.  That’s very simple. Now, finally, we’re almost at the end. Now we need to provide a demonstration – maybe to a regulator, maybe to customers – that we have achieved safety.  This standard uses the concept of a safety case. The safety case is basically, imagine a portfolio full of evidence.  We’ve got a structured argument to put it all together. We’ve got a body of evidence that supports the argument.

It provides a Compelling, Comprehensible (or understandable), and valid case that a system is safe. For a given application or use, in a given Operating environment.  Really, that definition of what a safety case is, harks back to that meaning of safety.  We’ve got something that really hits the nail on the head. And we might put all of that together and summarise it in a safety case report. That summarises those arguments and evidence, and documents progress against the Safe program.

Remember I said our planning was important. We started off by saying that we need to do this, that the other in order to achieve safety. Hopefully, in the end, in the safety report, we’ll be able to state that we’ve done exactly that. We did do all those things. We did follow the process rigorously. We’ve got good results. We’ve got a robust safety argument. With evidence to support it. In the end, it’s all written up in a report.

Documenting Safety

Now that isn’t always going to be called a safety case report; it might be called a safety assessment report or a design justification report. There are lots of names for these things. But they all tend to do the same kind of thing, where they pull together the argument as to why the system is safe. The evidence to support the argument, document progress against a plan or some set of process requirements from a standard or a regulator or just good practice in industry to say: Yes, we’ve done what we were expected to do.

The result is usually that’s what justifies [the system] getting past that milestone. Where the system is going into service and can be used. People can be exposed to those risks, but safely and under control.

Everyone’s a winner, as they say!

Copyright – Creative Commons Licence

Okay. I’ve used a lot of information from a UK government website. I’ve done that in accordance with the terms of its creative commons license, and you can see more about that here. We have complied with that, as we are required to, and to say to you that the information we’ve supplied is under the terms of this license.

Safety Concepts Part 2: More Resources

And for more resources and for more lessons on system safety. And other safe topics. I invite you to visit the safety artisan.com website  Thanks very much for watching. I hope you found that useful.

We’ve covered a lot of information there, but hopefully in a structured way. We’ve repeated the key concepts and you can see that in that standard. The key concepts are consistently defined, and they reinforce each other. In order to get that systematic, disciplined approach to safety, that’s what we need.

Anyway, that’s enough for me. I hope you enjoyed watching it and found that useful. I look forward to talking to you again soon. Please send me some feedback about what you thought about this video and also what you would like to see covered in the future.

Thank you for visiting The Safety Artisan. I look forward to talking to you again soon. Goodbye.

Safety Concepts Part 1 defines the meaning of ‘Safe’, and it is free. Return to the Start Here Page.

Categories
Lesson System Safety

Safety Assessment Techniques Overview

In Safety Assessment Techniques Overview we will look at how different analysis techniques can be woven together. How does one analysis feed into another? What do we need to get sufficient coverage to be confident that we’ve done enough?

Learning Objectives: Safety Assessment Techniques Overview

You will be able to:

  • List and ‘sequence’ the five types of risk analysis;
  • Describe how the types fit together as a whole;
  • Describe the benefits of each type of analysis;
  • Describe an example of each type of analysis;
  • Select analyses to meet your needs;
  • Design an analysis program for different applications; and
  • Understand issues driving the use of techniques and level of effort.

This is the ten-minute demo version of the full, 70-minute video.

Topics: Safety Assessment Techniques Overview

  • Overview of Sequence;
  • Hazard Identification;
  • Requirements Analysis;
  • Cause Analysis;
  • Consequence Analysis; and
  • Control Effectiveness Analysis.

Transcript: Safety Assessment Techniques Overview

Click Here to See the Transcript

Welcome to The Safety Artisan

I’m Simon, your host. And today we’ve got, quite a special subject.

I’m going to be talking about safety analysis techniques, and this is a special subject because it’s by special request from my friends at the University of Southern California. Thank you to them. And what we’re going to be doing in today’s session is an overview of these different techniques, their benefits and the options that you have for applying techniques in order to come up with a whole programme of analysis.

Let’s explain what I mean

What we’re going to get out of today is after this you will be able to list and sequence the five types of risk analysis, and it says sequence in inverted commas because, as we’ll see, it’s not quite as simple as just going through it once in sequence, and that’s it. We tend to reiterate, but anyway, there is a natural sequence to this stuff, and we’ll see what that is.

Secondly, you’ll be able to describe how these different types of analyses fit together and how they feed each other and complement each other. That’s very important. If we’re going to come up with a reasonable whole; we’re going to describe the benefits of each type of analysis.

I will provide at least one example of each type of analysis, sometimes more than one.

We’re going to talk about how you would select analyses to meet your needs when analysing a specific system. Because we don’t always need to do everything. We don’t always need to throw everything at the problem. some systems are simpler than others, and they don’t need, the whole works in order to get a decent result.

With that in mind, we’re going to be able to design an analysis programme for different applications or for different systems.

And finally, we’re going to understand the issues that drive the use of techniques and the level of effort. The level of rigour that we need to apply now, to set expectations. There’s no magic answer here. I can’t tell you that the amount of hours that you have to spend on a problem is X squared, plus whatever.

We can talk about the factors that drive it, but I cannot give you a nice cut and dried answer. It just doesn’t work like that.

Those were the learning objectives

What we’re going to talk about, we’re going to give an overview of the sequence and then I’m going to recap that at the end.

And then the five types of analyses we’re going to talk about in order hazard identification requirements, analysis, cause analysis or cause or analysis, consequence analysis and control, effectiveness, analysis or control, identification and effectiveness analysis.

I’m going to talk about a couple of other things during that, which will help us pull things together. But those are the five main types that I’m going to talk about. Those are the five types of analysis that I said you would be able to list.  We’ve covered one learning objective already.

I promised you we were going to look at the overview of the sequence.

And I think this is what pulls it all together and explains it powerfully. So the background to this is we’ve got, an accident or mishap sequence. Whatever you want to call it and we start with causes on the left and causes lead two a hazard, and then a has it can lead to multiple consequences.

That is what the bowtie here is representing. It’s showing that multiple causes can lead to a single hazard, and a single hazard can lead to multiple consequences.

Don’t worry too much about the bow tie. I’m not pushing that in particular, it’s a useful technique, but it’s not the only one. We’ll come onto that – that’s the background.

This is the accident sequence we’re trying to discover and understand.

I’m going to talk a lot about discovery and understanding

Yeah, typically, we will start with trying to identify hazards. There are techniques out there that will help us identify hazards associated with the system being used in a specific application, or purpose, in a specific operating environment.

Always bear in mind those three questions about the context, that help us to do this.

What’s the system? What are we using it for? and in what environment?

And if we change any of those things, then probably the hazards will change. But we start off to preliminary hazard identification, which is intended to identify hazards. Big, big arrow pointing at hazards, but also, inevitably, it will identify causes and consequences as well, because it’s not always clear. What is the hazard when you start? talking of discovery, we’re going to discover some stuff.

We may finally classify what we’re talking about later. we’re trying to discover hazards. In reality, we’re going to discover lots of stuff, but mainly we hope hazards, that’s stage one.

Now, then we’re actually going to step outside of the accident sequence itself. We’re going to do some requirements analysis, and the requirements analysis has to come after the PHIA because some safety requirements are driven by the presence of certain hazards.

If you’ve got a noise hazard somebody’s hearing might be affected, then regulations in multiple countries are going to require you to do certain things to monitor the noise. Let’s say or monitor the effect that it’s having on workers and put in place a program to handle that. The presence of certain hazards will drive certain requirements for safety controls or risk controls.

Then there are the broader requirements. Analysis of what does the law require, what the regulations require, codes of practise, etc. We’ll get onto that, and one of the things that requirements analysis is going to do is give us an initial stab of what we’ve got to have – certain controls because we’re required to. That’s a little bit of an aside in terms of the sequence, but it’s very, very important.

Thirdly, and, fourthly, once we’ve discovered some hazards, we’re going to need to understand what might cause those hazards and therefore how likely is the hazard to exist in particular circumstances, and then also think about the consequences that might arise from a hazard. And once we’ve explored those, we will be in a position to actually capture the risk.

 Because we will have some view on likelihood. And we would also have some view on the severity of consequences from considering the consequences. We’ll come onto that later.

Finally, having done all those other things, we will be in a position to take a much more systematic look at controls and say, we’ve got these causes. We’ve got these hazards. We’ve got these potential consequences.  What do I need to do to control this risk and prevent this accident sequence from playing out?

What I need to put in place to interrupt the accident sequence, and I’ve put the controls. The dashed lines indicate that we’ve got barriers to that accident sequence, and they are dashed because no control is perfect. (Other than gravity. But of course, if you turn your vehicle upside down, then gravity is working against you, so even gravity isn’t foolproof.)

No control is 100% effective

We need to just accept that and deal with that and understand. There is your overview of the sequence, and I’ve spent a bit of time talking about that because it is absolutely fundamental to everything you’re going to do.

But let’s move on and start to look at some of these individual types of techniques.

Which Safety Techniques do You Use? Leave a Comment below…

Categories
Course System Safety

The Safety Artisan is on Thinkific

I’m pleased to tell you that The Safety Artisan is on Thinkific!

Thinkific is a powerful and beautifully-presented online Learning Management System.  This will complement the existing Safety Artisan website.  

My first course will be ‘System Safety Assessment‘ with ten hours of instructional videos. The new course is here.

(Please note that this is the same course as my ‘Complete System Safety Analysis Bundle’ of 12 videos available here.  So, if you’ve already bought that – thanks very much – please don’t buy it again, as you already have all the material.)

What will the System Safety Assessment Course do for you?

Transcript of the Video

Read the Transcript Here:

Welcome to the System Safety Assessment course

In this course, you will gain knowledge, skills, and confidence.  You will gain knowledge of what is involved in system safety assessment.  The individual tasks and techniques you need to carry out.

But more importantly, how to put them together into a successful program and how to tailor all these different tasks keeping some, but leaving out others so that you get an efficient and effective safety program, no matter what application or what system you are working with.

So that’s the knowledge and the skills

You’ll also get the confidence to be able to get you started.  Now, there is no substitute for live face-to-face training and coaching.  But this format is much more accessible to you and much more reasonably priced.  So wherever you are in the world, whatever time and day you want to do your learning, you can access this course and you can gain confidence to get you started.

So if you’re worried about a job interview, what you’re going to say or you’re worried about how to do a job and there’s nobody around to help you.  Then this course will give you the confidence to get started and to be aware of the pitfalls before you begin.

So what makes me confident that I can help you?

Well, first of all, I’ve got 25 years of experience applying system safety.

And I’ve done that in the UK, in the United States, in Australia, and in the European Union.  I’ve seen a wide variety of legal jurisdictions that I’ve worked in.  Also, I’ve worked on a wide variety of systems.  I’ve worked on planes, trains, ships and submarines, software, and I.T. systems all kinds of stuff.

I’ve worked on some gigantic multibillion-dollar projects and some much smaller ones.  So I know how to pragmatically apply this stuff, at a reasonable scale without spending stupid amounts of money.

And in fact, as part of my job as a consultant, I spent half the time telling clients to do less and spend less and still get an effective result.  So that’s where I’m coming from.

I’ve also got experience teaching system safety in the classroom.  I’ve taught hundreds of students, from various different projects.  And now I have hundreds of online students, and I’m very pleased to be able to help all of those as well.

So that’s why I think that I can help you

And I hope that you will enjoy this course and get a lot out of it.  Thanks very much for considering The Safety Artisan.

What do you think of the new page?

Categories
Mil-Std-882E System Safety

Learn How to Perform System Safety Analysis

In this ‘super post,’ you’re going to Learn How to Perform System Safety Analysis. I’m going to point you to thirteen lessons that explain each of the ten analysis tasks, the analysis process, and how to combine the tasks into a program!

Follow the links to sample and buy lessons on individual tasks. You can get discount deals on a bundle of three tasks, or all twelve.

Discount Offer

BLACK FRIDAY SALE – GET THE BUNDLE FOR $468 $234!

Click here for 50% off on all thirteen lessons:

  • Safety Assessment Techniques Overview;
  • System Safety Process;
  • Design your System Safety Program; and
  • All ten System Safety Analysis tasks.

Introduction

Military Standard 882, or Mil-Std-882 for short, is one of the most widely used system-safety standards. As the name implies, this standard is used on US military systems, but it has found its way, sometimes in disguise, into many other programs around the world. It’s been around for a long time and is now in its fifth incarnation: 882E.

Unfortunately, 882 has also been widely misunderstood and misapplied. This is probably not the fault of the standard and is just another facet of its popularity. The truth is that any standard can be applied blindly – no standard is a substitute for competent decision-making.

In this series of posts, we will: provide awareness of this standard; explain how to use it; and discuss how to manage, tailor, and implement it. Links to each training session and to each section of the standard are provided in the following sections.

Mil-Std-882E Training Sessions

System Safety Process, here

Photo by Bonneval Sebastien on Unsplash

In this full-length (50 minutes) video, you will learn to:

  • Know the system safety process according to Mil-Std-882E;
  • List and order the eight elements;
  • Understand how they are applied;
  • Skilfully apply system safety using realistic processes; and
  • Feel more confident dealing with multiple standards.

In System Safety Process, we look a the general requirements of Mil-Std-882E. We cover the Applicability of the 882E tasks; the General requirements; the Process with eight elements; and the application of process theory to the real world.

Design Your System Safety Analysis Program

Photo by Christina Morillo from Pexels

Learn how to Design a System Safety Program for any system in any application.

Learning Objectives. At the end of this course, you will be able to:

  • Define what a risk analysis program is;
  • List the hazard analysis tasks that make up a program;
  • Select tasks to meet your needs; and
  • Design a tailored risk analysis program for any application.

This lesson is available as part of the twelve-lesson bundle (see the bottom of this post) or you can get it as part of my ‘SSRAP’ course at Udemy here.

Analysis: 200-series Tasks

Preliminary Hazard Identification, Task 201

Identify Hazards.

In this video, we find out how to create a Preliminary Hazard List, the first step in safety assessment. We look at three classic complementary techniques to identify hazards and their pros and cons. This includes all the content from Task 201, and also practical insights from my 25 years of experience with Mil-Std-882.

Preliminary Hazard Analysis, Task 202

See More Clearly.

In this 45-minute session, The Safety Artisan looks at Preliminary Hazard Analysis, or PHA, which is Task 202 in Mil-Std-882E. We explore Task 202’s aim, description, scope, and contracting requirements. We also provide value-adding commentary and explain the issues with PHA – how to do it well and avoid the pitfalls.

System Requirements Hazard Analysis, Task 203

Law, Regulations, Codes of Practice, Guidance, Standards & Recognised Good Practice.

In this 45-minute session, The Safety Artisan looks at Safety Requirements Hazard Analysis, or SRHA, which is Task 203 in the Mil-Std-882E standard. We explore Task 203’s aim, description, scope, and contracting requirements. SRHA is an important and complex task, which needs to be done on several levels to be successful. This video explains the issues and discusses how to perform SRHA well.

Triple Bundle Offer

Click here for a half-price deal on the three essential tasks: Preliminary Hazard Identification, Preliminary Hazard Analysis, and Safety Requirements Hazard Analysis.

Sub-system Hazard Analysis, Task 204

Breaking it down to the constituent parts.

In this video lesson, The Safety Artisan looks at Sub-System Hazard Analysis, or SSHA, which is Task 204 in Mil-Std-882E. We explore Task 204’s aim, description, scope, and contracting requirements. We also provide value-adding commentary and explain the issues with SSHA – how to do it well and avoid the pitfalls.

System Hazard Analysis, Task 205

Putting the pieces of the puzzle together.

In this 45-minute session, The Safety Artisan looks at System Hazard Analysis, or SHA, which is Task 205 in Mil-Std-882E. We explore Task 205’s aim, description, scope, and contracting requirements. We also provide value-adding commentary, which explains SHA – how to use it to complement Sub-System Hazard Analysis (SSHA, Task 204) in order to get the maximum benefits for your System Safety Program.

Operating and Support Hazard Analysis, Task 206

Operate it, maintain it, supply it, dispose of it.

In this full-length session, The Safety Artisan looks at Operating & Support Hazard Analysis, or O&SHA, which is Task 206 in Mil-Std-882E. We explore Task 205’s aim, description, scope, and contracting requirements. We also provide value-adding commentary, which explains O&SHA: how to use it with other tasks; how to apply it effectively on different products; and some of the pitfalls to avoid. We refer to other lessons for specific tools and techniques, such as Human Factors analysis methods.

Health Hazard Analysis, Task 207

Hazards to human health are many and various.

In this full-length (55-minute) session, The Safety Artisan looks at Health Hazard Analysis, or HHA, which is Task 207 in Mil-Std-882E. We explore the aim, description, and contracting requirements of this complex Task, which covers: physical, chemical & biological hazards; Hazardous Materials (HAZMAT); ergonomics, aka Human Factors; the Operational Environment; and non/ionizing radiation. We outline how to implement Task 207 in compliance with Australian WHS. 

Functional Hazard Analysis, Task 208

Components where systemic failure dominates random failure.

In this full-length (40-minute) session, The Safety Artisan looks at Functional Hazard Analysis, or FHA, which is Task 208 in Mil-Std-882E. FHA analyses software, complex electronic hardware, and human interactions. We explore the aim, description, and contracting requirements of this Task, and provide extensive commentary on it. 

System-Of-Systems Hazard Analysis, Task 209

Existing systems are often combined to create a new capability.

In this full-length (38-minute) session, The Safety Artisan looks at Systems-of-Systems Hazard Analysis, or SoSHA, which is Task 209 in Mil-Std-882E. SoSHA analyses collections of systems, which are often put together to create a new capability, which is enabled by human brokering between the different systems. We explore the aim, description, and contracting requirements of this Task, and an extended example to illustrate SoSHA. (We refer to other lessons for special techniques for Human Factors analysis.)

Environmental Hazard Analysis, Task 210

Environmental requirements in the USA, UK, and Australia.

This is the full (one hour) session on Environmental Hazard Analysis (EHA), which is Task 210 in Mil-Std-882E. We explore the aim, task description, and contracting requirements of this Task, but this is only half the video. We then look at environmental requirements in the USA, UK, and Australia, before examining how to apply EHA in detail under the Australian/international regime. This uses my practical experience of applying EHA. 

Discount Offer

Click here for a bumper deal on all twelve lessons:

  • System Safety Process;
  • Design your System Safety Program; and
  • All ten System Safety Analysis tasks.
Categories
System Safety

FAQ on System Safety

In this FAQ on System Safety, I share some lessons that will explain the basics right through to more advanced topics!

The system safety concept calls for a risk management strategy based on identification, analysis of hazards and application of remedial controls using a systems-based approach.

Harold E. Roland; Brian Moriarty (1990). System Safety Engineering and Management.

In ‘Safety Concepts Part 1’, we look at the meaning of the term “safe”. This fundamental topic provides the foundation for all other safety topics, and it’s simple!

In this 45-minute free video, I discuss System Safety Principles, as set out by the US Federal Aviation Authority in their System Safety Handbook

In System Safety Programs, we learn how to Design a System Safety Program for any system in any application.

The Common System Safety Questions

To see them click here:

is system safety, system safety is, what’s system safety, what is system safety management, what is system safety assessment, what is a system safety program plan, what is safety system of work, [what is safe system of work], what’s system safety, which active safety system, why system safety, system safety faa, system safety management, system safety management plan, system safety mil std, system safety methodology, system safety mil-std-882d, system safety mil-std-882e, system safety program plan, system safety process, system safety ppt system safety principles, system safety perspective, system safety precedence, system safety analysis, system safety analysis handbook, system safety analysis techniques, system safety courses, system safety assessment.

System safety is a specialty within system engineering that supports program risk management. … The goal of System Safety is to optimize safety by the identification of safety related risks, eliminating or controlling them by design and/or procedures, based on acceptable system safety precedence.

FAA System Safety Handbook, Chapter 3: Principles of System Safety
December 30, 2000

If you don’t find what you want in this FAQ on Risk Management, there are plenty more lessons under Start Here and System Safety Analysis topics. Or just enter ‘system safety’ into the search function at the bottom of any page.

Categories
System Safety

Reflections on a Career in Safety, Part 4

In ‘Reflections on a Career in Safety, Part 4’, I want to talk about Consultancy, which is mostly what I’ve been doing for the last 20 years!

Consultancy

As I said near the beginning, I thought that in the software supportability team, we all wore the same uniform as our customers. We didn’t cost them anything. We were free. We could turn up and do a job. You would think that would be an easy sell, wouldn’t you?

Not a bit of it.  People want there to be an exchange of tokens. If we’re talking about psychology, if something doesn’t cost them anything, they think, well, it can’t be worth anything. So [how much] we pay for something really does affect our perception of whether it’s any good.

Photo by Cytonn Photography on Unsplash

So I had to go and learn a lot of sales and marketing type stuff in order to sell the benefits of bringing us in, because, of course, there was always an overhead of bringing new people into a program, particularly if they were going to start asking awkward questions, like how are we going to support this in service? How are we going to fix this? How is this going to work?

So I had to learn a whole new language and a whole new way of doing business and going out to customers and saying, we can help you, we can help you get a better result. Let’s do this. So that was something new to learn. We certainly didn’t talk about that at university.  Maybe you do more business focussed stuff these days. You can go and do a module, I don’t know, in management or whatever; very, very useful stuff, actually. It’s always good to be able to articulate the benefits of doing something because you’ve got to convince people to pay for it and make room for it.

Doing Too Little, or Too Much

And in safety, I’ve got two [kinds of] jobs.

First of all, I suppose it’s the obvious one. Sometimes you go and see a client, they’re not aware of what the law says they’re supposed to do or they’re not aware that there’s a standard or a regulation that says they’ve got to do something – so they’re not doing it. Maybe I go along and say, ah, look, you’ve got to do this. It’s the law. This is what we need to do.

Photo by Quino Al on Unsplash

Then, there’s a negotiation because the customer says, oh, you consultants, you’re just making up work so you can make more money. So you’ve got to be able to show people that there’s a benefit, even if it’s only not going to jail. There’s got to be a benefit. So you help the clients to do more in order to achieve success.

You Need to Do Less!

But actually, I spend just as much time advising clients to do less, because I see lots of clients doing things that appear good and sensible. Yes, they’re done with all the right motivation. But you look at what they’re doing and you say, well, this you’re spending all this money and time, but it’s not actually making a difference to the safety of the product or the process or whatever it is.

You’re chucking money away really, for very little or no effect.  Sometimes people are doing work that actually obscures safety. They dive into all this detail and go, well, actually, you’ve created all this data that’s got to be managed and that’s actually distracting you from this thing over here, which is the thing that’s really going to hurt people.

So, [often] I spend my time helping people to focus on what’s important and dump the comfort blanket, OK, because lots of times people are doing stuff because they’ve always done it that way, or it feels comforting to do something. And it’s really quite threatening to them to say, well, actually, you think you’re doing yourself a favor here, but it doesn’t actually work. And that’s quite a tough sell as well, getting people to do less.

Photo by Prateek Katyal on Unsplash

However, sometimes less is definitely more in terms of getting results.

Part 5 will follow next week!

New to System Safety? Then start here. There’s more about The Safety Artisan here. Subscribe for free regular emails here.

Categories
System Safety

Reflections on a Career in Safety, Part 3

In ‘Reflections on a Career in Safety, Part 3’ I continue talking about different kinds of Safety, moving onto…

Projects and Products

Then moving on to the project side, where teams of people were making sure a new aeroplane, a new radio, a new whatever it might be, was going to work in service; people were going to be able to use it, easily, support it, get it replaced or repaired if they had to. So it was a much more technical job – so lots of software, lots of people, lots of process and more people.

Moving to the software team was a big shock to me. It was accidental. It wasn’t a career move that I had chosen, but I enjoyed it when I got there.  For everything else in the Air Force, there was a rule. There was a process for doing this. There were rules for doing that. Everything was nailed down. When I went to the software team, I discovered there are no rules in software, there are only opinions.

The ‘H’ is software development is for ‘Happiness’

So straight away, it became a very people-focused job because if you didn’t know what you were doing, then you were a bit stuck.  I had to go through a learning curve, along with every other technician who was on the team. And the thing about software with it being intangible is that it becomes all about the process. If a physical piece of kit like the display screen isn’t working, it’s pretty obvious. It’s black, it’s blank, nothing is happening. It’s not always obvious that you’ve done something wrong with software when you’re developing it.

So we were very heavily reliant on process; again, people have got to decide what’s the right process for this job? What are we going to do? Who’s going to do it? Who’s able to do it? And it was interesting to suddenly move into this world where there were no rules and where there were some prima donnas.

Photo by Sandy Millar on Unsplash

We had a handful of really good programmers who could do just about anything with the aeroplane, and you had to make the best use of them without letting them get out of control.  Equally, you had people on the other end of the scale who’d been posted into the software team, who really did not want to be there. They wanted to get their hands dirty, fixing aeroplanes. That’s what they wanted to do. Interesting times.

From the software team, I moved on to big projects like Eurofighter, that’s when I got introduced to:

Systems Engineering

And I have no problem with plugging systems engineering because as a safety engineer, I know [that] if there is good systems engineering and good project management, I know my job is going to be so much easier. I’ve turned up on a number of projects as a consultant or whatever, and I say, OK, where’s the safety plan? And they say, oh, we want you to write it. OK, yeah, I can do that. Whereas the project management plan or where’s the systems engineering management plan?

If there isn’t one or it’s garbage – as it sometimes is – I’m sat there going, OK, my just my job just got ten times harder, because safety is an emergent property. So you can say a piece of kit is on or off. You can say it’s reliable, but you can’t tell whether it’s safe until you understand the context. What are you asking it to do in what environment? So unless you have something to give you that wider and bigger picture and put some discipline on the complexity, it’s very hard to get a good result.

Photo by Sam Moqadam on Unsplash

So systems engineering is absolutely key, and I’m always glad to work with the good systems engineer and all the artifacts that they’ve produced. That’s very important. So clarity in your documentation is very helpful. Being [involved], if you’re lucky, at the very beginning of a program, you’ve got an opportunity to design safety, and all the other qualities you want, into your product. You’ve got an opportunity to design in that stuff from the beginning and make sure it’s there, right there in the requirements.

Also, systems engineers doing the requirements, working out what needs to be done, what you need the product to do, and just as importantly, what you need it not to do, and then passing that on down the chain. That’s very important. And I put in the title “managing at a distance” because, unlike in the operations world where you can say “that’s broken, can you please go and fix it”.

Managing at a Distance

It’s not as direct as that.  You’re looking at your process, you’re looking at the documentation, you’re working with, again, lots and lots of people, not all of whom have the same motivation that you do.

Photo by Bonneval Sebastien on Unsplash

Industry wants to get paid. They want to do the minimum work to get paid, [in order] to maximize their profit. You want the best product you can get. The pilots want something that punches holes in the sky and looks flash and they don’t really care much about much else, because they’re quite inoculated to risk.

So you’ve got people with competing motivations and everything has got to be worked indirectly. You don’t get to control things directly. You’ve got to try and influence and put good things in place, in almost an act of faith that, [you put] good things in place and good things will result.  A good process will produce a good product. And most of the time that’s true. So (my last slide on work), I ended up doing consultancy, first internally and then externally.

Part 4 will follow next week!

New to System Safety? Then start here. There’s more about The Safety Artisan here. Subscribe for free regular emails here.

Categories
System Safety

Reflections on a Career in Safety, Part 2

In ‘Reflections on a Career in Safety, Part 2’ I move on to …

Different Kinds of Safety

So I’m going to talk a little bit about highlights, that I hope you’ll find useful.  I went straight from university into the Air Force and went from this kind of [academic] environment to heavy metal, basically.  I guess it’s obvious that wherever you are if you’re doing anything in industry, workplace health and safety is important because you can hurt people quite quickly. 

Workplace Health and Safety

In my very first job, we had people doing welding, high voltage electrics, heavy mechanical things; all [the equipment was] built out of centimeter-thick steel. It was tough stuff and people still managed to bend it. So the amount of energy that was rocking around there, you could very easily hurt people.  Even the painters – that sounds like a safe job, doesn’t it? – but aircraft paint at that time a cyanoacrylate. It was a compound of cyanide that we used to paint aeroplanes with.

All the painters and finishers had to wear head-to-toe protective equipment and breathing apparatus. If you’re giving people air to breathe, if you get that wrong, you can hurt people quite quickly. So even managing the hazards of the workplace introduced further hazards that all had to be very carefully controlled.

Photo by Ömer Yıldız on Unsplash

And because you’re in operations, all the decisions about what kind of risks and hazards you’re going to face, they’ve already been made long before.  Decisions that were made years ago, when a new plane or ship or whatever it was, was being bought and being introduced [into service]. Decisions made back then, sometimes without realizing it, meant that we were faced with handling certain hazards and you couldn’t get rid of them. You just had to manage them as best you could.

Overall, I think we did pretty well. Injuries were rare, despite the very exciting things that we were dealing with sometimes.  We didn’t have too many near misses – not that we heard about anyway. Nevertheless, that [risk] was always there in the background. You’re always trying to control these things and stop them from getting out of control.

One of the things about a workplace in operations and support, whether you’re running a fleet of aeroplanes or you’re servicing some kit for somebody else and then returning it to them, it tends to be quite a people-centric job. So, large groups of people doing the job, supervision, organization, all that kind of stuff.  And that can all seem very mundane, a lot of HR-type stuff. But it’s important and it’s got to be dealt with.

So the real world of managing people is a lot of logistics. Making sure that everybody you need is available to do the work, making sure that they’ve got all the kit, all the technical publications that tell them what to do, the information that they need.  It’s very different to university – a lot of seemingly mundane stuff – but it’s got to be got right because the consequences of stuffing up can be quite serious.

Safe Systems of Work

So moving on to some slightly different topics, when I got onto working with Aeroplanes, there was an emphasis on a safe system of work, because doing maintenance on a very complex aeroplane was quite an involved process and it had to be carefully controlled. So we would have what’s usually referred to as a Permit to Work system where you very tightly control what people are allowed to do to any particular plane. It doesn’t matter whether it’s a plane or a big piece of mining equipment or you’re sending people in to do maintenance on infrastructure; whatever it might be, you’ve got to make sure that the power is disconnected before people start pulling it apart, et cetera, et cetera.

Photo by Leon Dewiwje on Unsplash

And then when you put it back together again, you’ve got to make sure that there aren’t any bits leftover and everything works before you hand it back to the operators because they’re going to go and do some crazy stuff with it. You want to make sure that the plane works properly. So there was an awful lot of process in that. And in those days, it was a paperwork process. These days, I guess a lot would be computerized, but it’s still the same process.

If you muck up the process, it doesn’t matter whether [it is paper-based or not].  If you’ve got a rubbish process, you’re going to get rubbish results and it [computerization] doesn’t change that. You just stuff up more quickly because you’ve got a more powerful tool. And for certain things we had to take, I’ve called it special measures. In my case, we were a strike squadron, which meant our planes would carry nuclear weapons if they had to.

Special Processes for Special Risks

So if the Soviets charged across the border with 20,000 tanks and we couldn’t stop them, then it was time to use – we called them buckets of sunshine. Sounds nice, doesn’t it? Anyway, so there were some fairly particular processes and rules for looking after buckets of sunshine. And I’m glad to say we only ever used dummies. But when you when the convoy arrived and yours truly has to sign for the weapon and then the team starts loading it, then that does concentrate your mind as an engineer. I think I was twenty-two, twenty-three at the time.  

Photo by Oscar Ävalos on Unsplash

Somebody on [our Air Force] station stuffed up on the paperwork and got caught. So that was two careers of people my age, who I knew, that were destroyed straight away, just by not being too careful about what they were doing. So, yeah, that does concentrate the mind.  If you’re dealing with, let’s say you’re in a major hazard facility, you’re in a chemical plant where you’ve got perhaps thousands of tonnes of dangerous chemicals, there are some very special risk controls, which you have to make sure are going to work most of the time.

And finally, there is ‘airworthiness’: decisions about whether we could fly an aeroplane, even though some bits of it were not working. So that was a decision that I got to make once I got signed off to do it. But it’s a team job. You talk to the specialists who say, this bit of the aeroplane isn’t working, but it doesn’t matter as long as you don’t do “that”.

Photo by Eric Bruton on Unsplash

So you have to make sure that the pilots knew, OK, this isn’t working.  This is the practical effect from your [operator’s] point of view. So you don’t switch this thing on or rely on this thing working because it isn’t going to work. There were various decisions about [airworthiness] that were an exciting part of the job, which I really enjoyed.  That’s when you had to understand what you were doing, not on your own, because there were people who’d been there a lot longer than me.  But we had to make things work as best we could – that was life.

Part 3 will follow next week!

New to System Safety? Then start here. There’s more about The Safety Artisan here. Subscribe for free regular emails here.