Here is the full transcript: Preliminary Hazard Analysis.
Preliminary Hazard Analysis
Hello and welcome to the Safety Artisan, where you’ll find professional, pragmatic and impartial safety training resources. So, we’ll get straight on to our session and it is the 8th February 2020.
Now we’re going to talk today about Preliminary Hazard Analysis (PHA). This is Task 202 in Military Standard 882E, which is a system safety engineering standard. It’s very widely used, mostly on military equipment, but it does turn up elsewhere. This standard is of wide interest to people and Task 202 is the second of the analysis tasks. It’s one of the first things that you will do on a system safety program and therefore one of the most informative. This session forms part of a series of lessons that I’m doing on Mil-Std-882E.
Topics for This Session
What are we going to cover in this session? Quite a lot! The purpose of the task, a task description, recording and scope. How we do risk assessments against Tables 1, 2 and 3. Basically, it is severity, likelihood and the overall risk matrix. We will talk about all three, about risk mitigation and using the order of preference for risk mitigation, a little bit of contracting and then a short commentary from myself. In fact, I’m providing commentary all the way through. So, let’s crack on.
Task 202 Purpose
The purpose of Task 202 is to perform and document a preliminary hazard analysis, or PHA for short, to identify hazards, assess the initial risks and identify potential mitigation measures. We’re going to talk about all of that.
First, the task description is quite long here. And as you can see, I’ve highlighted some stuff that I particularly want to talk about.
It says “the contractor” [does this or that], but it doesn’t really matter who is doing the analysis, and actually, the customer needs to do some to inform themselves, otherwise they won’t really understand what they’re doing. Whoever does it needs to perform and document PHA. It’s about determining initial risk assessments. There’s going to be more work, more detailed work done later. But for now, we’re doing an initial risk assessment of identified hazards. And those hazards will be associated with the design or the functions that we’re proposing to introduce. That’s very important. We don’t need a design to do this. We can get in early when we have user requirements, functional requirements, that kind of thing.
Doing this work will help us make better requirements for the system. So, we need to evaluate those hazards for severity and probability. It says based on the best available data. And of course, early in a program, that’s another big issue. We’ll talk about that more later. It says including mishap data as well, if accessible. ‘Mishap’ is the American term; it means an accident, but we’re avoiding any kind of suggestion about whether it is accidental or deliberate. It might be stupidity, deliberate, whatever. It’s a mishap. It’s an undesirable event. We look for accessible data from similar systems, legacy systems and other lessons learned. I’ve talked about that a little bit in the Task 201 lesson, and there’s more on that today under commentary. We need to look at provisions and alternatives, meaning design provisions and design alternatives, in order to reduce risks, and at adding mitigation measures to eliminate hazards or, if we can’t, reduce the associated risk. We need to include all of that. That’s the task description. That’s a good overview of the task and what we need to talk about.
Recording & Scope
First, recording and scope. As always with these tasks, we’ve got to document the results of the PHA in a hazard tracking system. Now, a word on terminology: we might call it a hazard tracking system; we might call it a hazard log; we might call it a risk register. It doesn’t really matter what it’s called. The key point is it’s a tracking system. It’s a live document, as people say. It’s a spreadsheet or a database, something like that. It’s something relatively easy to update and change. And we can track changes through the safety program once we do more analysis, because things will change. We should expect to get some results and to refine them and change them as time goes on. Very important point.
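To make the idea of a live tracking record concrete, here is a minimal sketch of one hazard log entry that keeps its own change history. The class name `HazardRecord`, its fields and the example hazard are all illustrative assumptions; the standard does not prescribe this layout, and a spreadsheet row would do the same job.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class HazardRecord:
    """One row of a hypothetical hazard log (field names are assumptions)."""
    hazard_id: str
    description: str
    severity: str       # e.g. "Catastrophic", "Critical", "Marginal", "Negligible"
    probability: str    # e.g. "Frequent" ... "Improbable", "Eliminated"
    mitigations: list = field(default_factory=list)
    history: list = field(default_factory=list)  # (date, note) pairs, oldest first

    def update(self, when, note, *, severity=None, probability=None):
        """Revise the assessment and always record why it changed."""
        if severity is not None:
            self.severity = severity
        if probability is not None:
            self.probability = probability
        self.history.append((when, note))


# Hypothetical usage: an initial PHA entry that is refined by later analysis.
rec = HazardRecord("H-001", "Fuel leak near hot exhaust", "Catastrophic", "Remote")
rec.update(date(2020, 2, 8), "Initial PHA entry")
rec.update(date(2020, 6, 1), "Design review: leak path shielded", probability="Improbable")
```

The point of the `history` list is that the record is expected to change; every revision leaves a trace of when and why.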
Scope. Big section this. Let me just check. Yes, we’ve got three slides on the scope. This does go on and on. The scope of the PHA is to consider the potential contribution from a lot of different areas. We might be considering a whole system or a subsystem, depending on how complex the thing is we’re dealing with. And we’re going to consider mishaps, the accidents and incidents, near misses, whatever might occur from components of the system (a. System components), energy sources (b. Energy sources), ordnance (c. Ordnance), well, that’s bullets and explosives to you and me, rockets and that kind of stuff.
Hazardous materials (d. Hazardous Materials (HAZMAT)), interfaces and controls (e. Interfaces and controls), interface considerations to other systems (f. Interface considerations to other systems when in a network or System-of-Systems (SoS) architecture), external systems. Maybe you’ve got a network of different systems talking to each other. Sometimes that’s called a system of systems architecture. Don’t worry about the definitions. Our system probably interacts and talks to other systems, or it relies on other systems in some way, or other systems rely on it. There are external interfaces. That’s the point.
We might think about material compatibilities (g. Material Compatibilities), because different materials and chemicals are not compatible with others, and inadvertent activation (h. Inadvertent activation).
Now, I’ve highlighted i. (Commercial-Off-the-Shelf (COTS), Government-Off-the-Shelf (GOTS), Non-Developmental Items (NDIs), and Government-Furnished Equipment (GFE).) because it’s something that often gets neglected. We also need to think about stuff that’s already been developed. The general term is NDIs and it might be commercial off the shelf, it might be a government off the shelf system, or government-furnished equipment (GFE). It doesn’t really matter what it is. These days, especially, very few complex systems are developed purely from scratch. We try and reuse stuff wherever we can in order to keep costs down and schedule down.
We’re going to need to integrate all these things and consider how they contribute to the overall risk picture. And as I say, that’s not often done well. Well, it’s hardly ever done well. It’s often not done at all. But it needs to be, even if only crudely. That’s better than nothing.
Item j. (j. Software, including software developed by other contractors or sources. Design criteria to control safety-significant software commands and responses (e.g., inadvertent command, failure to command, untimely command or responses, and inappropriate magnitude) shall be identified, and appropriate action shall be taken to incorporate these into the software (and related hardware) specifications): we need to include software, including software developed elsewhere. Again, that’s very difficult, often not done well. Software is intangible. If somebody else has developed it, maybe we don’t have the rights to see the design, or code, or anything like that. Effectively it’s a black box to us. We need to look at software. I’m not going to bother going through all the blurb there.
Another big thing in part k (k. Operating environment and constraints) is we need to look at the operating environment. Because a piece of kit that behaves in a certain way in one environment, you put it in a different environment and it behaves differently. And it might become much more dangerous. You never know. And the constraints that we put on the system. Operating environment is very big. And in fact, if you see the lesson I did on the definition of safety, we can’t really define whether a system is safe or not until we define the operating environment. It’s that important, a big point there.
And then the third slide of three: procedures (l. Procedures for operating, test, maintenance, built-in-test, diagnostics, emergencies, explosive ordnance render-safe and emergency disposal). Again, these often don’t appear until later unless, of course, we’ve got an off-the-shelf system. But if we have got an off-the-shelf system, there should be a user manual, there should be maintenance manuals, there should be warnings and cautions, all this kind of stuff. So, we should be looking for procedures for all these things to see what we could learn from them. We want to think about the different modes (m. Modes) of operation of the system. We want to think about health hazards (n. Health hazards) to people, and environmental impacts (o. Environmental Impacts), because this standard includes environmental impacts.
We need to think about human factors, human engineering and human error analysis (p. Human factors engineering and human error analysis of operator functions, tasks, and requirements). And it says operator functions, tasks and requirements, but there’s also maintenance, disposal and storage. All the good stuff. Again, Human Factors is another big issue. Again, it’s not often done well, but actually, if you get a human factors specialist in early, you can do a lot of good work and save yourself a lot of money, and time, and aggravation by thinking about things early on.
We need to think about life support requirements (q. Life support requirements and safety implications in manned systems, including crash safety, egress, rescue, survival, and salvage). If the system is crewed or staffed in some way, I’m thinking about, well, ‘What happens if it crashes?’ ‘How do we get out?’ ‘How do we rescue people?’ ‘How do we survive?’ ‘How do we salvage the system?’
Event-unique hazards (r. Event-unique hazards). Well, that’s kind of a catch-all for when your system does something unusual. If it does something unusual you need to think about it.
And then part s. is infrastructure (s. Built infrastructure, real property installed equipment, and support equipment): built infrastructure, real property installed equipment and support equipment.
And then malfunctions (t. Malfunctions of the SoS, system, subsystems, components, or software) of all the above.
I’m just going to whizz back and forth. We’ve got up to sub-item t there. We’ve got an awful lot of stuff there to consider. Now, of course, this is kind of a hazard checklist, isn’t it? We need to look at all that stuff. And in that respect, that’s excellent, and we should aim to do something on all of them, just to see if they’re relevant or not, if nothing else. The mistake people often make is that, because they can’t do something perfect and comprehensive, they don’t do anything. We’ve got a lot of things to go through here. And it’s much better to have a go at all these things early and do a bit of rough work in order to learn some stuff about our system than to do nothing at all. It may be difficult to do some of these things, the software, the COTS, things where we don’t have access to all the information, but it’s better to do a little bit of work early than to do nothing while waiting for the day to arrive when we’ll be able to do it perfectly with all the information. Because guess what? That day never comes! Get in and have a go at everything early, even if it’s only to say, ‘I know nothing about this subject, and we need to investigate it.’ Those are the pros and cons of this approach. Ideally, we need to do all these things, but it can be difficult.
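One way to use the a. to t. list as a working checklist is sketched below: every item gets at least a one-line note, even if that note only says it still needs investigating. The item titles are shortened from the scope list above; the `outstanding` helper and the example notes are assumptions made up for illustration.

```python
# Task 202 scope items a. to t., with titles shortened from the list above.
CHECKLIST = {
    "a": "System components",
    "b": "Energy sources",
    "c": "Ordnance",
    "d": "Hazardous materials (HAZMAT)",
    "e": "Interfaces and controls",
    "f": "Interface considerations to other systems",
    "g": "Material compatibilities",
    "h": "Inadvertent activation",
    "i": "COTS, GOTS, NDIs and GFE",
    "j": "Software",
    "k": "Operating environment and constraints",
    "l": "Procedures",
    "m": "Modes",
    "n": "Health hazards",
    "o": "Environmental impacts",
    "p": "Human factors engineering and human error analysis",
    "q": "Life support requirements",
    "r": "Event-unique hazards",
    "s": "Built infrastructure and support equipment",
    "t": "Malfunctions",
}


def outstanding(notes):
    """Return the checklist items that have no note recorded against them yet."""
    return [f"{key}. {title}" for key, title in CHECKLIST.items()
            if not notes.get(key, "").strip()]


# Hypothetical first pass: rough notes are fine, silence is not.
notes = {
    "c": "Not applicable: constraint placed on design, no explosive ordnance.",
    "j": "Third-party software, no design data yet. Needs investigation.",
}
todo = outstanding(notes)
```

Here `todo` lists everything nobody has looked at yet, which is exactly the gap the early, rough pass is meant to close.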
Moving on. Well, we’ve looked at a broad scope of things for all the hazards that we identify, and there are various techniques you can use. The PHA has got to include a risk assessment. That means that we’ve got to think about likelihood and severity, and then that gives us an overall picture of risk when we combine the two together. That’s Tables I and II.
And then, forget risk assessment codes, I’m not sure why that’s in there. Table III is the risk matrix, and 882 has a standard risk matrix. And it says to use that unless you’ve got a tailored matrix for your system that’s been approved for use. And in this case, it says approved effectively in accordance with US Department of Defense policy. But it’s whoever is the acquiring organization, the authority, the customer, the purchaser, whatever you want to call it, the end-user. We’ll talk about that more in a sec.
Table I, Severity
Let’s start by looking at severity, which in many ways is the easiest thing to look at. In this standard we’ve got an approach based on harm to people, harm to the environment, and monetary loss due to smashing stuff up. At the top is the catastrophic accident, Category 1. This accident could result in death, permanent total disability, irreversible significant environmental impact, or monetary loss of $10 million or more. That’s 10 million US dollars. This version of the standard was created in 2012, so inflation has probably had an effect since then. A critical accident, Category 2, could cause permanent partial disability, injuries or occupational illness that can hospitalize three people, reversible significant environmental impact, or losses between $1 million and $10 million. Then we go down to marginal, Category 3: injury or illness resulting in lost workdays for one person, reversible moderate environmental impact, or monetary loss between $100,000 and $1 million. And then finally negligible, Category 4, is less than that: an injury or illness that doesn’t result in any lost time at work, minimal environmental impact, or a monetary loss of less than $100,000. That’s easy to do in this standard. We just ask, ‘What are the losses that we think could result from the worst-case, reasonable scenario for an accident?’ That’s straightforward.
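As a rough illustration of the monetary-loss part of Table I only, the sketch below maps a loss in US dollars to a severity category. The function names are made up for this example, and treating each threshold as an inclusive lower bound is an assumption about how the boundaries are read; harm to people and to the environment still has to be judged separately, and the worst of the three results is the one that counts.

```python
def severity_from_loss(loss_usd):
    """Map a monetary loss (US dollars) to a (category number, name) pair.

    Thresholds follow the figures quoted above; boundary handling
    (>= at each threshold) is an assumption of this sketch.
    """
    if loss_usd >= 10_000_000:
        return (1, "Catastrophic")
    if loss_usd >= 1_000_000:
        return (2, "Critical")
    if loss_usd >= 100_000:
        return (3, "Marginal")
    return (4, "Negligible")


def worst_case(*categories):
    """Pick the most severe category: the lowest category number wins."""
    return min(categories)


# Hypothetical accident: $250,000 of damage, but one possible fatality.
overall = worst_case(severity_from_loss(250_000), (1, "Catastrophic"))
```

In this made-up case the monetary loss alone would be Marginal, but the possible fatality makes the overall severity Catastrophic.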
Table II, Probability
Now let’s look at probability. We’ve got a range here from ‘A’ to ‘E’, frequent down to improbable, and then ‘F’ is eliminated. And eliminated in the standard really does mean eliminated. It cannot happen, ever! It does not mean that we managed to massage the figures, the likelihood or probability figures, down so low that we pretend that it will never happen. It means that it is a physical impossibility. Please take note, because I’ve seen a lot of abuse of that approach. It’s bad practice to massage the figures down to a level where you say, ‘I don’t need to bother thinking about this at all!’ because the temptation is just to frig [massage] the figures and not really consider stuff that needs to be considered. Well, I’ll get off my soapbox now.
Let’s go back to the top. Frequent: for one item, likely to occur often. Down to probable: will occur several times in the life of an item. Occasional: likely to occur sometime; we think it’ll happen once in the life of an item. Remote: we don’t think it’ll happen at all, but it could do. And improbable: so unlikely for an individual item that we might assume that the occurrence won’t happen at all. But when we consider a fleet, particularly if we’ve got hundreds or thousands of items, the cumulative risk, or cumulative probability I should say, is such that it’s unlikely to occur across the fleet, but it could.
And this is where this distinction between specific item and fleet probability is useful. For example, let’s imagine a frequent hazard. We think that something could happen to an item, let’s say, once a year per item. Now, if we’ve got a fleet of fifty of these items, or fifty-something of these items, that means it’s going to happen across the fleet pretty much every week on average. That’s the difference. And sometimes it’s helpful to think about an individual system. And sometimes it’s helpful to think about a fleet, where you’ve got the relevant experience to say, ‘Well, the fleet that we’re replacing, we had a fleet of 100 of these things. And this went wrong every week, or every month, or once a year, or only happened once every 10 years across the entire fleet.’ And therefore, we could reason about it that way.
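The arithmetic behind the fleet example can be sketched in a few lines. The function names are illustrative, and the calculation assumes the items are independent and all see the same per-item rate, which is a simplification.

```python
def fleet_events_per_year(per_item_rate_per_year, fleet_size):
    """Expected events per year across the fleet (assumes identical, independent items)."""
    return per_item_rate_per_year * fleet_size


def mean_days_between_events(events_per_year):
    """Average gap between events, in days, using a 365-day year."""
    return 365.0 / events_per_year


# The example above: once a year per item, fleet of fifty.
fleet_rate = fleet_events_per_year(1.0, 50)    # 50 events a year across the fleet
gap_days = mean_days_between_events(fleet_rate)  # about 7.3 days, roughly weekly
```

The same two functions work in reverse for legacy data: if a fleet of 100 saw one event a year, the per-item rate was about once in a hundred years.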
We’ve got two different ways of looking at probability here. Use whichever one is more useful or helps you. But when we’re doing that, try and do that with historical data, not just subjective judgment. Because with subjective judgment, one individual might say ‘That will never happen!’, whereas another will say, ‘Well, actually we experienced it every month on our fleet!’. Circumstances are different.
Table III, Risk Matrix
We put severity and probability together. We have got ‘1’ to ‘4’ for severity, and ‘A’ to ‘F’ for probability, and we get this matrix. We’ve got probability down the side and severity along the top. And in this standard, we’ve got high risk, serious risk, medium risk and low risk. And now how exactly you define these things is, of course, somewhat arbitrary. We’ll just look at some general principles.
The first thing to remember about this risk matrix is that risk is the product of probability and severity. Effectively we multiply the two together. If we’ve got a catastrophic or critical accident, a more serious one, and it’s going to happen often, that’s a big risk. That’s a high risk. Whereas, if we’ve got a low severity accident that we think will happen very, very rarely, then that’s a low risk. That’s great.
One thing to note here it’s easier to estimate the severity than it is the probability. It’s quite easy to under- or overestimate probability. Usually, because of the physical mechanism involved, it’s easier to estimate the severity correctly. If we look on the right-hand side, at negligible. We can see that if we’re confident that something is negligible, then it can be a low risk. But at the very most, it can only be a medium risk. We are effectively prioritizing negligible severity risks quite low down the pecking order.
Now, on the other side, if we think we’ve got a risk that could be catastrophic, we could kill somebody or do irreversible environmental damage, then, however improbable we think it is, it’s never going to be classified less than medium. That’s a good point to note. This matrix has been designed well, in the sense that catastrophic and critical risks are never going to be rated lower than medium, and they can quite easily become serious or high. That means they’re going to get serious management attention. When you put risks up in front of a manager, senior person, a decision-maker, who’s responsible and they see red and orange, they’re going to get uncomfortable and they’re going to want to know all about that stuff. And they will want to be confident that we’ve understood the risk correctly and it’s as low as we can get it. This matrix is designed to get attention and to focus attention where it is needed.
And in this standard, in 882, you ultimately determine whether you can accept risk based on this risk rating. In 882, there is no unacceptable, intolerable risk. You can accept anything if you can persuade the right person with the right amount of authority to sign it off. And the higher the risk, the higher the level of authority you must get in order to accept the risk and expose people to it. This matrix is very important because it prioritizes attention. It prioritizes how much time, effort and money gets spent on reducing risks. You will use it to rank things all the time, and it also prioritizes, as we’ll see later, how often you review a risk, because clearly, you don’t want to have high risks or serious risks. Those are going to get reviewed more often than a medium risk or low risk. A low risk might just get reviewed routinely, not very often, maybe once a year or even less. We want to concentrate effort and attention on high risks and this matrix helps us to do that. But of course, no matrix is perfect.
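The matrix lookup described above can be sketched as a small table. The cell values below are written from memory of the standard matrix in Table III and should be checked against the standard, or against your approved tailored matrix, before anyone relies on them; the names `MATRIX` and `risk_level` are assumptions of this sketch.

```python
# Rows: probability level A (Frequent) to E (Improbable).
# Columns: severity 1 (Catastrophic), 2 (Critical), 3 (Marginal), 4 (Negligible).
# Cell values recalled from Table III; verify before use.
MATRIX = {
    "A": ("High", "High", "Serious", "Medium"),
    "B": ("High", "High", "Serious", "Medium"),
    "C": ("High", "Serious", "Medium", "Low"),
    "D": ("Serious", "Medium", "Medium", "Low"),
    "E": ("Medium", "Medium", "Medium", "Low"),
}


def risk_level(probability, severity):
    """Look up the risk level for a probability letter (A-F) and severity number (1-4)."""
    if probability == "F":
        # Eliminated means physically impossible, not just very unlikely.
        return "Eliminated"
    return MATRIX[probability][severity - 1]


# The two properties discussed above, expressed as column checks.
negligible_column = {row[3] for row in MATRIX.values()}    # Medium at most
catastrophic_column = {row[0] for row in MATRIX.values()}  # never below Medium
```

Ranking hazards for attention then becomes a matter of sorting the hazard log by the value `risk_level` returns.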
Now, if we go back. Looking at the yellow highlight, we’re going to use table three unless there’s a tailored alternative definition, a tailored alternative matrix. Now, noting this matrix, catastrophic risk, the highest possible risk, we’ve got one death. Now, if we had a system where it was feasible to kill more than one person in an accident, then really, we would need another column worse than catastrophic. We could imagine that if you had a vehicle that had one person in it and the vehicle crashed, whatever it was, a motorbike let’s say. Let’s imagine you only said ‘We’re only going to have solo riders. We can only kill one person.’. We’re assuming we won’t hurt anybody else. But if you’ve got a car where you’ve got four or more people in, you could kill several people. If you’ve got a coach or a bus, you could drive it off a cliff and kill everybody, or you might have a fire and some people die, but most of them get out. You can see that for some vehicles, for some systems, you would need additional columns. Killing one person isn’t the worst conceivable accident.
Take some systems, a ship, say. It’s actually very rare for a ship to sink and everybody dies. But it’s quite common for individuals on ships to die in health and safety type accidents, workplace accidents. In fact, being a merchant seaman is quite a risky occupation. But in between those two, it’s also quite possible to have a fire or asphyxiating gases in a compartment. You can kill more than one person, but you won’t kill the entire ship’s company. Straight away in a ship, you can see there are three classes, if you like, of serious accidents where you can kill people. And we know we should really differentiate between the three when we’re thinking about risk management. And this matrix doesn’t allow you to do that. If you’ve got a system where more than one death is feasible, then this matrix isn’t necessarily going to serve you well, because all of those types of accidents get shoved over into the catastrophic column on this matrix, and you don’t differentiate between them, which is not helpful. You may need to tailor your matrix and add further columns.
And depending on the system, you might want to change the way that those risks are distributed. Because you might have a system, for example riding a bicycle. It’s very common riding a bicycle to get negligible type injuries. You know, you fall off, cuts and bruises, that kind of thing. But, if you’re not on the road, let’s say you’re riding off-road, it is quite rare to get fatalities unless you’re doing mountain biking in some extreme environment. You’ve got to tailor the matrix for what you’re doing. I think we’ve talked about that enough. We’ll come back to that in later lessons, I’m sure.
Risk mitigation. We’re doing this analysis, not for the sake of it; we’re doing it because we want to do something about it. We want to reduce the risk or eliminate it if we can. The 882 standard gives us an order of precedence, and as it says, it’s specified in section 4.3.4, but I’ve reproduced that here for convenience. Ideally, we would like to eliminate hazards by designing them out. We would make a design decision to say, ‘We won’t have a petrol engine, let’s say, in this vehicle or vessel because petrol is a serious fire/explosion hazard. We’ll have something else. We’ll have diesel or we’ll have an all-electric vehicle maybe these days or something like that.’ We can eliminate the risk.
We could reduce the risk by altering the design, introducing failsafe features, or making the design crashworthy, or whatever it might be. We could add engineered features or devices to reduce risk: safety features such as seatbelts in cars or airbags, roll bars, crash survivable cages around the people, whatever it might be. We can provide warning devices to say ‘Something’s going wrong here, and you need to pull over’ or whatever it is you need to do. ‘Watch out!’ because the system is failing and maybe ‘Your brakes are failing. You’ve got low brake fluid. Time to pull over now before it gets worse!’.
And then finally, the least effective precautions or mitigations. Signage, warning signs, because nobody reads warning signs, sadly. Procedures. Good, if they’re followed. Again, very often people don’t follow them. They cut corners. We train people. Again, they don’t always listen to the training or carry it out. And we provide PPE. That’s personal protective equipment. And again, PPE is great if you enforce it. For example, I live in Australia. If you cycle in Australia, if you ride a bicycle, it’s the law that you wear a bike helmet. Most people obey the law because they don’t want to get a $300 fine or whatever it is if the cops catch them, but you still see people around who don’t wear one. Presumably, because they think they’re bulletproof, and it will never happen to them.
PPE is fine if it’s useful. But of course, sometimes PPE can make a job so much harder that people discard it. We really need to think about designing a job to make it easy to do, if we’re going to ask people to wear awkward PPE. Also, by the way, we need to not ask them to wear PPE for trivial reasons just so that the managers can cover their backsides. If you ask people to wear PPE when they’re doing trivial jobs where they don’t need it then it brings the system into disrepute. And then people end up not wearing PPE for jobs where they really should be wearing it. You can over-specify safety and lose goodwill amongst your workers if you’re not careful.
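The order of precedence just described can be sketched as an ordered enumeration, so that candidate mitigations can be sorted with the most effective kind first. The member names paraphrase the five steps above; the `rank` helper and the brake example are hypothetical.

```python
from enum import IntEnum


class Mitigation(IntEnum):
    """Order of precedence as described above: lower number is preferred."""
    ELIMINATE_BY_DESIGN = 1
    REDUCE_BY_DESIGN_ALTERATION = 2
    ENGINEERED_FEATURES_OR_DEVICES = 3
    WARNING_DEVICES = 4
    SIGNAGE_PROCEDURES_TRAINING_PPE = 5


def rank(options):
    """Sort (Mitigation, description) pairs so the preferred kind comes first."""
    return sorted(options, key=lambda option: option[0])


# Hypothetical candidate mitigations for a fuel fire hazard.
candidates = [
    (Mitigation.SIGNAGE_PROCEDURES_TRAINING_PPE, "No-smoking signs at refuelling point"),
    (Mitigation.WARNING_DEVICES, "Fuel leak warning light"),
    (Mitigation.ELIMINATE_BY_DESIGN, "Replace petrol engine with electric drive"),
]
preferred_first = rank(candidates)
```

Sorting does not decide anything for you; it just makes it obvious when the only mitigations on the table come from the bottom of the list.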
Now those risk mitigation priorities, that’s the one in this standard, but you will see an order of precedence like that in the law in many different countries. It’s the law in Australia. It’s the law in the UK, for example, expressed slightly differently. It’s in lots of different standards for good reason, because we want to design out the risks. We want to reduce them in the design because that’s more effective than trying to bolt on or stick on safety afterwards. And that’s another reason why we want to get in early in a project and think about our hazards and our risks early on. Because it’s cheaper at an early stage to say, ‘We will insist on certain things in the design. We will change the requirements to favour a design that is inherently safe.’
We only get these things if we contract for them. The model in 882, the assumption, is that it’s a government somewhere contracting a contractor to do stuff. But it doesn’t have to be a government; it can be any client or purchaser or authority or end-user asking for something, buying something, contracting something, be it the physical system, or service, or whatever it might be. The assumption is that the client issues a request for proposal.
Right at the start, they say ‘I want a gizmo’. Or ‘I don’t even want to specify that I want a gizmo. I want something that will do this job. I don’t care what it is. Give me something that will do this job.’ But even at that early stage, we should be asking for preliminary hazard analysis (PHA) to be done. We should be saying who, which specialists, which functional disciplines need to be involved. We need to specify the data that we require and the format that it’s in, considering especially the hazard tracking system, which is Task 106. If we’re going to get data from lots of different people, best we get it in a standardized format so we can put it all together. We want to insist that they identify hazards, hazardous locations, etc. We want to insist on getting technical data on non-developmental items: either the contractor gets it for the client, or the client supplies it and says to the contractor doing the analysis, ‘This is the information that I’m going to supply you, and you will use it.’ We need to supply the concept of operations and, of course, the operating environment. Let me just check. No, that’s it. We’ve only got one slide on commentary. It doesn’t say the environment, but we do need to specify that as well, and hopefully that should be in the concept of operations. And any specific hazard management requirements. For example, what matrix are we going to use? What is a suitable matrix to use for this system?
Now to do all of this, the purchaser, the client really probably needs to have done Task 202 and 201 themselves, and they’ve done some thinking about all of this in order to say, ‘With this system, we can envisage- with this kind of requirement, we can envisage these risks might be applicable.’ And ‘We think that the risks might be large or small’ depending on what the system is or ‘We think that-’. Let’s say if you purchase a jet fighter, jet fighters because of that demand, the overwhelming demand for performance, they tend to be a bit riskier than airliners. They fall out of the sky more often. But the advantage is that there are normally only one or two people on board. And jet fighters tend to fly a lot of the time in the middle of nowhere. You’re likely to hurt relatively few people, but it happens more often.
Whereas if you’re buying an airliner, something you can shove a couple of hundred people in at one go, those fall out of the sky much less frequently, thank goodness, but when they do, lots of people get hurt. A different approach to risk might be appropriate for different types of system. And that’s what you should be thinking about early on, if you’re the client, if you’re the purchaser. You should have done some analysis to enable you to write a good request for proposal, because if you write a bad request for proposal, it’s very difficult to recover the situation afterwards because you start at a disadvantage. And often the only way to fix it is to reissue the RFP and start again. And of course, nobody wants to do that because it’s expensive and it wastes a lot of time. And it’s very embarrassing. It is a career-limiting thing to do, for a lot of people. You do need to do some work upfront in order to get your RFP correct. That’s what it says in the standard.
I want to add a couple of comments; I’m not going to say that much. First, it’s a little line from a poem by Kipling that I find very, very helpful. Kipling used to be a journalist and it was his job to go out and find out what the story was and report it. And to do that he used his six honest serving men. He asked ‘What?’ and ‘Why?’ and ‘When?’ and ‘How?’ and ‘Where?’ and ‘Who?’. Those are all good questions to ask. If you can ask all those questions and get definite answers, you’re doing well. And a little tip here: as a consultant, one of the tricks of the trade I use is I turn up as the ‘dumb consultant’. I always pretend to be a bit dumber than I really am, and I ask these stupid questions. And I ask the same questions to several different people. And if I get the same answer to the same question from everyone, I’m happy. But that doesn’t always happen. If you start getting very different answers to the same question from different people, then you think, ‘Okay, I need to do some more digging here’. And it’s the same with hazard analysis. Ask the what, why, when, how, where and who questions.
Another issue, of course, is ‘How much?’ ‘How much is this going to take?’ ‘How long is this going to take?’ ‘How many people am I going to have to invite to this meeting?’, etc. And that’s difficult. And really, the only way to answer these questions properly is to just do some PHI and PHA early and to learn from the results. The other alternative, which we are really good at as human beings, is to ask the questions early, get answers that we don’t really like, and then just sweep them under the carpet and not ask those questions ever again because we’re frightened of the answers that we might get. However frightened you are of the answer you might get, do ask the question, because forewarned is forearmed. And if you know about a problem, you can do something about it. Even if that something is to rewrite your CV and start looking for another job. Do ask the questions even if it makes people uncomfortable. And I guess learning how to ask the questions without making people uncomfortable is one of the tricks that we must learn as safety engineers and consultants. And that’s an important part of the job. Those are the soft skills that you can only learn through practice, really, and observing people.
What’s the way to do it? Well, I’ve said this several times but do your PHI and PHA early. Do it as early as possible because it’s cheap to do it early. Maybe you’re the only safety person, as you often are in the beginning, or maybe you’re a manager and safety is part of your portfolio and you’ve got other responsibilities as well. Either way, just sit down one day and ask these dumb questions, go through the checklist in Task 202 and say, ‘Do I have these things in my system?’
If you know for sure you’re not going to have explosive ordnance, or radiation, or whatever it might be, you can go, ‘Great. I can cross those off the list’. By the way, if you really want to do it well, you can make an assumption or put a constraint in and say ‘We will have no explosive devices’, ‘We will have no energetic materials’, ‘We will have no radiation’ or whatever it might be. Make sure that you insist that you’ll have none of it; then you can hopefully move on and never have to deal with those issues again.
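As a rough sketch of that screening step in Python: walk the checklist and record, for each item, whether it still needs analysis or has been excluded by a written constraint. The item names are an abbreviated subset of the Task 202 list, and the `screen` function and its data layout are illustrative assumptions, not something the standard prescribes.

```python
# Hypothetical screening of a PHA checklist; names and structure are illustrative only.
CHECKLIST = [
    "Energy sources",
    "Ordnance",
    "Hazardous materials",
    "Software",
    "Health hazards",
]


def screen(checklist, constraints):
    """Split checklist items into those still to analyse and those excluded.

    `constraints` maps an item name to the written constraint that
    justifies crossing it off the list.
    """
    to_analyse = []
    excluded = {}
    for item in checklist:
        if item in constraints:
            excluded[item] = constraints[item]
        else:
            to_analyse.append(item)
    return to_analyse, excluded


# Example constraint, worded as in the discussion above.
constraints = {"Ordnance": "We will have no explosive devices."}
to_analyse, excluded = screen(CHECKLIST, constraints)
```

Keeping the constraint text next to the excluded item means that if the constraint is ever relaxed, you can see straight away which checklist items need to be reopened.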
Do the analysis early, but expect to repeat it because things change, you learn more, and more information comes in. But of course, the further you go down the project, the more expensive everything gets. Now, having said do it early, the Catch-22 is that very often people think ‘How can I analyse when I don’t have a design?’
The ‘Catch-22’ question is what comes first, design or analysis? Now, the truth is that you could do an analysis of very simple functions. You don’t need any design at all. You don’t even need to know what kind of vehicle or what kind of system you might be dealing with. But of course, that will only take you so far. And it may be that you want to do early analysis, but for whatever reason, Intellectual Property Rights (IPR) or whatever it might be, you can’t get access to data.
What do you do? You can’t get access to data about your system or the system that you’re replacing. What do you do? Well, one of the things you can do is you can borrow an idea from the logistics people. Logistic Support Analysis Task 203 is a Baseline Comparison System. Imagine that you’re going to have a new system. Maybe it is replacing an old system, but maybe it does a lot more than the old system used to do. Just looking at the old system isn’t going to give you the full picture. Maybe what you need to do is make up an imaginary comparison system. You take the old system and say, ‘Well, I’m adding all this extra functionality’. Maybe with the old system we just bought the vehicle. We didn’t buy the support system, we didn’t buy the weapons, we didn’t buy the training, whatever it might be. But this time around, we’re buying the complete package. We’re going to have all this extra stuff that probably has hazards associated with it, so just doing lessons learned from the previous system will not be enough.
Maybe you need to construct an imaginary Baseline Comparison System and go, ‘I’ll borrow bits from all these other systems, put them all together, and then try and learn from that sort of composite system that I’ve invented, even though it’s imaginary.’ That can be a very powerful technique. You may get told, ‘Oh, we haven’t got the money’ or ‘We haven’t got the time to do that’. But to be honest, if there’s no other way of doing effective, early analysis, then spend the money and do it early. Because many times I’ve seen people go, ‘Oh, we haven’t got time to do that’. They’ve never got time to do it properly and therefore they end up going around the buoy two or three times. You do it badly. You do it again slightly less badly. You do it a third time and it’s sort of barely adequate. And then you move forward. Well, you’ve wasted an awful lot of time and money doing that, and held up other people and the rest of the project. You’re probably better off spending the money and just getting on with it. And then you’re informed going forwards, before you start to spend serious money elsewhere on the project.
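To make the composite idea a little more concrete, here is a minimal Python sketch, assuming each reference system simply contributes a list of lessons-learned hazards. The system names, the hazards and the `build_composite` function are all made up for illustration.

```python
from collections import defaultdict


def build_composite(reference_systems):
    """Merge hazards borrowed from several reference systems.

    `reference_systems` maps a system name to a list of hazard descriptions.
    Returns a dict mapping each hazard to the sorted list of systems it
    was borrowed from.
    """
    composite = defaultdict(set)
    for system, hazards in reference_systems.items():
        for hazard in hazards:
            composite[hazard].add(system)
    return {hazard: sorted(sources) for hazard, sources in composite.items()}


# Entirely made-up example data for an imaginary comparison system.
references = {
    "old vehicle": ["fuel fire", "rollover"],
    "borrowed weapon system": ["inadvertent firing", "fuel fire"],
    "borrowed training system": ["simulator motion injury"],
}
composite = build_composite(references)
```

A hazard that turns up in more than one borrowed system (here, the fuel fire) is a natural place to start digging.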
Well, that’s it from me. Just one thing to say: Mil-Std-882E came out in 2012. It’s still going strong and unlikely to be replaced anytime soon. It’s copyright free. All the quotations are from the standard, so they’re copyright free. But this video is copyright of The Safety Artisan 2020.
For More …
That is the end of the show. Thank you very much for listening. And it just remains for me to say: come and watch some more videos on Mil-Std-882E. There’s going to be a complete course on them, and you should be able to get, I hope, a lot of value out of the course. So, until I see you again, cheers.
This is Mil-Std-882E Preliminary Hazard List & Analysis.
The 200-series tasks fall into several natural groups. Tasks 201 and 202 address the generation of a Preliminary Hazard List and the conduct of Preliminary Hazard Analysis, respectively.
TASK 201 PRELIMINARY HAZARD LIST
201.1 Purpose. Task 201 is to compile a list of potential hazards early in development.
201.2 Task description. The contractor shall:
201.2.1 Examine the system shortly after the materiel solution analysis begins and compile a Preliminary Hazard List (PHL) identifying potential hazards inherent in the concept.
201.2.2 Review historical documentation on similar and legacy systems, including but not limited to:
- a. Mishap and incident reports.
- b. Hazard tracking systems.
- c. Lessons learned.
- d. Safety analyses and assessments.
- e. Health hazard information.
- f. Test documentation.
- g. Environmental issues at potential locations for system testing, training, fielding/basing, and maintenance (organizational and depot).
- h. Documentation associated with National Environmental Policy Act (NEPA) and Executive Order (EO) 12114, Environmental Effects Abroad of Major Federal Actions.
- i. Demilitarization and disposal plans.
201.2.3 The contractor shall document identified hazards in the Hazard Tracking System (HTS). Contents and formats will be as agreed upon between the contractor and the Program Office. Unless otherwise specified in 201.3.d, minimum content shall include:
- a. A brief description of the hazard.
- b. The causal factor(s) for each identified hazard.
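As an illustration only, the minimum content in 201.2.3 could be captured in a record like the Python sketch below. The `PhlEntry` class, its field names and the example hazard are assumptions made for this sketch; the standard leaves contents and formats to agreement between the contractor and the Program Office.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhlEntry:
    """Hypothetical PHL record holding the minimum content in 201.2.3."""

    description: str  # a. brief description of the hazard
    causal_factors: List[str] = field(default_factory=list)  # b. causal factor(s)

    def is_complete(self) -> bool:
        """True when both minimum content items are present."""
        return bool(self.description.strip()) and len(self.causal_factors) > 0


# Made-up example entries.
entry = PhlEntry(
    description="Fuel leak ignites in engine bay",
    causal_factors=["Chafed fuel line", "Hot exhaust surface"],
)
draft = PhlEntry(description="Operator exposed to high noise levels")
```

A simple completeness check like `is_complete` is one way to flag draft entries that still lack causal factors before they go into the HTS.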
201.3 Details to be specified. The Request for Proposal (RFP) and Statement of Work (SOW) shall include the following, as applicable:
- a. Imposition of Task 201. (R)
- b. Identification of functional discipline(s) to be addressed by this task. (R)
- c. Guidance on obtaining access to Government documentation.
- d. Content and format requirements for the PHL.
- e. Concept of operations.
- f. Other specific hazard management requirements, e.g., specific risk definitions and matrix to be used on this program.
- g. References and sources of hazard identification.
TASK 202 PRELIMINARY HAZARD ANALYSIS
202.1 Purpose. Task 202 is to perform and document a Preliminary Hazard Analysis (PHA) to identify hazards, assess the initial risks, and identify potential mitigation measures.
202.2 Task description. The contractor shall perform and document a PHA to determine initial risk assessments of identified hazards. Hazards associated with the proposed design or function shall be evaluated for severity and probability based on the best available data, including mishap data (as accessible) from similar systems, legacy systems, and other lessons learned. Provisions, alternatives, and mitigation measures to eliminate hazards or reduce associated risk shall be included.
202.2.1 The contractor shall document the results of the PHA in the Hazard Tracking System (HTS).
202.2.2 The PHA shall identify hazards by considering the potential contribution to subsystem or system mishaps from:
- a. System components.
- b. Energy sources.
- c. Ordnance.
- d. Hazardous Materials (HAZMAT).
- e. Interfaces and controls.
- f. Interface considerations to other systems when in a network or System-of-Systems (SoS) architecture.
- g. Material compatibilities.
- h. Inadvertent activation.
- i. Commercial-Off-the-Shelf (COTS), Government-Off-the-Shelf (GOTS), Non-Developmental Items (NDIs), and Government-Furnished Equipment (GFE).
- j. Software, including software developed by other contractors or sources. Design criteria to control safety-significant software commands and responses (e.g., inadvertent command, failure to command, untimely command or responses, and inappropriate magnitude) shall be identified, and appropriate action shall be taken to incorporate these into the software (and related hardware) specifications.
- k. Operating environment and constraints.
- l. Procedures for operating, test, maintenance, built-in-test, diagnostics, emergencies, explosive ordnance render-safe and emergency disposal.
- m. Modes.
- n. Health hazards.
- o. Environmental impacts.
- p. Human factors engineering and human error analysis of operator functions, tasks, and requirements.
- q. Life support requirements and safety implications in manned systems, including crash safety, egress, rescue, survival, and salvage.
- r. Event-unique hazards.
- s. Built infrastructure, real property installed equipment, and support equipment.
- t. Malfunctions of the SoS, system, subsystems, components, or software.
202.2.3 For each identified hazard, the PHA shall include an initial risk assessment. The definitions in Tables I and II, and the Risk Assessment Codes (RACs) in Table III shall be used, unless tailored alternative definitions and/or a tailored matrix are formally approved in accordance with Department of Defense (DoD) Component policy.
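The initial risk assessment in 202.2.3 amounts to a table lookup: a severity category and a probability level give a Risk Assessment Code and a risk level. The Python sketch below assumes the untailored matrix; the cell values are my transcription of Table III and are not reproduced in this excerpt, so check them against the standard (or your program's tailored matrix) before relying on them. The function name `risk_level` is hypothetical.

```python
# Assumed category names for Table I (severity) and Table II (probability).
SEVERITY = {1: "Catastrophic", 2: "Critical", 3: "Marginal", 4: "Negligible"}
PROBABILITY = {
    "A": "Frequent",
    "B": "Probable",
    "C": "Occasional",
    "D": "Remote",
    "E": "Improbable",
    "F": "Eliminated",
}

# Assumed transcription of the Table III matrix: RACs grouped by risk level.
RISK_LEVELS = {
    "High": {"1A", "1B", "1C", "2A", "2B"},
    "Serious": {"1D", "2C", "3A", "3B"},
    "Medium": {"1E", "2D", "2E", "3C", "3D", "3E", "4A", "4B"},
    "Low": {"4C", "4D", "4E"},
}


def risk_level(severity: int, probability: str) -> str:
    """Return the risk level for a severity category and probability level."""
    if severity not in SEVERITY:
        raise ValueError(f"Unknown severity category: {severity!r}")
    if probability not in PROBABILITY:
        raise ValueError(f"Unknown probability level: {probability!r}")
    if probability == "F":
        # An eliminated hazard has no remaining risk to rank.
        return "Eliminated"
    rac = f"{severity}{probability}"
    for level, codes in RISK_LEVELS.items():
        if rac in codes:
            return level
    raise ValueError(f"RAC {rac} is missing from the matrix")
```

For example, under these assumed values `risk_level(2, "C")` gives a Serious risk and `risk_level(4, "D")` gives a Low one.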
202.2.4 For each identified hazard, the PHA shall identify potential risk mitigation measures using the system safety design order of precedence specified in 4.3.4.
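One way to apply 202.2.4 in practice is to tag each candidate mitigation with its place in the order of precedence and sort on it, so the most preferred options are considered first. In the Python sketch below, the five steps paraphrase section 4.3.4, which is not reproduced in this excerpt, so treat the wording as an assumption to verify. The `Precedence` enum, the `rank_mitigations` function and the example mitigations are hypothetical.

```python
from enum import IntEnum


class Precedence(IntEnum):
    """Assumed paraphrase of the design order of precedence (lower is preferred)."""

    ELIMINATE_BY_DESIGN_SELECTION = 1
    REDUCE_BY_DESIGN_ALTERATION = 2
    ENGINEERED_FEATURES_OR_DEVICES = 3
    WARNING_DEVICES = 4
    SIGNAGE_PROCEDURES_TRAINING_PPE = 5


def rank_mitigations(candidates):
    """Sort (description, Precedence) pairs from most to least preferred.

    The sort is stable, so candidates at the same step keep their input order.
    """
    return sorted(candidates, key=lambda candidate: candidate[1])


# Made-up candidate mitigations for a single hazard.
candidates = [
    ("Issue hearing protection", Precedence.SIGNAGE_PROCEDURES_TRAINING_PPE),
    ("Fit an acoustic enclosure", Precedence.ENGINEERED_FEATURES_OR_DEVICES),
    ("Select a quieter engine", Precedence.ELIMINATE_BY_DESIGN_SELECTION),
]
ranked = rank_mitigations(candidates)
```

Ranking is only a prompt for discussion: a lower-precedence measure may still be needed alongside a higher one, or where design changes are not feasible.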
202.3 Details to be specified. The Request for Proposal (RFP) and Statement of Work (SOW) shall include the following, as applicable:
- a. Imposition of Task 202. (R)
- b. Identification of functional discipline(s) to be addressed by this task. (R)
- c. Special data elements, format, or data reporting requirements (consider Task 106, Hazard Tracking System).
- d. Identification of hazards, hazardous areas, or other specific items to be examined or excluded.
- e. Technical data on COTS, GOTS, NDIs, and GFE to enable the contractor to accomplish the defined task.
- f. Concept of operations.
- g. Other specific hazard management requirements, e.g., specific risk definitions and matrix to be used on this program.