Here is the full transcript: Systems Requirements Hazard Analysis.
The full video is here.
Hello and welcome to the Safety Artisan, where you will find professional, pragmatic and impartial advice on all things system, safety and related.
System Requirements Hazard Analysis
And so today, which is the 1st of March 2020, we’re going to be talking about – let me just find it for you – we’ll be talking about system requirements, hazard analysis. And this is part of our series on Mil. Standard 882E (882 Echo) and this one a task 203. Task 203 in the Mil. standard. And it’s a very widely used system safety engineering standard and its influence is found in many places, not just on military procurement programs.
Topics for this Session
We’re going to look at this task, which is very important, possibly the most important task of all, as we’ll see. so in to talk about the purpose of the task, which is word for word from the task description itself. We’re going to talk about in the task description, the three aims of this task, which is to determine or work out requirements, incorporate them, and then assess the compliance of the system with those requirements, because, of course, it may not be a simple read-across. We’ve got six slides on that. That’s most of the task. Then we’ve just got one slide on contracting, which if you’ve seen any of the others in this series, will seem very familiar. We’ve got a little bit of a chat about Section 4.2 from the standard and some commentary, and the reason for that will become clear. So, let’s crack on.
Purpose of SRHA
Task 203.1, the purpose of Task 203 is to perform and document a System Requirements Hazard Analysis or SRHA. And as we’ve already said, the purpose of this is to determine the design requirements. We’re going to focus on design rather than buying stuff off the shelf – we’ll talk about the implications of that a little bit later. Design requirements to eliminate or reduce hazards and risks, incorporate those requirements, into a says, into the documentation, but what it should say is incorporate risk reduction measures into the system itself and then document it. And then finally, to assess compliance of the system with these requirements. Then it says the SRHA address addresses all life-cycle phases, so not just meant for you to think about certain phases of the program. What are the requirements through life for the system? And in all modes. Whether it’s in operation, whether it’s in maintenance or refit, whether it’s being repaired or disposed of, whatever it might be.
Task Description #1
First of six slides on the task description. I’m using more than one colour because there’s some quite a lot of important points packed quite tightly together in this description. We’re assuming that the contractor performs and documents this SRHA. The customer needs to do a lot of work here before ever gets near a contractor. More on that later. We need to determine system design requirements to eliminate hazards or reduce associated risks.
Two things here. By identifying applicable policies, regulations and standards etc. More on that later. And analysing identified hazards. So, requirements to perform the analysis as well as to simply just state ‘We want a system to do this and not to do that’. So, we need to put some requirements to say ‘Here’s what we want to be analysed, to what degree? And why.’ is always helpful.
Task Description #2
Breaking those breaking those two requirements down.
Part a. We’re going to identify applicable requirements by reviewing our military and industry standards and specs, historical documentation of systems that are similar or with a system that we’re replacing, perhaps. Look at, it’s assumed that the US Department of Defense is the customer, ultimate customer. So, the ultimate customer’s requirements, including whatever they’ve said about standard ways of mitigating certain common risks. System performance spec, that’s your functional performance spec or whatever you want to call it. Other system design requirements and documents- Bit of a catchall there. And applicable federal, military, state and local regulations.
This is a US standard. It’s a federated system, much like Australia or indeed lots of modern states, even the UK. There are variations in law across England, Wales, Scotland and Ireland. They’re not great, but they do exist. And in the US and Australia, those differences are greater. And it says applicable executive orders. Executive orders, they’re not law, but they are what the executive arm of the U.S. government has issued, and international agreements. A lot of words in there- have a look at the different statements that are in that in white, blue and yellow. Basically, from international agreements right down to whatever requirements may be applicable, they all need to be looked at and taken account of. So, there’s a huge amount of work there for someone to do. I’ll come back to who that someone should be later.
Task Description #3
Part B. It says the contractor shall recommend appropriate system design requirements. The assumption here is that the contractor is the designer and knows the design better than anybody, better than the purchaser, which is fair enough. It’s your system, you should understand it. And the requirement is that the contractor is not just passive, ‘doing as they’re told’, they’re there to actively investigate possible hazards associated with their system and recommend appropriate requirements in order to manage those hazards and risks. And then there’s further guidance here is the contractor to do that in accordance with Section 4 of Mil. Standard 882E. Now, Section 4 is the general requirements of the standards and there’s lots of good advice in that. And I’ll be doing a lesson, maybe more than one lesson in fact, in Section 4 because there is quite a lot in there. The contractor is to refer to the standard and apply the principles therein. All good stuff.
Part C. The contractor shall also define verification and validation approaches. So, the contractor shall define V and V approaches for each design requirement to eliminate hazards and reduce risks. In part C- Well, B and C- we’ve got a very much narrower focus on requirements to eliminate hazards or reduce risks. Whereas in A, notice we’ve got incredibly broad scope looking requirements. It’s not just about the narrow job of dealing with hazards and controlling them, that we’ve got in parts B and C.
Task Description #4
Onwards and upwards. We get to the second major part of this task, which is to incorporate those design requirements. It’s all very well to have them, but they’ve got to be built into the engineering design, into documentation, hardware, software, test plans, etc. And the second highlighted bit that I’ve got is ‘as the design evolves ensure applicable design requirements flow down into lower-level specifications’, etc, etc, etc. There’s a lot of repetition there, so I won’t go through it. Clearly the assumption in this standard is that the design will be done top-down and that the main contractor, design contractor, will be doing work and then identifying lower-level requirements to be passed on to subcontractors and suppliers. And again, the assumption is we’re dealing with a large military system, which is at least, in part, bespoke. It is being developed and/or integrated for the first time for a specific user and specific use.
I’ll come onto the third yellow highlighted bit first, and then it says as appropriate use engineering change proposals to incorporate applicable design requirements into these documents. What we’re saying here is that even if something hasn’t been specified upfront in the original contract, the contractor should use Engineering Change Proposals – ECP – should use it controlled change mechanism in order to change things as they go with approval and refine and evolve the design.
Years of experience have taught me that these statements are coming from the assumption – still true in the US, I believe – whereby major military projects are designed and developed under a cost-plus basis. In other words, the government pays the main contractor / the prime contractor / prime designer on a sort of time and materials basis, not on a firm or fixed price basis, but says ‘Go away and do what we say’. And there are controls there, and there’s open-book accounting to try and prevent the government from being defrauded. But basically, the contractor goes off and does what is required and gets paid for what they do. So, the government has transferred relatively low amounts of risk onto the contractor anticipating that this will result in the lowest possible overall cost of design development. Now, as we probably could know from the news, that doesn’t always work. However, that is the assumption behind this standard. This cost-plus approach will pay you to do the job and therefore we don’t have to specify every single nut and bolt in the contract right at the beginning. Which in some ways takes a lot of risks away from the purchaser because they don’t have to get everything right at the start. So that’s good. There’s always a balance of risk in whichever approach we take.
So, if we go firm price, yes, we could inject more competition into procurement and supply activity, but you’ve got to get your contract upfront right. And all your requirements, right- more or less. That is notoriously difficult to do. Whichever way you go, there are risks. But it’s important to note that this is the assumption underlying the standard. Not every standard follows this approach, follows this philosophy, but 88 2 does. So, if we’re going to use it in a different way, we need to understand the fact that in. More on that later.
Task Description #5
Fifth slide of six. Third part. We need to assess compliance of that development of hardware, software, documentation, data, etc., whatever it might be. In order to do that, the contractor is going to have to address the customer requirements at technical reviews. So again, the assumption is that development is following a systems-engineering process with certain gated reviews. So, you go into a series of reviews, you might start with system requirements review, SRR. Then you might have preliminary design review, top-level design, PDR. And then we go down to detailed design which is reviewed at Critical Design Review, or CDR. And then we might have a further software specification review for software components and then we’ll go on and test readiness routines and so on and so forth.
Mil. Standard 882 is assuming a particular systems-engineering-lifecycle approach to development. This is very widely used not just for military standards, but for civil, and all over the place. Whatever we call these reviews, the idea of a gated review is that you don’t start a review until you’ve reached maturity requirements or design. You then conduct the review against objective criteria and then decide whether the review has passed. Now, usually, there is a hefty payment milestone associated with passing review. The contractor is incentivized to pass the review. And hopefully, if we’ve got the requirements right, a passed review means we’re on the right track and we’re getting the right product. But that’s not always the case that we’ve got to get all these things right.
And then it says during those reviews, the contractor shall address hazards, mitigation measures or controls and methods of V and V, and recommendations arising. A lot goes on at these reviews. They are on big programs, especially, the very important, very high stress. And in fact, in Australia now, there are some projects that are so big that a delay in a PDR review actually made it into the national news on the future submarine because it’s such a huge multi-billion-dollar project. It could all get very painful and political as well.
Task Description #6
However, let’s move on to the final slide of the task description. So, A. was is do the reviews. B. is review test plans and review test results to make sure to verify and validate hardware and software compliance with those requirements. And as it says, this includes V and V of the effectiveness of risk mitigation measures. So, we need to test these risk controls where we can and see how effective they are and whether they live up to the requirements or the assumptions that we’ve made. Now, again, this is an American standard, so it’s very ‘test centric’. The American government likes to test things to death and depending on your point of view, that’s sensible or not, it’s sensible in the sense that you’re testing a real system hopefully in a representative test environment. Although it may not be representative of the operational environment. So, it should be a very solid, robust, valid approach to proving a system.
However, there is a downside to testing in that it’s very expensive and it tends to come at the end of a program. Whereas really you need an indication much earlier on if things are going astray. So, you really need to review documentation and do analysis and so forth. Or maybe you test a prototype for some samples or something early on, rather than waiting until yet when it’s often may be too late and then very expensive to fix things.
And then part C, we need to ensure that hazard control information is incorporated into manuals and plans, whether it be for the operator, the maintainer, the trainer, the logistician, the diagnostics or indeed for the final disposal. We need to take that hazard control information, risk control information, and record it so that it doesn’t get lost and it gets to the people who need it. That’s very important.
OK, so we’ve spent quite a lot of time going through the description because it’s a big, complex task this one, as you can see, with three major parts to it. It’s worth just going back over it. We’ve got our top-level description on slide one, which summarizes the whole thing. We’re talking about finding those requirements, identifying them. We’re talking about the contractor as an active recommender and developer of requirements and actively developing the V and V techniques to make sure that they are met.
In the second major part, we’re talking about incorporating those design requirements as the design evolves and using a controlled change method to make sure that we keep up with what’s going on. We’re talking about assessing compliance both at major systems engineering reviews and during testing. And then finally, we’re talking about making sure that the required information gets through to those who need it at the end of the food chain, as it were. [This is ] all important stuff.
Here’s as a page we should be familiar with by now, contracting. We need to require SRHA, Task 203. We need to put it in the request for proposal and the contractual state, the work. So once again, as I’ve said before, we’ve got to get this stuff in early on. At least the requirement to do it, even if we haven’t fully worked everything out. We need to get that in right at the start of the request for proposal. We need to require task 203 to be done. It’s imposed (A. Imposition of Task 203).
We need to identify (B. Identification of functional disciplines) who we want to take part in it because it’s not, as we will see, it’s not just the discipline and the job of the safety engineers or the safety team to do this. The design engineers, the specialist engineers in reliability, maintainability and testability, whoever, they all need to be involved as well, etc, etc.
Contractor level of effort (C.) for reviews and so on. We may need to specify some hard requirements there to ensure that we get early scrutiny of the product and the design.
A big point is tailoring of the task (D. Tailor 203.2 and 203.2.3 as appropriate). The task may need to be tailored assuming again that the contractor is responsible for the design. Maybe if the prime contractor isn’t responsible for the design, maybe we’re contracting somebody to buy something that’s mostly off the shelf and then operating force for 30 years. Let’s say a so-called turnkey solution. And we might do that for a piece of military kit, or we might do that for a hospital, or whatever it might be. A piece of infrastructure, a service, whatever. So, it may be that the contractor who must do most of task 203 is not the Prime at all. But, the prime needs to pass those requirements down to some key subcontractors who are doing the development stuff. So, it’s not a given that the prime contractor right underneath the customer must do all this stuff. It may have to be done at several different levels.
And again, we’ve got to provide the concept of operations (E.), that gives the context for all this work. Otherwise, it gets very difficult to do it. You’ve got to say, ‘What’s the jurisdictional context?’ ‘Where will we be operating under?’ ‘Which rules and conditions?’ As well as everything else that you would find in Con. Ops (Concept of Operations).
Then if there are any specific hazard management requirements (F.) that need to be imposed and specific measures of risk, then they need to be passed on to the contractor as well. This is how we will assess, and measure, and prioritize risks. That needs to be done for the program otherwise, you can end up with lots of different ways doing it and it becomes difficult to govern mess.
Section 4.2 #1
I promised we would have a little section on Section 4.2 in the standard and I’ve got two slides here that say two important things. We’re not going to go through all of Section 4 of the 882- That’s for another session. But here in 4.2, we’ve got two important things.
It says Section 4 defines system safety requirements through life for any system. And when properly applied, these requirements should enable the identification and management of hazards and their associated risks. Not only during system development but also during sustainment. And any engineering activities that go on in sustainment, whether it be repair, overhaul, modification, update, whatever it might be. These requirements are put in place to enable that good work to take place and make predictions for the through-life operation, support, sustainment of system, whatever it might be.
Section 4.2 #2
And then secondly, there’s another important point here, which I alluded to earlier. System safety staff are not responsible for hazard management in other functional disciplines. If you’re a structural designer, you’re responsible for making your structure or designing your structure such that risks of failure and collapse and catastrophe are managed. And the same for everything else. Whatever it is you’re dealing with, propulsion, fuels, you name it, whatever the discipline is, they’re all responsible for managing the risks.
The safety team is there really to pull it together and try and ensure some consistency and honesty and to report status. They are not there to do it all for the designers. Indeed, they can’t because they will not have the design specialist knowledge to do so. Only the designers can do. But it does go on to say all functional disciplines, using this generic methodology that’s in Section 4, should coordinate their efforts as part of the overall systems engineering process. The standard provides standardization and it should force all these different disciplines to work together in a standardized way following a standardized-systems-engineering process. And remember we said earlier, Mil. standard 882 assumes that there is a higher-level systems-engineering process going on into which the safety program fits. And that’s very, very important.
On so many programs I’ve seen, there’s either no systems engineering process or a weak one. Or the safety program is divorced or isolated from the systems engineering, the higher-level program, and as a result, it can become irrelevant if you’re not careful. So, having these things and making sure that they lock together is very important. And the reasoning given here is because you might mitigate a hazard in one discipline only to make it worse for somebody else. We can all think of examples of one (which is code for me saying I can’t right now). But anyway, trade-offs – that’s what we end up with. There’s Section 4.2, which gives us a little insight into the thrust of the whole of section 4.
Just two slides of commentary for me. First, it’s worth remembering that there are lots, and lots, and lots of requirements. We’ve got requirements of the standard itself, which is about following a rigorous process. We’ve got law at the international and national levels, and whether those laws apply in a particular jurisdiction or not can be complex. You’ve got product specifications; you’ve got applicable standards, or maybe only parts of the standards that are applicable to your system. And then you’ve got program project requirements, etc., etc. You’ve got lots and lots of layers of requirements that are out there and may or may not be relevant to your system you want to develop, or service, whatever it is going to be. But of course, if we’re using this kind of approach, it’s going to be a complex system or service. It’s going to be challenging to find and identify all these things. It’s going to take some dedicated effort.
That’s one issue, doing all that work. And this is not a trivial exercise and I’ve seen it done badly far more often than I’ve seen it done well. That’s the thing to bear in mind, this is not easy to do. And people didn’t really want to do it – it’s hard work.
And then secondly, we get down to what we might call derived safety requirements. We have a high-level requirement that says, ‘We want a very high level of performance out of this vehicle’ or whatever it might be. And that very demanding performance requirement might force us to use some very high energy fuel, or it might force us to pack a lot of power and a lot of equipment into a very small space, and these requirements can lead to sort of secondary hazards. So, we’ve got high energy fuel inside the vehicle- Well, clearly, that’s dangerous if it leaks. We’ve got a lot of stuff, complex stuff, packed into a small system that can give us thermal control problems. Or if a bit of it goes wrong, if it’s tightly packed together, it can take out something else next to it.
So, these performance requirements can cause hazards that probably weren’t there before or needn’t have been there in, let’s say, a common or garden system that doesn’t have to perform as well. So, we might well look at doing some analysis on our requirements and our top-level design or conceptual design, whatever it might be very early on. And we might say, ‘Well, clearly this is going to drive us down a particular path’ and therefore we will derive some additional safety requirements to deal with these challenges. They don’t come out straight out of higher-level requirements, they’re a secondary effect. But in complex systems, these are very common. And if we’re doing our systems engineering well, we will identify, derive safety requirements for ourselves and for the next level of contractors down the chain.
So, instead of just passing on ‘back-to-back’ requirements from the ultimate customer, which may not mean anything at all to the component supplier (in fact, it probably won’t). We need to change these top-level requirements and say, ‘What’s relevant for you as the supplier role of the engine?’ Let’s say or the wheels, or the wings, or the hull, or whatever it might be. We need to pass on required controls, whether it be the prevention of hazards, detection or mitigation. We also need to remember the order of precedence. It’s preferable to eliminate hazards if we can’t, we put in engineering- engineered features- to reduce the risk or lessen the probability, or severity, etc. And those rules are in section 4.3.4 of the Mil. Standard. There’s a lot of work to do on requirements on many different levels and it may be that this task must be repeated at many different levels.
But the first level task must be done by the client, and actually by the ultimate end-user because to mangle a famous quote, ‘What you don’t specify – what you don’t see can hurt you’. So, we need to do this work as end-users, and as purchases, as customers. It is tempting to assume that the contractors will just do it, that they’ll just get it. ‘They’ve been making planes for years’ or ‘They’ve been making tanks’, or boots, or guns, or ships, or whatever it might be. ‘They’ve been making fuel for years’, ‘these chemicals for years’. We just assume that they know what they’re doing. Well, they probably do know what they’re doing within a particular context. However, if we impose competition, as we always do because we’re always looking for value for money, and whether we have a competition where we’re asking for a firm price to do something or whether we employ other methods of competition and cost-cutting, that will always be pressure on the contract costs. And that means they will be tempted to tailor the safety approach they’re taking in order to reduce costs. Which is a perfectly legitimate thing to do, nothing immoral about doing that, if it’s done appropriately and sensibly.
But if you as the customer or client are going to incentivize your suppliers to do that, you need to be aware of that and the fact that may just not bother because you haven’t told them to. You’re not contractually specified it so you aren’t going to get it. It’s not their problem. And indeed, the suppliers may not understand how their customer will integrate what they provide or use it. The prime contractor may not have a great idea as to how you’re going to use their product. And you can be certain that the subcontractors and the low level secondary and tertiary suppliers are probably going to have no clue whatsoever about what’s going to happen to their components. They are just not going to know. So, you need to specify that as purchaser and you need to make sure that your immediate suppliers pass on those requirements, and that context, and that they police the contract appropriately. Otherwise, there’s going to be trouble for the ultimate client and end-user.
And then finally, in these days of globalization and business-to-business and international procurement, you may be – probably are – buying stuff that’s been made abroad and designed in another country where they may have completely different laws or no laws at all on how safety is built-in – designed in – to a system. And of course, you don’t always know where design work is going to get done; just because you engage a prime contractor in your own country and think that you’re safe. You don’t know whether the prime contractor is going to subcontract software development – let’s say, out to India. It’s so common it’s a cliché! But there are certain things that tend to be done offshore because it’s cheaper, or quicker, or whatever. Or because somebody has already got a system that you can just plug in and use – allegedly.
There are all kinds of reasons why your supply chain will not necessarily ‘Just get it’, or ‘Just do it”’. In fact, there are lots of good reasons why they won’t. So, the purchaser has got to do a lot of work. It’s critical for the purchaser to know what their obligations are because a lot of purchasers don’t. They sit there in blithe ignorance of what their safety responsibilities are, and the lucky ones get away with it. And the unlucky ones are either killed or maimed, or they kill or maim somebody else and they end up going to jail or massive fines. But you’ve not only got to understand the requirements, the obligations, safety on the end item being used but how do you translate that to the contractors, because it’s not always obvious. You can’t just say, ‘Well, these are the laws that I have to obey- I’ll just pass those on to you, Mr Contractor’ because they may not apply to the contractor if they’re in a different country.
Or it just may not make any sense at their level. Laws that were designed to protect people will not often make much sense to a component supplier. Just doesn’t work. Two important points there on the commentary. Lots of layers of requirements that need to be worked on. This is all classic systems engineering stuff, isn’t it? And then the purchaser and the end-user cannot evade their responsibilities at the top of the food chain. Indeed, they’ll be stuck with the problem, whatever it is, for 30 years or however long they use the system.
It’s important for the end-user and the ultimate client to do this work may be several times at many different layers.
Well, that’s the end of the technical content. I just wanted to say that I’ve quoted a lot of text from the Mil, standard, which is itself copyright-free, and it’s available for free online, including on the Web site the Safety Artisan. But this presentation’s copyright of the Safety Artisan 2020.
For More …
And for more resources and for more videos like this one, please go to either www.safetyartisan.com or go to the Safety Artisan page at www.patreon.com.
Well, that is the end of the presentation. And it just remains for me to say thanks again for watching and do look out for the next sessions in the series on 882 echo (882E). There are quite a few to go. We’re going to go through all the tasks and the general and specific requirements of the standard and the appendices. We will also talk about more advanced topics, about how we manage and apply all this stuff.
So, from The Safety Artisan.com, thanks very much and goodbye.