Categories
Blog Safety Management

What Have Hazard Logs Ever Done for Us?

“What Have Hazard Logs Ever Done for Us? Well, there’s the aqueduct…” Monty Python’s Life of Brian may not be an obvious connection to hazard management, but it works! Hazard Logs – or Hazard Tracking Systems (HTS), which is a better term – are underappreciated but vital tools.

In this webinar on hazard logs, one of the topics that I will be covering is what a ‘full-function’ HTS can do. By that, I mean a purpose-built database, rather than just a spreadsheet. So here is a taster of the benefits, derived from my 25+ years of experience on system safety programs, large and small.

Key Elements of a Hazard Log

An HTS pulls together key safety data about:

  • Accidents: ‘An unintended event, or sequence of events, that causes harm’;
  • Accident Sequences: ‘The progression of events that results in an accident’;
  • Causes: that may lead to a hazard;
  • Controls (or mitigations): ‘A measure that, when implemented, reduces risk.’; and, of course
  • Hazards: ‘A physical situation or state of a system, often following from some initiating event, that may lead to an accident’.
Accident Sequence

Understanding how causes lead to hazards, and hazards lead to consequences, which may include (harmful) accidents, is fundamental to understanding accident sequences. This in turn helps us to understand the mechanisms that lead to harm – and defeat them.

Managing Many-to-Many Connections

A Hazard Log doesn’t just store data elements, it links those data together meaningfully to make information. A relational database does this by allowing us to make many-to-many connections between different classes of data.

Humorous illustration of linkages between data types in a Hazard Log or HTS
Hazard Log Connections

This allows us to do a lot of useful things. We’ve already mentioned understanding the mechanisms behind accident sequences. This allows us to design or select effective controls to interrupt the accident sequences and prevent harm.
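As a minimal sketch of what those many-to-many connections could look like, the example below uses Python's built-in sqlite3 module. The table names, column names and sample records are all invented for illustration; they are not taken from any particular HTS product.

```python
import sqlite3


def build_example_log():
    """Create an in-memory hazard log with a hypothetical, minimal schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cause    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE hazard   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE accident (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Link tables hold the many-to-many connections.
        CREATE TABLE cause_hazard (
            cause_id  INTEGER REFERENCES cause(id),
            hazard_id INTEGER REFERENCES hazard(id),
            PRIMARY KEY (cause_id, hazard_id)
        );
        CREATE TABLE hazard_accident (
            hazard_id   INTEGER REFERENCES hazard(id),
            accident_id INTEGER REFERENCES accident(id),
            PRIMARY KEY (hazard_id, accident_id)
        );
    """)
    # Invented sample data, for illustration only.
    conn.executemany("INSERT INTO cause VALUES (?, ?)",
                     [(1, "Sensor failure"), (2, "Operator error")])
    conn.executemany("INSERT INTO hazard VALUES (?, ?)",
                     [(1, "Loss of braking"), (2, "Uncommanded movement")])
    conn.executemany("INSERT INTO accident VALUES (?, ?)",
                     [(1, "Collision"), (2, "Crush injury")])
    conn.executemany("INSERT INTO cause_hazard VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 2)])
    conn.executemany("INSERT INTO hazard_accident VALUES (?, ?)",
                     [(1, 1), (2, 2)])
    return conn


def accidents_for_cause(conn, cause_name):
    """Follow cause -> hazard -> accident links and return accident names."""
    rows = conn.execute("""
        SELECT DISTINCT a.name
        FROM cause c
        JOIN cause_hazard ch    ON ch.cause_id = c.id
        JOIN hazard_accident ha ON ha.hazard_id = ch.hazard_id
        JOIN accident a         ON a.id = ha.accident_id
        WHERE c.name = ?
        ORDER BY a.name
    """, (cause_name,)).fetchall()
    return [name for (name,) in rows]


log = build_example_log()
print(accidents_for_cause(log, "Sensor failure"))  # ['Collision', 'Crush injury']
```

Because the links live in their own tables, one cause can feed several hazards and one hazard can lead to several accidents, without duplicating any record.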

Discovering Pathways to Harm

Understanding these links also enables us to see connections, for example between causes and accidents, which we had not seen before. This is important, as many severe accidents arise from unanticipated pathways to harm, perhaps in very specific circumstances. (For example, not shutting the bow doors of a ferry properly led to the flooding and capsizing of the Herald of Free Enterprise, killing 193 people.)

Change Impact Analysis

Understanding these connections also allows us to perform safety change impact analysis (‘the analysis of changes within a deployed product or application and their potential consequences’). In many programs I worked on, in-use incidents revealed that:

  • Designs were not working as intended;
  • Hazard controls were not as effective as thought;
  • Work done was not as designed; or that
  • The actual use of a system was not as foreseen.

If we know the links to something that has changed – what it affects / what affects it – then we can begin to estimate the impact. From experience, this occupies a lot of our time in in-service safety management.
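One way to picture "what it affects / what affects it" is as a walk over the links in both directions. The sketch below assumes the links are held as a simple dictionary of "X affects Y" relationships; the record identifiers are invented, and a real HTS would query its own database instead.

```python
from collections import deque

# Hypothetical directed links, read as "key affects each listed item".
AFFECTS = {
    "control:inspection": ["cause:seal wear"],
    "cause:seal wear": ["hazard:fluid leak"],
    "control:drip tray": ["hazard:fluid leak"],
    "hazard:fluid leak": ["accident:fire"],
}


def reachable(graph, start):
    """Breadth-first search: every item reachable from start via the links."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def invert(graph):
    """Reverse every link, so that 'affects' becomes 'is affected by'."""
    inverted = {}
    for src, targets in graph.items():
        for dst in targets:
            inverted.setdefault(dst, []).append(src)
    return inverted


def impact_of_change(graph, changed):
    """List what a changed item affects and what affects it."""
    return {
        "affects": sorted(reachable(graph, changed)),
        "affected_by": sorted(reachable(invert(graph), changed)),
    }


print(impact_of_change(AFFECTS, "cause:seal wear"))
```

The result is only a starting list of records to review; estimating the actual impact still needs engineering judgement.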

Recovery and Improvement

In the real world things rarely stand still. There are usually many different stimuli for change (we’ve already mentioned our incidents/accidents). Our enterprise might have to raise its game for several reasons:

  • The Regulator demanding change or improvement;
  • Customers asking for more performance or more assurance;
  • The public reaction to incidents elsewhere in the market;
  • New technology or new competitors in our industry; or
  • Our commitment to continuous improvement.

Pareto analysis tells us that a minority of causes tend to dominate the effect. Thus, a small number of causes or initiating events drive the occurrence of hazards. Similarly, a minority of hazards will dominate incident and/or accident statistics.

We may know this from experience or analysis of our specific system, or we may have only generic data. It doesn’t matter.

Using these insights, we can use the linkages in the HTS to target particular causes, events, conditions, scenarios, hazards, etc. We identify the set that (should) make the biggest impact, the biggest difference. We can then rank the contributors in order of importance and tackle them.
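The ranking step can be sketched in a few lines. The example below assumes we have a count of incidents attributed to each cause (the figures are invented) and picks the smallest set of top-ranked causes that accounts for a chosen share of the total. The 80% threshold is an assumption made for illustration, not a rule.

```python
def pareto_contributors(counts, threshold=0.8):
    """Rank contributors by count and return the top ones that together
    reach the given share of the total (threshold is an assumed value)."""
    total = sum(counts.values())
    if total == 0:
        return []
    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    selected = []
    running = 0
    for name, count in ranked:
        selected.append((name, count))
        running += count
        if running / total >= threshold:
            break
    return selected


# Invented incident counts per cause, for illustration only.
incident_counts = {
    "Cause A": 55,
    "Cause B": 30,
    "Cause C": 8,
    "Cause D": 4,
    "Cause E": 3,
}
print(pareto_contributors(incident_counts))  # [('Cause A', 55), ('Cause B', 30)]
```

With generic data the counts will be rough, but the same ranking still shows where to look first.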

Again, long and sometimes bitter experience tells me that safety practitioners will spend a lot of time doing this. Reacting to stimuli is a big part of safety management.

The Tool Supports the Process

Of course, we should be using tools to support the process. (The process is designed to produce the results or outcomes that we need). One example of such is the Risk Assessment process from ISO 31000, below.

Shows the elements, progression and cycle of the Risk Assessment Process from ISO 31000
The Risk Assessment Process

We want our HTS to support this process, storing the data that we get from the risk identification, analysis, and evaluation activities. We also want our Hazard Log to provide information that enables communication and consultation as well as monitoring and review (perhaps using a risk matrix).

Other Functions

Hazard Logs and HTS also perform many other functions. These may appear mundane, but when they go wrong they suddenly become very exciting! What Have Hazard Logs Ever Done for Us? They help us avoid these unwanted excitements, by providing:

Questions? Comments? Send me your feedback in the comments, below.

My name’s Simon Di Nucci. I’m a practicing system safety engineer, and I have been, for the last 25 years; I’ve worked in all kinds of domains, aircraft, ships, submarines, sensors, and command and control systems, and some work on rail, air traffic management systems, and lots of software safety. So, I’ve done a lot of different things!

Optimizing Safety: Active Hazard Management with Hazard Logs

In ‘Optimizing Safety: Active Hazard Management with Hazard Logs’ we look at how to unleash the power of this underrated tool!

Introduction

A Hazard Log is more than just a record; it’s a dynamic tool for actively managing safety risks associated with systems. This continually updated log encapsulates Hazards, Accident Sequences, and Accidents, ensuring a structured approach to risk management. Dive into the world of Hazard Logs to discover their application, advantages, and best practices for effective use.

Active Management with Hazard Logs

Overview

A Hazard Log serves as an ongoing record, meticulously updated to capture Hazards, Accident Sequences, and Accidents linked to a system. It acts as a comprehensive repository, providing insights into risk management decisions for each Hazard and Accident.

The Hazard Log is a structured method of keeping and referring to Safety Risk Evaluations and other information pertaining to a piece of equipment or system. It is the primary means of monitoring the status of all identified hazards, decisions made, and risk-reduction actions taken, and should be used to assist oversight by the Project Safety Committee and other stakeholders.

Hazards, Accident Sequences, and Accidents noted are those that could potentially occur as well as those that have already occurred. The title Hazard Log may be deceptive because the information saved relates to the overall Safety Programme and includes Accidents, Controls, Risk Evaluation, ALARP/SFARP rationale, and Hazard data.

Utilization and Administration

The Hazard Log is administered by a dedicated Hazard Log Administrator, who has primary access to add, edit, or close data records. All other personnel have read-only access, ensuring visibility of Hazards while maintaining control. Records are tracked using a status field, indicating stages such as open, awaiting mitigation confirmation, or ALARP/SFARP justification.

Recording Hazards

As best practice, each Hazard is initially recorded as “open,” with ALARP/SFARP arguments treated as provisional until mitigation actions are confirmed. Hazards are not deleted but closed with appropriate justifications, reflecting changes in relevance.

As an example, suppose the mitigation is contingent on the development of an operational procedure. This may not be developed until long after the Hazard has been discovered in the early stages of design or construction.

Hazards should not be erased from the Hazard Log, but rather closed and labeled “out of scope” or “not considered credible” with adequate justification. If such Hazards are no longer thought to be relevant to the system, the Log entry should be modified to reflect this.
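The "close, never delete" rule can be sketched as a small record type. This is an illustrative assumption about how such a record might be modelled, not a description of any specific tool: the class, field and status names are invented, and the record offers a close operation that insists on a justification but no delete operation at all.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Hypothetical status values, loosely based on the stages named above.
    OPEN = "open"
    AWAITING_MITIGATION_CONFIRMATION = "awaiting mitigation confirmation"
    CLOSED = "closed"


@dataclass
class HazardRecord:
    hazard_id: str
    description: str
    status: Status = Status.OPEN
    history: list = field(default_factory=list)

    def close(self, reason, justification):
        """Close the record, keeping it in the log with its justification."""
        if not justification.strip():
            raise ValueError("closing a hazard requires a justification")
        self.status = Status.CLOSED
        self.history.append((reason, justification))


# Invented example record.
record = HazardRecord("H-001", "Example hazard raised at early design")
record.close("not considered credible", "Design change removed the energy source")
print(record.status.value, record.history)
```

Keeping the closure reason and justification on the record preserves the audit trail that a deleted entry would lose.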

Application in Systems

The Hazard Log should focus on a specified system, detailing its scope and safety requirements. It records the evaluation of Hazards, residual risk assessments, and recommendations for mitigation or formal acceptance with ALARP/SFARP justification.

Because a Hazard Log is an organised method of collecting and referencing data and records on Hazards, as well as documenting the Risk Evaluation and other information relevant to an equipment or system, unambiguous cross-referencing to supporting documentation is critical. The supporting documentation can be directly incorporated in the Hazard Log or cross-referenced.

Establishing a Hazard Log: Why and When

Traceability

A Hazard Log is crucial for projects, offering traceability in the decision-making process, and justifying the assessed Safety Risk. Initiated at the program’s earliest stage, it remains a live document throughout the system life cycle.

As modifications are implemented in the system, the Hazard Log should be updated to reflect the current design standard by including new or changed Hazards and the associated residual risks. The Hazard Log must be checked frequently to verify that hazards are being managed effectively and that compelling safety arguments in the Safety Case can be created.

Advantages & Disadvantages

Advantages

The Hazard Log is a traceable record of the Project’s Hazard Management process and thus:

  • Ensures that the Project Safety Programme uses a consistent set of Safety information;
  • Facilitates oversight by the Safety Panel and other stakeholders of the current status of the Safety activities; and
  • Supports the effective management of possible Hazards and Accidents so that the associated Risks are reduced to and maintained at a tolerable level.

Disadvantages

  • The Hazard Log could include information about the relationship between hazards, accidents, and their control through the establishment and fulfilment of Safety Requirements. However, if it is not robust or well-structured, this may obscure the identification and clearance of Hazards.
  • If Hazards are not well defined when they are entered into the Hazard Log, the rigour enforced by the need for a clear audit trail of changes made may make it very difficult to maintain the Hazard and Accident records most effectively. Before beginning data entry, an appropriate structure should be created and agreed upon.

Choosing the Right Format: Electronic vs. Paper-Based

Electronic Format

While a Hazard Log can be produced in any format, an electronic format, often in databases like Microsoft Access or SQL Server, ensures quick cross-referencing and traceability. Proprietary tools like Cassandra or spreadsheet packages like Microsoft Excel offer flexibility.

Bespoke vs. Proprietary

Choosing between a bespoke database and a proprietary tool involves considerations of customizability and standardization. A bespoke system may be simple to administer, while a proprietary tool ensures consistency across programs.

In conclusion, Hazard Logs, when actively managed, emerge as indispensable tools for maintaining safety standards and facilitating informed decision-making. Understanding their application and choosing the right format ensures efficient risk management throughout a system’s life cycle.

We will explore more active hazard management in our upcoming blog post using Cassandra as a case study.

That was ‘Optimizing Safety: Active Hazard Management with Hazard Logs’. See another of my articles on hazard logs here. I hope that you find them useful: leave a comment, below!

Hazard Logs and Hazard Tracking Systems

In this blog post and video ‘Hazard Logs and Hazard Tracking Systems’, I’m going to tell you about their benefits and features.

In many industries, we are required to create a hazard log: perhaps by a regulator, a customer, or a prime contractor. Or maybe it’s “just the way we do it round here”. Whatever the reason, many junior staff will be given responsibility for entering data into a hazard log.

Hazard Logs enable us to manage large amounts of safety data and references, but only if they are implemented properly. Unfortunately, it seems that there are an infinite number of ways of not doing them well. In my 25+ years in System Safety, I’ve seen many bad hazard logs, so I created this lesson to help you get the basics right.

This is the trailer for the full, 35-minute lesson.

Topics

I’m going to be covering these topics, which are the most commonly asked questions:

  • What is a hazard log? (What is it and what do we do with it?)
  • The key elements of a hazard log (what needs to be in it to make it work?)
  • Hazard Log management (what do we need to do?)
  • What about hazard log tools? (What can we use to create a hazard log?)
  • What’s the difference between a hazard log and a risk assessment?
  • What’s the difference between a hazard log and a risk register?

Transcript

Hi everyone, and welcome to the Safety Artisan.

I’m Simon and today we’re going to be talking about Hazard logs and hazard tracking systems.

As I said, we’re going to look at hazard logs and hazard tracking systems and we’re going to be answering the most popular questions.

The most often asked questions about Hazard logs and Hazard Tracking Systems that you will find on the internet. So that’s what we’re going to answer.

And this is going to be the first of three sessions on this subject.

Slide: Topics

Topics for this session. Right now commonly asked questions are:

  • What is a hazard log? What is it and what do we do with it?
  • The key elements of a hazard log: What needs to be in it to make it work?
  • Hazard Log management: What do we need to do?
  • What about hazard log tools? What can we use to create a hazard log?

Effectively now we’ll be looking at that in much more detail in sessions two and three. But we’ll just go over the basics today and then also, some very common questions:

  • What’s the difference between a hazard log versus a risk assessment? and
  • What’s the difference between a hazard log and a risk register?

And when I say Hazard Log, you can substitute [the phrase] hazard tracking system at all times. They’re really one and the same thing, which we will talk about.

[End of Trailer.]

See also a 10% free sample of the full video.

Related Articles

See also this info-post on Hazard Logs and there is another post to come on how a relational database can deliver a ‘Full Function’ Hazard Log.

Questions about HLs & HTS?

Ask me in the comments.

Proportionality

Proportionality is about committing resources to the Safety Program that are adequate – in both quality and quantity – for the required tasks.

Introduction to Proportionality

Proportionality is a concept that should be applied to determine the allocation of resource and effort to a safety and environmental argument based on its risk.  It is a difficult concept to attempt to distil into a process as each Product, System or Service will have different risks, objectives, priorities and interfaces that make a ‘one size fits all’ approach impossible.

This section describes an approach that may be used to assist in applying the concept of proportionality; it seeks to guide you in understanding where a proportionate amount of effort can be directed, while at the same time maintaining the overriding principle that Risk to Life must be managed.  Regulators require that a proportional approach is used and there are many methods that try to achieve this.  Some focus on the amount of evidence needed to justify a safety argument; some place more emphasis on the activities that are required to make a safety argument; and some consider that fulfilling certain criteria can lead to an assessment of risk.  However, one requirement at the centre of any proportional approach is that safety risks are acceptable.

A fundamental consideration of a proportional approach is considering compliance against assessment criteria.  The Health and Safety Executive’s view is that there should be some proportionality between the magnitude of the risk and the measures taken to control the risk. The phrase “all measures necessary” should be interpreted with this principle in mind. Both the likelihood of accidents occurring and the severity of the worst possible accident determine proportionality.  Application of proportionality should highlight the hazardous activities for which the Duty Holder should provide the most detailed arguments to support the demonstration [that risk is acceptable].

The following considerations may affect proportionality, in a defence context:

  1. Type of consequence;
  2. Severity;
  3. The stage in the Life cycle;
  4. Intended use (CON OPS/Design Intent);
  5. Material state (degradation);
  6. Historical performance;
  7. Cost of safety;
  8. Cost of realising risk;
  9. Public Relations;
  10. Persons at Risk:
    1. 1st, 2nd, 3rd Party;
    2. Military;
    3. Civilian;
    4. Civil Servants;
    5. Contractors;
    6. General public;
    7. VIPs;
    8. Youths;
  11. Volume;
  12. Geographical spread/transboundary.

Some important points that should be noted regarding a safety and environmental proportionality approach are that:

  1. Proportionality is inherent to safety and environmental risk assessment (i.e. use of ALARP, BPEO, etc.);
  2. Proportionality is explicitly linked to risk;
  3. Multiple factors need to be considered when deciding a proportional approach;
  4. ASEMS is the mandated safety and environmental framework; therefore, the framework should be applied; it is not possible to develop a proportional approach that negates any part of ASEMS.

Waterfall Approach Process

The model that should be used to consider a proportional approach is intended to provide guidance and should only be used by competent safety and environmental practitioners.  A degree of judgement should be used when answering questions, particularly where a Product, System or Service may easily be classified in more than one category; this is why the use of competent safety and environmental practitioners is required.

The waterfall approach model categorises Product, System or Service risk in accordance with factual questions, presented on the left of the diagram below, which are asked about the intended function and operation.  Each question should be used to define the cumulative potential risk, which may be presented by the Product, System or Service.  The Product, System or Service is categorised into one of three risk bands, which align to those defined in the Tolerability triangle, presented on the right of the diagram.

During the process two initial questions are asked, where an answer of “yes” will automatically result in a categorisation of high risk, regardless of the answer to subsequent questions.  Further refinement is required for lower risk systems to ensure that the system risk is categorised appropriately.

Figure 1, Proportionality Waterfall Approach Model

The diagram above depicts the proportionality waterfall approach model used for the application of ASEMS.

Adherence to ASEMS is mandatory for DE&S.  As such, it is not possible to develop a proportional approach that negates any individual part of ASEMS and so the procedures described in ASEMS Part 2 – Instructions, Procedures and Support should be followed;  where proportionality may be applied is within each General Management Procedure, Safety Management Procedure or Environmental Management Procedure for the allocation of resource, time or effort.

Once the risk category has been established guidance is defined which prescribes the rigour which should be applied to the safety assessment process in terms of Process, Effort, Competence, Output, Assurance (PECOA):

  1. Process – the amount of dedicated/specific process and the level of intervention in the organisational structure at which the Safety and Environmental Management Systems are established;
  2. Effort – how much time is afforded to the management of risk;
  3. Competence – the level of competence that is required to conduct appropriate assessment and management of safety and environmental risk;
  4. Output – the detail of evidence and reporting, which should be commensurate with the level of risk;
  5. Assurance – the level of assurance that should be applied to the process.

Guidance for the application of PECOA is provided in the table below.  It should be noted that this is indicative guidance for illustrative purposes only. It is a fundamental requirement of ASEMS safety management principles that all safety decisions made should be reviewed, assessed and endorsed by a Safety and Environmental Management Committee to ensure that the Products, Systems and Services categorisation is correct. The diagram below shows the process that may be applied:

Proportionality Process

It should be remembered that using this low/medium and high categorisation could be misleading as the model takes no account of the population or rate of occurrence of the harm. A simple system that can only cause minor injury could still have a high degree of risk if there are lots of people exposed to the risk and the accident rate was high.  Moreover, acceptance of such a situation could lead to the development of an ineffective safety culture or the bypassing of safety mitigation procedures in order to avoid a high accident/minor injury position.  This is where the application of competent safety and environmental advice is essential to ensure that any proportionality model is not slavishly followed at the expense of proper rigour.   Where this model is useful is assisting those safety and environmental professionals to perform a preliminary assessment regarding what Products, Systems or Services are a priority for the allocation of resource, time or effort.

Stage One – System type and Life Cycle Phase

The first question is used to indicate, at a high level, the likely degree of risk for a project.  It should be noted that this is not a definitive assessment and that Products, Systems or Services could move within the model as the safety or environmental evidence is assessed.  There will be a degree of pre-existing assessment which accompanies a Product, System or Service and this may be used to assist with this initial question. 

The safety and environmental assessment process should be closely aligned with the Product, System or Service development process for newly developed Products, Systems or Services.  Where Products, Systems or Services are in the Concept, Assessment, Development or Manufacture phase of the CADMID/T cycle, they should be accompanied by a safety and environmental assessment process which utilises quantitative assessment techniques.

Where a Product, System or Service sits in the CADMID/T cycle should not influence the rigour of any safety or environmental argument; this model is provided to assist with any determination of the resource, time or effort that may be applied to the evidence to support the argument.  All Risk to Life should be ALARP, with no exception; what changes is the allocation of resources, time and effort to reach that judgement.

Those Products, Systems or Services where the expected worst credible consequence results in, at worst, a single minor injury should automatically be categorised as LOW risk and a qualitative approach may be adopted.

Commercial Off The Shelf or Military Off The Shelf systems should be accompanied by evidence which may be used in the safety and environmental assessment to demonstrate that they are acceptably safe and environmentally compliant, particularly where these are manufactured for use in the EU, where each Product, System or Service should demonstrate compliance with the applicable EU standards.  That the Product, System or Service is Commercial Off The Shelf or Military Off The Shelf is not, in itself, evidence.

Such evidence should include test evidence, trials evidence or a certificate of conformance.  Where a Commercial Off The Shelf or Military Off the Shelf system is already in the in-service phase and it is established that there is sufficient evidence to form a compelling safety argument that the Risk to Life is ALARP, then the system should be categorised as MEDIUM-LOW.  Where the system is also non-complex then it may be categorised as LOW.

Such Commercial Off The Shelf or Military Off the Shelf evidence should only be relied upon where it is established that this evidence is sufficient to demonstrate that the system is acceptably safe and environmentally compliant and already in existence.  The degree and appropriateness of evidence should be established by a Safety and Environmental Management Committee, with particular emphasis upon the quality of the evidence for high-risk systems.  This approach should be undertaken if the Product, System or Service in its entirety is categorised as Commercial Off The Shelf or Military Off the Shelf.  Where only sub-systems or components are Commercial Off The Shelf or Military Off the Shelf, the Product, System or Service should be categorised as bespoke and assessed accordingly.

Stage Two – Risk estimation and System Complexity

Any estimation of the risk that a Product, System or Service is likely to present should be used to further refine its categorisation.  If the worst credible consequence of a Product, System or Service is multiple fatalities then that Product, System or Service should automatically be categorised as HIGH risk.

If the worst credible consequence is a single fatality or multiple severe injuries then the system complexity should be considered further to refine and inform the categorisation.  Complex or novel system designs should have a higher degree of Suitably Qualified Experienced Personnel to conduct the safety and environmental assessment.  Accordingly, those Products, Systems or Services which are complex and novel should also be categorised as HIGH whereas those exhibiting a lower degree of complexity might be categorised as MEDIUM.

Notwithstanding this, those Products, Systems or Services that are in the Concept, Assessment, Development or Manufacture/Termination phase of the CADMID/T cycle should still be supported by a quantitative safety and environmental process.  The only exceptions are those Products, Systems or Services where the worst credible consequence is a single minor injury.  These should be categorised as LOW risk and may be supported by a qualitative safety and/or environmental process.

LOW risk Products, Systems or Services where the worst credible consequence is at worst a single minor injury should be categorised as LOW-MEDIUM risk where the design is complex or novel; those exhibiting a lower degree of complexity should be categorised as LOW risk.
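The Stage Two rules above can be sketched as a small decision function. This is only a reading of the text in this section, under stated assumptions: the function name, the consequence labels and the two boolean flags are invented for illustration, consequences that the text does not cover are rejected instead of guessed, and the result is a preliminary categorisation that would still need review and endorsement by competent practitioners and the Safety and Environmental Management Committee.

```python
def stage_two_category(worst_consequence, complex_design, novel_design):
    """Preliminary risk category from the worst credible consequence.

    A hedged sketch of the Stage Two text only; it is not the full
    waterfall model and does not replace practitioner judgement.
    """
    if worst_consequence == "multiple fatalities":
        # Automatically HIGH, regardless of complexity.
        return "HIGH"
    if worst_consequence in ("single fatality", "multiple severe injuries"):
        # Complex and novel designs are HIGH; others might be MEDIUM.
        return "HIGH" if (complex_design and novel_design) else "MEDIUM"
    if worst_consequence == "single minor injury":
        # LOW, raised to LOW-MEDIUM where the design is complex or novel.
        return "LOW-MEDIUM" if (complex_design or novel_design) else "LOW"
    # The section does not state a rule for other consequences.
    raise ValueError(f"no rule stated for consequence: {worst_consequence!r}")


print(stage_two_category("single fatality", complex_design=True, novel_design=False))
```

Raising an error for unlisted consequences is a deliberate choice here: it forces a person to make the call wherever the written guidance is silent.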

Once the risk category has been established the rigour which should be applied to the safety assessment process in terms of Process, Effort, Competence, Output, Assurance (PECOA) should be defined.  This is summarised below:

Program Scale and Lifecycle Stage
  • Small scale or no Critical Function: CADMID/T | CADMID/T | CADMID/T
  • Large Scale Capital, Critical Function or bespoke: CADMID/T | CADMID/T | CADMID/T

Assessment: High | Medium | Low

Process
  • High: A rigorous quantitative safety and environmental assessment process should be applied.
  • Medium: Consideration should be given to the application of a qualitative safety and environmental assessment process.  Functional safety/environmental assessment may be required, if identified as a risk control measure.
  • Low: A qualitative safety and environmental assessment process should be appropriate for low risk, low complexity systems.

Effort
  • High: Significant effort should be expended developing the safety and environmental case.
  • Medium: A medium level of effort should be apportioned to development of the safety and environmental case, increasing for newly developed systems.
  • Low: A medium level of effort should be apportioned to development of the safety and environmental case.

Competence
  • High: The safety and environmental assessment and assurance programme should be led by individuals who are experts.  Remaining personnel should be at least Practitioners who should be provided with oversight where appropriate.
  • Medium: Personnel engaged in the safety and environmental assessment and approval should be at least practitioners.
  • Low: Personnel engaged in the safety and environmental assessment and approval should be at least supervised practitioners who should be provided with oversight where appropriate.

Output
  • High: A safety and environmental case should be developed which includes a safety argument.  The safety assessment process should be substantiated by quantitative evidence.
  • Medium: A safety and environmental case should be developed, which should include a safety and environmental argument for all but simplex low risk systems.  The safety assessment process should be substantiated by quantitative evidence for newly developed systems.
  • Low: A safety and environmental statement may be considered for systems which are low risk and low complexity.

Assurance
  • High: The safety and environmental assessment should be independently assured.
  • Medium: Independent assurance should be considered and applied to those projects which are considered to be novel or complex.  Assurance may be conducted at Committee level.
  • Low: Independent assurance is not required.

ASEMS Guidance
  • High: Safety and Environmental: Dedicated tailored and full implementation of all Clauses, articulated through adherence to all GMPs, SMPs and EMPs.
  • Medium: Safety and Environmental: Apply full implementation of all Clauses, in line with guidance provided for the Functional safety/environmental assessment, as required, if identified as a risk control measure and application of GMPs, SMPs and EMPs.
  • Low: Where Project Teams have an overarching Safety and Environmental Management System in place: Safety: Gather sufficient evidence to support the safety argument and document it in a Safety Case/Assessment in accordance with SMP 04, 05, 06, 09 and 12.  Environmental: Gather sufficient information in order to produce an Environmental Impact Statement in accordance with EMP 07 – Environmental Reporting.

Process

The type of safety and environmental process which should be applied is dependent both upon the Product, System or Service categorisation and the phase of the CADMID/T cycle that the project is in.  Newly developed MEDIUM-LOW to HIGH category Products, Systems or Services which are in the Concept, Assessment, Development or Manufacture phase of the cycle should have a quantitative safety and environmental assessment process applied; the depth and rigour of the assessment should be proportionate to its classification.  LOW risk Products, Systems or Services where the worst credible consequence is anticipated to be no greater than one minor injury may be assessed qualitatively.

A qualitative safety and environmental assessment process should be applied to Products, Systems or Services in the In-Service or Disposal/Termination phase where it is deemed that sufficient evidence already exists to demonstrate that they are acceptably safe. In these circumstances the qualitative process should be applied to assess the in-service risks.
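The selection rules in the two paragraphs above can be sketched as a small decision function. This is an illustration only: the names `Category` and `assessment_type`, the argument names and the `"refer"` result are assumptions made for the example and are not part of ASEMS. Any combination that the text does not cover is deliberately returned as `"refer"`, so that a competent committee makes the call.

```python
from enum import Enum


class Category(Enum):
    LOW = 1
    MEDIUM_LOW = 2
    MEDIUM = 3
    HIGH = 4


# CADMID/T phases, grouped the way the text groups them.
EARLY_PHASES = {"Concept", "Assessment", "Development", "Manufacture"}
LATE_PHASES = {"In-Service", "Disposal/Termination"}


def assessment_type(category, phase, newly_developed,
                    sufficient_evidence=False, worst_case_minor_injury=False):
    """Return 'quantitative', 'qualitative' or 'refer' (case not covered by the text)."""
    # LOW risk, worst credible consequence no greater than one minor injury.
    if category is Category.LOW and worst_case_minor_injury:
        return "qualitative"
    # Newly developed MEDIUM-LOW to HIGH items in the early phases.
    if phase in EARLY_PHASES and newly_developed and category is not Category.LOW:
        return "quantitative"
    # In-service or disposal items with sufficient existing evidence.
    if phase in LATE_PHASES and sufficient_evidence:
        return "qualitative"
    # Hypothetical fallback: hand the decision to the committee.
    return "refer"


print(assessment_type(Category.HIGH, "Development", newly_developed=True))
```

The result of such a function is only a starting point; the depth and rigour still need to be agreed by the people accountable for the decision.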

The approach is systematic and logical: it categorises the resource, time and effort required to support any argument that a Product, System or Service is acceptably safe or causes no significant damage to the environment. It also advocates the application of ASEMS in its entirety, prescribing the level of rigour which should be applied in terms of process, effort, competence, output and assurance.

Effort

The effort apportioned to the safety and environmental process should be proportionate to the classification of the system.  A significant amount of rigour should be applied to those projects requiring quantitative assessment processes, particularly those with the highest degree of risk and complexity.

If a Product, System or Service is assessed to be in a particularly low category and is simple, it may not be necessary to undertake the full scope of risk management procedures. In these circumstances a certificate of conformance may be sufficient, which may be supported by a statement to that effect from the Safety and Environmental Management Committee.

All decisions made regarding the evidence required to justify a safety argument (regardless of risk) should be endorsed by a Safety and Environmental Management Committee. Whether this decision is delegated further for those Products, Systems or Services that are low risk is for the Duty Holder to determine, as all decisions regarding Risk to Life are made on their behalf.

Competence

The safety and environmental lead should be an expert for HIGH category projects, or for MEDIUM category projects where the Product, System or Service is particularly complex or a novel design. The remaining personnel engaged on such projects should be at least practitioner level. A competency assessment should be undertaken which should be endorsed by a Safety and Environmental Management Committee.

The safety and environmental lead for MEDIUM category projects should be at least practitioner level.  The remaining personnel engaged on such projects should be practitioner or supervised practitioner where appropriate supervision is in place.  A competency assessment should be undertaken which should be endorsed by a Safety and Environmental Management Committee.

The safety and environmental lead for LOW category projects should be at least practitioner level or a supervised practitioner with appropriate supervision in place.

Competency requirements relating to specific safety and environmental processes defined in ASEMS should be applied where those processes are undertaken.

Output

A safety and environmental case should be developed for HIGH category projects which includes a safety and environmental argument, developed using Claims Arguments Evidence (CAE) or Goal Structuring Notation (GSN).  The argument should be substantiated by quantitative evidence such as reliability data or the output from quantitative safety assessment processes.

A safety and environmental case should be developed for MEDIUM category projects which includes a CAE or GSN safety argument. The quality and depth of evidence required to substantiate the safety and environmental argument should be proportionate to the classification of the Product, System or Service. Products, Systems or Services with increased complexity or higher degrees of risk should be substantiated by quantitative evidence.

A Safety and environmental case should be developed for MEDIUM-LOW category Products, Systems or Services.  A safety and environmental argument should be included for those Products, Systems or Services which are particularly complex or novel or those which exhibit an increased degree of risk.

A safety and environmental case or safety and environmental statement should be developed for LOW category Products, Systems or Services. A certificate of conformance may be adequate for the lowest risk simple Products, Systems or Services.

All decisions made regarding the evidence required to justify a safety argument (regardless of risk) should be endorsed by a Safety and Environmental Management Committee. If this decision is delegated further for those Products, Systems or Services that are considered to fall in the low category, then it is for the Duty Holder to determine (as all decisions regarding Risk to Life are made on their behalf) whether to accept the risks or not.

Assurance

HIGH and MEDIUM category projects should be independently reviewed by a Safety and Environmental Auditor.  The degree of Independent Safety and Environmental Auditor engagement should be proportionate to the project categorisation.

MEDIUM-LOW category projects should be independently reviewed by a Safety and Environmental Auditor where the safety and environmental assessment processes applied are novel or complex. Justification should be provided where an Independent Safety and Environmental Auditor is not appointed.

It is not necessary for projects categorised LOW to be independently reviewed.
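As a rough sketch, the assurance rules above reduce to a lookup on the project category. The function name `independent_review` and the shape of its return value are assumptions for illustration; they do not come from ASEMS.

```python
def independent_review(category, novel_or_complex=False):
    """Return (review_required, justification_needed) for a project category.

    justification_needed is True when no Independent Safety and Environmental
    Auditor is appointed for a MEDIUM-LOW project, which the text says
    should be justified.
    """
    if category in ("HIGH", "MEDIUM"):
        return True, False
    if category == "MEDIUM-LOW":
        # Review applies where the assessment processes are novel or complex.
        return novel_or_complex, not novel_or_complex
    if category == "LOW":
        return False, False
    raise ValueError(f"unknown category: {category!r}")


for cat in ("HIGH", "MEDIUM", "MEDIUM-LOW", "LOW"):
    print(cat, independent_review(cat))
```

The degree of auditor engagement for HIGH and MEDIUM projects is a matter of proportion, which a simple true/false flag cannot capture.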

It should be remembered that it is not prudent to adopt any form of autocratic system or approach without sufficient validation, verification and endorsement by competent and duly authorised individuals, who are considered Suitably Qualified and Experienced Personnel for the role. Endorsement of decisions should be made by a competent panel or committee, as part of the overall hazard analysis and risk assessment, and any variation in opinion from that presented by any proportionality model should be managed by such a panel.

If you found this post on Proportionality helpful, please leave a review.

If this post is missing something you wanted, please let me know!

Hazard Logs – a Brief Summary

In Hazard Logs – a Brief Summary, we will give you an overview of this important safety management tool. This post serves as an introduction to longer posts and videos (e.g. Hazard Logs & Hazard Tracking Systems), which will provide you with much more content.

Description of Hazard Log

A Hazard Log is a continually updated record of the Hazards, Accident Sequences, and Accidents associated with a system. It includes information documenting risk management for each Hazard and Accident.

The Hazard Log is a structured means of storing and referencing Safety Risk Evaluations and other information relating to a piece of equipment or system. It is the principal means of tracking the status of all identified Hazards, decisions made and actions undertaken to reduce risks. It should be used to facilitate oversight by the Project Safety Committee and other stakeholders.

The Hazards, Accident Sequences, and Accidents recorded are those which could conceivably occur, as well as those which have already been experienced. The term Hazard Log may be seen as misleading since the information stored relates to the entire Safety Programme and covers Accidents, Controls, Risk Evaluation, and ALARP/SFARP justification, as well as data on Hazards.

Operation

The Hazard Log is maintained by a Hazard Log Administrator, who is responsible to the Project Safety Engineer/Manager. The Hazard Log Administrator has primary access to the Hazard Log allowing him/her to add, edit or close data records. All other personnel requiring access to the Hazard Log are normally allowed read-only access. This allows for visibility of Hazards to all but limits the control/administration of data records to the Hazard Log Administrator.

Records can be tracked by the use of a status field. This, for example, identifies whether the record has just been opened, is awaiting confirmation of mitigation actions, or is ALARP/SFARP.

It is best practice for the Hazard Log to record each Hazard as “open” and for ALARP/SFARP arguments to be provisional until all mitigation actions are confirmed to be satisfactorily completed. An example is where the mitigation depends upon the production of an operational procedure that may not be written until well after the Hazard is first identified in the early stages of design or construction.

Hazards should not be deleted from the Hazard Log, but closed and marked as “out of scope” or “not considered credible”, together with appropriate justification. Where such Hazards are no longer considered relevant to the system, the Log entry should be updated to reflect this.
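The record-keeping rules above can be sketched in a few lines. Everything here is hypothetical and for illustration only: the class names `HazardRecord` and `HazardLog`, the field names and the exact status labels are assumptions, not the structure of any real Hazard Log tool. The sketch shows three points from the text: new records start as "open", an ALARP/SFARP claim stays provisional until mitigations are confirmed, and records are closed with a justification rather than deleted.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    OPEN = "open"
    AWAITING_MITIGATION = "awaiting confirmation of mitigation actions"
    ALARP = "ALARP/SFARP"
    CLOSED = "closed"


@dataclass
class HazardRecord:
    hazard_id: str
    description: str
    status: Status = Status.OPEN          # every Hazard starts as "open"
    closure_reason: str = ""
    justification: str = ""
    history: list = field(default_factory=list)  # earlier statuses, for the audit trail

    def _set(self, status):
        self.history.append(self.status)
        self.status = status

    def declare_alarp(self, mitigations_confirmed):
        # The ALARP/SFARP argument is provisional until mitigations are confirmed.
        self._set(Status.ALARP if mitigations_confirmed else Status.AWAITING_MITIGATION)

    def close(self, reason, justification):
        if reason not in ("out of scope", "not considered credible"):
            raise ValueError("unrecognised closure reason")
        if not justification:
            raise ValueError("closure needs a justification")
        self.closure_reason = reason
        self.justification = justification
        self._set(Status.CLOSED)


class HazardLog:
    """Holds records; deliberately has no delete method."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.hazard_id] = record

    def get(self, hazard_id):
        return self._records[hazard_id]
```

A real tool would also enforce who may call these methods, since only the Hazard Log Administrator should add, edit or close records.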

Application

In general, the Hazard Log should relate to a specified system and record its scope of use, together with the safety requirements. When Hazards are identified, the Hazard Log should show how these Hazards were evaluated and note the resulting residual risk assessment; the Hazard Log should then record any recommendations for further action to mitigate the Hazards, or formally document acceptance of the Hazards and any ALARP/SFARP justification.

Since a Hazard Log is a structured way of storing and referencing data and records on Hazards, documenting the Risk Evaluation and other information relating to a piece of equipment or system, clear cross-referencing to supporting documentation is essential. The supporting documentation can be either directly embedded or cross-referenced within the Hazard Log.

When it Might be Used

A Hazard Log should be established for all projects. This will allow full traceability of the formal decision process which would justify the assessed level of Safety Risk.

The Hazard Log is established at the earliest stage of the program and should be maintained throughout the system life cycle as a “live” document or database. As changes are integrated into the system, the Hazard Log should be updated to incorporate added or modified Hazards and the associated residual risks noted to reflect the current design standard.

It is essential that the Hazard Log is reviewed at regular intervals, to ensure that Hazards are being managed appropriately and enable robust safety arguments in the Safety Case to be established.

Advantages, Disadvantages, and Limitations

Advantages 

The Hazard Log contains the traceable record of the Hazard Management process for the Project and therefore:

  • Ensures that the Project Safety Programme uses a consistent set of Safety information;
  • Facilitates oversight by the Safety Panel and other stakeholders of the current status of the Safety activities;
  • Supports the effective management of possible Hazards and Accidents so that the associated Risks are reduced to and maintained at a tolerable level;
  • Provides traceability of Safety decisions made.

Disadvantages 

  • The relationship between Hazards, Accidents, and their management through setting and meeting Safety Requirements could be included within the Hazard Log. However, if it is not sufficiently robust or well-structured, this may obscure the identification and clearance of Hazards;
  • If Hazards are not well defined when they are entered into the Hazard Log, then the rigor enforced by the need for a clear audit trail of changes made may make it very difficult to maintain the Hazard and Accident records in the most effective way. An appropriate structure should therefore be designed and agreed upon before data entry starts.

Comments

A Hazard Log can be produced in any format, but an electronic format is the most common, as this tends to provide the quickest means of cross-referring and providing traceability through the Hazard Log. A paper-based Hazard Log would have limitations for most defense Systems as it would become large, staff-intensive, and cumbersome as the System developed. This in turn introduces a significant maintenance overhead for a project.

The electronic form of the Hazard Log can be developed using database development tools such as Microsoft Access or SQL Server. Alternatively, you can use an existing application such as DOORS, or a simple spreadsheet package such as Microsoft Excel. The UK Ministry of Defence’s preferred Hazard Log tool was Cassandra, a proprietary database based upon Microsoft Access. (We will use Cassandra as an example in another blog post.)

A bespoke Database enables the originator to custom define fields appropriate to the System. Conversely, a proprietary tool allows for a consistent and standardized approach across a range of programs. A bespoke system may be relatively simple to administer and manipulate, whereas a proprietary tool may require external training. Widespread use of different bespoke solutions may become unmanageable.
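To make the bespoke database option a little more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names and sample rows are invented for illustration and do not reflect Cassandra or any other real tool. The point is the link table, which lets one hazard have several controls and one control apply to several hazards, with a cross-reference to supporting evidence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hazard (
    id     INTEGER PRIMARY KEY,
    title  TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'open'
);
CREATE TABLE control (
    id           INTEGER PRIMARY KEY,
    title        TEXT NOT NULL,
    evidence_ref TEXT            -- cross-reference to supporting documentation
);
-- Link table: many hazards to many controls.
CREATE TABLE hazard_control (
    hazard_id  INTEGER NOT NULL REFERENCES hazard(id),
    control_id INTEGER NOT NULL REFERENCES control(id),
    PRIMARY KEY (hazard_id, control_id)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO hazard (id, title) VALUES (?, ?)",
                 [(1, "Loss of braking"), (2, "Uncommanded acceleration")])
conn.executemany("INSERT INTO control (id, title, evidence_ref) VALUES (?, ?, ?)",
                 [(1, "Operator procedure", "DOC-001"),
                  (2, "Independent speed limiter", "DOC-002")])
conn.executemany("INSERT INTO hazard_control VALUES (?, ?)",
                 [(1, 1), (2, 1), (2, 2)])


def controls_for(hazard_id):
    """List the titles of the controls linked to one hazard."""
    rows = conn.execute(
        "SELECT c.title FROM control c "
        "JOIN hazard_control hc ON hc.control_id = c.id "
        "WHERE hc.hazard_id = ? ORDER BY c.id",
        (hazard_id,),
    )
    return [title for (title,) in rows]


print(controls_for(2))
```

A spreadsheet struggles with exactly this kind of cross-linking, which is one reason a database is usually the better fit as the log grows.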

Sources of Additional Information

Additional guidance on the Hazard Log can be found within the following references: MoD’s Project-Oriented Safety Management System – procedure SMP11 – Hazard Log.  An example Hazard Log structure is also presented there.

Copyright Acknowledgement

In this article, I have used material from a UK Ministry of Defence guide. It is reproduced under the terms of the UK’s Open Government Licence.

Hazard Logs – a Brief Summary: Ask Me Anything!

Safety Management Policy

In this post on Safety Management Policy, we’re going to look at the policy requirements of a typical project management safety standard. This is the Acquisition Safety and Environmental Management System (ASEMS).

The Ministry of Defence is the biggest acquirer of manufactured goods in the UK, and it uses ASEMS to guide hundreds of acquisition projects. They will range from the development of large, complex systems to buying simpler off-the-shelf items.

(You may be aware that the UK Ministry of Defence has a terrible record of project failure. I have personal experience of working on both sides of contracts – for buyer and seller. I can tell you that they would have done better if they had followed ASEMS more carefully. The standard is good, but no standard can help if you don’t use it!)

The policy clauses listed here are typical of many found around the world. There is a lot to be learned by studying them.

Safety Management Policy – Overview

ASEMS Part 1 – Policy comprises a series of policy statements grouped in six loosely related sections as follows:

Part 1 – General Clauses

These clauses represent those overarching general requirements that shall be used in all instances. If a clause is self-explanatory, ASEMS – Part 2 Instructions, Guidance, and Support may contain no explicit Instructions for it; where Instructions and Guidance are provided, they give a best practice method for compliance.

Clause 1.1 Conform to Secretary of State for Defence’s Policy

Those holding safety and environmental protection delegations shall ensure that, in procuring or supporting Products, Systems, or Services, they conform to the Secretary of State’s Health, Safety, and Environmental Protection Policy Statement.

Clause 1.2 Instructions

The instructions defined in ASEMS – Part 2 Instructions, Guidance, and Support shall be used to manage safety and environmental impact within the Enterprise.

Clause 1.3 Duty Holders

Duty Holders shall be appointed and Letters of Delegation issued in accordance with the Enterprise Chief Executive Officer’s Organisation and Arrangements.

Clause 1.4 Interfaces

Interfaces between organizations shall be identified so that risks across them can be appropriately managed and effectively communicated.

Clause 1.5 Data and Record Format

Data shall be maintained in a format that satisfies the reporting requirements of senior management within the Enterprise. Auditable records shall be made and kept under review in accordance with relevant legislation.

Clause 1.6 Significant Occurrences and Fault Reporting

All Delivery (Project) Teams shall record and report significant Product, System, or Service faults, accidents, incidents, and near misses to the Enterprise Safety, Health & Environment Committee through the Quality, Safety, and Environmental Protection Team.

Clause 1.7 Learning From Experience

Business Units, Delivery (Project) Teams, or equivalents shall ensure accidents and incidents are investigated to identify opportunities to reduce the likelihood and impact of recurrence. Lessons learned shall be shared amongst all relevant stakeholders to maximize benefit.

Clause 1.8 Training

Enterprise-sponsored courses for system safety and environmental protection shall be the recognized route for achieving suitable and sufficient competence throughout the Enterprise.

Part 2 – Management Responsibilities

Management responsibilities for safety and environmental protection permeate through every Clause, and are the heart of any successful safety and environmental management system; however, these Clauses confer specific requirements upon management and make compliance easier to measure.

Clause 2.1 Organisation and Arrangements

Business Unit Directors or equivalent shall document their Organisation and Arrangements that shall communicate their commitment to the Secretary of State for Defence’s policy statement, continual improvement, positive safety and environmental culture, to minimize adverse effects on the environment, and comply with legal and other appropriate requirements.

Clause 2.2 Communication

Business Units, Delivery (Project) Teams, or equivalents shall ensure that communication procedures are implemented that provide an effective flow of safety and environmental protection information upwards, downwards, and across their organization.

Clause 2.3 Organisational Change Management

Business Unit Directors or equivalent shall identify any increased safety risk associated with organizational change and manage it appropriately.

Part 3 – Safety and Environmental Management System

These Clauses place specific requirements upon organizations and individuals and represent the minimum requirements for a safety and environmental management system. They include the requirement to plan for safety and environmental protection, to enact that plan, to check that the plan is working, and to make changes where necessary to improve the system.

Clause 3.1 Safety and Environmental Management System

Business Units, Delivery (Project) Teams, or equivalents shall operate in compliance with established Safety and Environmental Management Systems.

Clause 3.2 Safety and Environmental Management Plan

Business Units or equivalent shall ensure that all Products, Systems, or Services have a suitable and sufficient through-life safety and environmental management plan.

Clause 3.3 Stakeholder Agreements

Agreements between Stakeholders shall define and document system safety and environmental protection responsibilities.

Clause 3.4 Availability of Resources

Business Units, Delivery (Project) Teams or equivalents shall ensure the availability of resources necessary to establish, implement and maintain the safety and environmental management system and detail these in a through-life safety and environmental management plan.

Clause 3.5 Core Element Documentation

Business Units, Delivery (Project) Teams or equivalents shall establish, maintain and retain suitable and sufficient information that describes the core elements of the safety and environmental management system(s), their interaction, and any related documentation.

Clause 3.6 Accountability

Individuals deployed to assignments that require the formal delegation of safety and environmental responsibilities, accountabilities, and authority shall be mapped against, and comply with, the requirements of the Enterprise Acquisition Safety taxonomy.   

Clause 3.7 Monitoring

Business Units, Delivery (Project) Teams or equivalents shall establish, implement and maintain a suitable and sufficient procedure to monitor and measure safety and environmental performance of their safety and environmental management system on a regular basis.

Clause 3.8 Audit Frequency

Compliance with the documented safety and environmental management system shall be verified via audit at planned intervals according to a published schedule, and as required.

Clause 3.9 Internal Audit

At planned intervals commensurate with the risk:

  1. Business Units shall audit the safety and environmental management systems of their Delivery (Project) Teams or equivalents.
  2. Delivery (Project) Teams or equivalents shall audit the safety and environmental management systems of their projects.
  3. The Enterprise Quality, Safety, and Environmental Protection Team or their representative, shall audit the safety and environmental management systems of Business Units and Delivery (Project) Teams.

Clause 3.10 Review

Business Units, Delivery (Project) Teams, or equivalents shall review their safety and environmental management systems, at planned intervals commensurate with the risk, to ensure their continuing suitability, adequacy, and effectiveness.

Part 4 – Safety and Environmental Cases/Assessments

These Clauses set out the requirements that each safety and environmental case/assessment shall satisfy. Defense Regulators may impose requirements additional to those contained in these clauses. Adherence to these Clauses will ensure safety and environmental cases/assessments contain the minimum evidence necessary to support safety and environmental arguments that Products, Systems, and Services are safe to use.

Clause 4.1 Safety Cases

Delivery (Project) Teams or equivalents shall establish and maintain through-life safety cases that provide a compelling, comprehensible, and valid argument that a Product, System, or Service is safe for a given application in a given operating environment.

Clause 4.2 Environmental Cases

Delivery (Project) Teams or equivalents shall establish and maintain through-life environmental cases that provide a compelling, comprehensible, and valid argument that the environmental impact of a Product, System or Service is reduced, or Best Practicable Environmental Option (BPEO) is applied.

Clause 4.3 Identification of Legislation and other Requirements

Business Units or equivalent shall establish and maintain a procedure for identifying and accessing the relevant safety and environmental legislative and other requirements that are applicable to their projects.

Clause 4.4 Legislation Compliance and other Requirements

Delivery (Project) Teams or equivalents shall establish, and demonstrate compliance with, relevant legislation and other requirements.

Clause 4.5 Environmental Impact Identification

Business Units, Delivery (Project) Teams or equivalent shall establish, implement and maintain a procedure for the on-going proactive identification of environmental impacts.

Clause 4.6 Safety Hazard Identification

Business Units, Delivery (Project) Teams or equivalent shall establish, implement and maintain a procedure for the on-going proactive identification of safety hazards.

Clause 4.7 Safety and Environmental Objectives and Targets

Business Units, Delivery (Project) Teams or equivalents shall establish and maintain relevant safety and environmental objectives with a resourced programme to achieve targets.

Clause 4.8 Accident and Incident Records

Business Units, Delivery (Project) Teams or equivalent shall monitor and record accidents, incidents and near misses, where the performance of their Product, Systems or Services results in harm to individuals or damage to the environment and use this information to keep their risk assessments valid.

Clause 4.9 Assessment Approval

Safety and environmental case reports shall be personally approved by the individual with formally delegated authority to confirm their acceptance with the progress of the safety case/assessment and of the risks associated with the project.

Clause 4.10 Independent Assurance

Independent review of the Safety and Environmental Management System shall be ensured, as appropriate and commensurate to the risk, by the individual with formally delegated authority for safety and environmental protection.

Part 5 – Risk Management

Risk Management is an essential function of safety and environmental protection and these Clauses reflect that importance. They set both general safety and environmental protection standards and specific Enterprise requirements that support the need for assurance and performance monitoring to the Defence Board. The requirement to refer risks through Line management is included here.

Clause 5.1 Risk and Impact Assessment

All foreseeable Safety Risks and Environmental impacts shall be identified, assessed, prioritised and managed.

Clause 5.2 Change Management

Business Units, Delivery (Project) Teams or equivalents are to ensure that all new or increased safety risks arising from changes to Products, Systems or Services or to their operating environment are managed appropriately.

Clause 5.3 Hierarchy of Controls

Business Units, Delivery (Project) Teams, or equivalent shall adopt a recognized hierarchical approach for achieving a reduction in safety risk and environmental impact.

Clause 5.4 Consultation

Business Units, Delivery (Project) Teams, or equivalent shall ensure that all stakeholders are identified and consulted so that their views and responsibilities are considered when managing safety and environmental risks.

Clause 5.5 Safety Risk

Products, Systems or Services shall not have safety risks that have not been formally assessed, justified and declared to be Tolerable and As Low As Reasonably Practicable (ALARP), unless communicated and accepted by a Duty Holder (DH).

Clause 5.6 Environmental Impact

Significant environmental impacts shall be minimised utilising BPEO.

Clause 5.7 Non-compliance Reporting

In circumstances where the ability of the Delegation Holder to achieve compliance with the requirements of ASEMS may have been compromised, Business Units, Delivery (Project) Teams or equivalents shall take immediate steps to correct the situation. Actions required could include improving the clarity of the authority, instructions or responsibilities provided, increasing resources or correcting deficiencies in practices or procedures. Where resolution of the problem lies outside the control of the Delegation Holder, the issue is to be referred through the line management chain. This requirement is to be applied to any further levels of delegation as necessary.

Clause 5.8 Referral Requirements

Where risks cannot be managed within an individual’s delegated responsibility, the risk shall be formally referred using the Enterprise Risk Referral procedure.

Part 6 – Competence

It is necessary that those involved in safety and environmental protection are suitably qualified and experienced in order for them to perform their roles. These Clauses detail the way that competence is to be captured and assessed.

Clause 6.1 Roles and Responsibilities

Business Units, Delivery (Project) Teams or equivalents shall demonstrate that competence requirements have been established for all roles in accordance with appropriate standards including the Enterprise System Safety & Environmental Protection Competency Maps, Assignment Specifications, and Success Profiles.

Clause 6.2 Suitably Qualified and Experienced Personnel

Business Units, Delivery (Project) Teams or equivalents shall ensure that those engaged in safety and environmental protection are suitably qualified and experienced to discharge their safety and environmental responsibilities.

Clause 6.3 Competence

The competence of all staff with system safety and environmental responsibilities shall be regularly assessed, monitored, and recorded.  Staff with formally delegated system safety and environmental responsibilities shall demonstrate their competence to receive the delegation prior to deployment, and their competence shall be regularly monitored and recorded. 

Safety Management Policy: which clauses will you use?

The Risk Matrix

In this article, I look at The Risk Matrix, a widely used technique in many industries. Risk Matrices have many applications!

In this article, I have used material from a UK Ministry of Defence guide, reproduced under the terms of the UK’s Open Government Licence.

Introduction

A risk matrix is a graphical representation of the various risks associated with a project and its corresponding risk management strategies. It helps to identify and prioritize potential risks.

What is a Risk Matrix?

A safety risk matrix provides a framework for ranking or classifying safety issues according to their significance. The matrix is sometimes called a “hazard ranking matrix” or a “hazard classification matrix”, but strictly speaking it applies to accidents, since these have harmful outcomes, whereas hazards only have the potential for harm. The matrix can be used as a risk screening tool to help decide which issues need treatment first or which need not be considered further at this time.

Risk matrices can cover exposure to different types of loss, including harm to humans, damage to the environment, financial loss or impact on reputation. If a loss in these diverse categories can be considered in common terms (e.g. the monetary impact of all types of loss), then a single matrix can cover all such issues together and prioritize which are the most significant.

The matrix covers a “risk space” defined by the two component parts of risk, namely likelihood on one axis and consequence (or severity) on the other. Each axis must span the full range of outcomes that are considered possible for the system of interest. Each range is divided into a number of categories or bands (typically between 3 and 8) to define the cells of the matrix.

The bands on the two axes may be defined in terms that are purely qualitative, semi-quantitative, or fully quantitative, for example:

  • Qualitative:
    • Likelihood is (Frequent/Reasonably Probable/Remote/Extremely Remote)
    • Severity is (Minor/Significant/Severe/Catastrophic)
  • Semi-quantitative:
    • Likelihood is (e.g. likely to occur once per year on one site)
    • Severity is (e.g. a single death)
  • Quantitative:
    • Likelihood is (e.g. between 1×10⁻⁴ and 1×10⁻⁵ per year on one site)
    • Severity is (e.g. between 1.0 and 10.0 Fatalities and Weighted Injuries)

Each cell of the matrix is assigned an indicator defining the relative significance of issues falling in that zone. This indicator could be:

  • A risk descriptor (e.g. Low, Moderate, High, Very High)
  • A risk score or index (e.g. a number from 1 to 20)
  • A priority category (e.g. High, Medium or Low)
  • A risk class (e.g. A, B, C or D)
  • A measure of expected rate of harm or loss (e.g. 5.4 Fatalities and Weighted Injuries per year or £45,000 per year)

Where likelihood and consequence are stated quantitatively, the axes are usually considered to have logarithmic scales. Adjacent bands will typically differ by one order of magnitude. In this case, lines of constant risk run diagonally across the matrix and the risk will range by a factor of 100 across the area covered by a single cell. This illustrates that the matrix is a coarse tool, which can show large differences in risk, but does not address fine detail, such as compliance with quantitative risk requirements.
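The factor of 100 is easy to check with a little arithmetic. The sketch below takes the quantitative bands from the example list earlier (likelihood between 1×10⁻⁵ and 1×10⁻⁴ per year, severity between 1.0 and 10.0 Fatalities and Weighted Injuries) and treats risk as likelihood multiplied by severity. The function name `cell_risk_range` is simply an illustrative choice.

```python
import math


def cell_risk_range(likelihood_band, severity_band):
    """Each band is a (low, high) pair; return (min_risk, max_risk, ratio)."""
    lowest = likelihood_band[0] * severity_band[0]
    highest = likelihood_band[1] * severity_band[1]
    return lowest, highest, highest / lowest


lowest, highest, ratio = cell_risk_range((1e-5, 1e-4), (1.0, 10.0))

# One decade on each axis gives two decades of risk within a single cell.
print(f"risk spans {lowest:.0e} to {highest:.0e} per year, ratio {ratio:.0f}")
print(f"decades covered: {math.log10(ratio):.1f}")
```

Two issues that land in the same cell can therefore differ in risk by up to two orders of magnitude, which is why the matrix should not be used to demonstrate compliance with a quantitative target.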

To apply the matrix, users must have a list of the relevant safety issues (from Hazard Identification and Hazard Analysis) and estimates of the likelihood and severity of each possible accident (from Risk Estimation). The matrix is therefore a technique for Risk Evaluation, which follows on from Risk Estimation. The estimates of accident likelihood and severity may be generated by different methods, depending on the stage of the project, the information available and the significance of the safety issue being explored. For example, the estimates may come from:

  • Engineering judgement by Subject Matter Experts with knowledge of similar systems
  • Historical data from this or similar systems
  • Detailed modelling (e.g. using Fault Tree Analysis and Event Tree Analysis or Bow-Tie Analysis)

Examples of Risk Matrices

The following example matrices show some of the variations in format, terminology and risk indicators across a range of sectors and standards.

Example 1: IEC 31010 Example risk ranking matrix. Severity on x-axis increasing left to right, likelihood on y-axis increasing bottom to top, with five “risk levels” which are linked to decision rules such as the level of management attention or the time scale by which response is needed.

IEC 31010 Risk Matrix

Example 2: Def Stan 00-56 Issue 2 Example accident risk classification table. Severity on x-axis increasing right to left, likelihood on y-axis increasing bottom to top, four risk classes identify significance and so management level for approval.

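| Likelihood | Catastrophic | Critical | Marginal | Negligible |
| --- | --- | --- | --- | --- |
| Frequent | A | A | A | B |
| Probable | A | A | B | C |
| Occasional | A | B | C | C |
| Remote | B | C | C | D |
| Improbable | C | C | D | D |
| Incredible | C | D | D | D |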
Def Stan 00-56 Issue 2 Example Accident Risk Classification Table
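A matrix like this is, in effect, a lookup table. The minimal sketch below encodes the example classification table as data and looks up the risk class for a likelihood/severity pair. The function and variable names are my own, chosen for illustration.

```python
# Severity columns in the order shown in the example table (left to right).
SEVERITIES = ["Catastrophic", "Critical", "Marginal", "Negligible"]

# Rows copied from the example table: one risk class letter per severity column.
RISK_CLASS_ROWS = {
    "Frequent":   "AAAB",
    "Probable":   "AABC",
    "Occasional": "ABCC",
    "Remote":     "BCCD",
    "Improbable": "CCDD",
    "Incredible": "CDDD",
}


def risk_class(likelihood: str, severity: str) -> str:
    """Look up the risk class (A to D) for a likelihood/severity pair."""
    column = SEVERITIES.index(severity)
    return RISK_CLASS_ROWS[likelihood][column]


print(risk_class("Occasional", "Critical"))
```

Keeping the table as data, rather than burying it in if/else logic, makes it easy to check against the published matrix and to swap in a different one.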

Example 3: IMO Guidelines on FSA. Example hazard risk index matrix. Severity on x-axis increasing left to right, likelihood on y-axis increasing bottom to top. The risk index (RI) in each cell is calculated by adding the Severity Index (SI) for the column and the Frequency Index (FI) for the row, so RI can be considered as log(risk).

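| FI | Frequency | SI 1: Minor | SI 2: Moderate | SI 3: Serious | SI 4: Catastrophic |
| --- | --- | --- | --- | --- | --- |
| 7 | Frequent | 8 | 9 | 10 | 11 |
| 6 | | 7 | 8 | 9 | 10 |
| 5 | Reasonably probable | 6 | 7 | 8 | 9 |
| 4 | | 5 | 6 | 7 | 8 |
| 3 | Remote | 4 | 5 | 6 | 7 |
| 2 | | 3 | 4 | 5 | 6 |
| 1 | Extremely remote | 2 | 3 | 4 | 5 |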
IMO Guideline on FSA: Risk Ranking Matrix
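To see why adding the two indices behaves like log(risk), here is a small sketch. It assumes that each index step corresponds to one order of magnitude. The two offsets that convert an index into a frequency or a severity are illustrative assumptions for this example. Only the relationship RI = FI + SI comes from the matrix itself.

```python
import math

# Assumed calibration (illustrative): each index step is one order of magnitude.
FI_OFFSET = 6   # assumes frequency = 10 ** (FI - 6) events per ship-year
SI_OFFSET = 3   # assumes severity = 10 ** (SI - 3) equivalent fatalities


def risk_index(fi: int, si: int) -> int:
    """Risk index as used in the matrix: RI = FI + SI."""
    return fi + si


def log10_risk(fi: int, si: int) -> float:
    """log10(frequency x severity) under the assumed calibration."""
    frequency = 10.0 ** (fi - FI_OFFSET)
    severity = 10.0 ** (si - SI_OFFSET)
    return math.log10(frequency * severity)


for fi, si in [(7, 4), (3, 2), (1, 1)]:
    ri = risk_index(fi, si)
    # RI differs from log10(risk) only by the constant FI_OFFSET + SI_OFFSET.
    assert math.isclose(log10_risk(fi, si), ri - (FI_OFFSET + SI_OFFSET))
    print(f"FI={fi}, SI={si}, RI={ri}")
```

Because multiplying frequency by severity becomes simple addition on a log scale, cells with the same RI lie on a diagonal line of constant risk.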

Example 4: ISO 17776 Offshore Sector Example risk matrix. Severity on y-axis increasing top to bottom, likelihood on x-axis increasing left to right, matrix areas define future action to be taken.

ISO 17776 Risk Matrix

Risk Matrix Assessment

When it Might be Used

The matrix is usually set up at an early stage of the lifecycle, defining the framework to be used for risk evaluation at subsequent stages. It should be used early in the lifecycle to provide a coarse sift of the identified safety issues so that attention can be focused on the most significant ones. This attention may involve more detailed analysis to understand complex accident sequences and to apply semi-quantitative or fully quantitative risk assessment techniques where appropriate.

Later in the lifecycle, the risk matrix may be used for determining the appropriate management level for review and acceptance of each safety issue. This ensures that the key risk drivers are brought to the attention of senior managers but they are not swamped with masses of information on less significant matters.

During the in-service stage of the lifecycle, the risk matrix technique can be applied to give an indication of significance for new safety concerns, such as those revealed by incidents or due to proposed design changes. Risk monitoring can be focused on the issues of highest significance as well as targeting resources for risk reduction.

Advantages & Disadvantages

Advantages

  • Risk matrices provide a quick appreciation of the most significant issues so that attention can be focused where it will have most benefit.
  • Matrices provide a visual representation which is easily understood and so aids communication with non-specialists.
  • Risk matrices can cover impacts which are different in nature (e.g. harm to people, harm to the environment, material or financial loss), provided that these can be equated in common units (e.g. in money terms).

Disadvantages

  • Risk matrices are good for examining different issues affecting one system or activity on the basis of their risk relative to each other. They are not effective for understanding absolute risk.
  • There is no single, correct interpretation of the level at which “safety issues” should be selected for presentation on the risk matrix. This means that different analysts may choose different levels and the resulting list of prioritised issues is somewhat subjective. The apparent results may be changed by “accident splitting” (i.e. defining one safety issue as two or more different accidents, each of which will appear to have lower risk).
  • Risk matrices consider safety issues one at a time and so do not help understanding the overall or aggregate risk exposure.
  • When a variety of different outcomes is possible from a single issue (e.g. fire – consequences can range from no harm to multiple deaths) it can be difficult to choose which likelihood and consequence combination should be used.
  • As a broad-brush technique, risk matrices should not be used for considering whether quantitative risk targets have been met or as the only technique for examining complex or high consequence issues. The matrix can, however, highlight high consequence issues so that they then receive more detailed consideration.
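The "accident splitting" and aggregate-risk points above can be shown with a few lines of arithmetic. This is a sketch using a hypothetical likelihood banding (one band per order of magnitude) and made-up rates. It does not represent any particular standard's matrix.

```python
import math


def likelihood_band(rate_per_year: float) -> int:
    """Hypothetical banding: band n covers 10**n <= rate < 10**(n + 1)."""
    return math.floor(math.log10(rate_per_year))


whole_issue = 1.5e-3              # one safety issue, events per year (made-up value)
split_parts = [7.5e-4, 7.5e-4]    # the same issue recorded as two separate accidents

band_whole = likelihood_band(whole_issue)
bands_split = [likelihood_band(rate) for rate in split_parts]
total_split = sum(split_parts)

print("Band for the whole issue:", band_whole)
print("Bands after splitting:   ", bands_split)
print("Total rate after split:  ", total_split)
```

Each half lands one band lower than the original issue, so it looks less significant on the matrix, yet the total rate is unchanged. The matrix, which looks at one entry at a time, does not reveal this.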

Risk Matrices for Project Management

In project management, we are aiming for specific outcomes, often represented as the project management triangle.

Project Management Triangle

In the center is quality (and/or safety), which is central to indicate that this cannot be compromised.  The three corners are cost, time, and scope (or requirements), and these can be traded off against each other.

This representation helps us to identify project risks by the effect that they might have on the project’s objectives.  ISO 31000 defines risk as “the effect of uncertainty on objectives”.  Again, the risk matrix allows us to identify and rank risks, identifying the biggest, most critical risks.  These risks are where we will focus most attention, looking for multiple controls, or defense-in-depth, for the most serious ones.   

An old saying is that “you can have a quick job, a proper job, or a cheap job; you can have two out of three, but you can’t have all three.”  Taken literally this is a little pessimistic, but it does remind us that if we set an absolute target on one of these axes, then we will likely have to trade the other two off against each other.   

This axiom also gives us some basic principles on which to identify controls.  We might desire controls that allow us to achieve all objectives at the same time, but this is often unrealistic.  Practical experience – encoded in a saying – suggests that we must be prepared to accept some trades in budget/schedule/scope.

Thus the risk matrix, in combination with some basic project management principles, enables more realistic decision-making.  (Real decisions involve saying ‘no’ to some things in order to say ‘yes’ to others.)  Rather than naively thinking that we can have it all, the risk matrix supports robust early decision-making.

This should make project success more likely – until somebody changes the objectives!

Additional Considerations

Risk matrices from different standards and industry sectors are not always represented in the same way. The most common convention has a Cartesian representation (i.e. values increasing left to right and bottom to top on the two axes) so that risk increases from bottom left to top right, but the examples above show that several common matrices have a different format.

If risk estimates are generated by a team of Subject Matter Experts, their deliberations can be biased (consciously or unconsciously) if they know the risk matrix framework. There may be a tendency to choose likelihood and/or severity estimates that result in a lower apparent risk so that it attracts less management scrutiny.

Uncertainty of the estimates of severity and likelihood can be represented on a risk matrix by showing that risk with error bars rather than a single point. This can help understanding by senior managers.
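One simple way to picture "error bars" on a qualitative matrix is to record a lower and upper estimate on each axis and list every cell the risk could fall in. The sketch below does this using the qualitative category names from earlier in this post. The function name and the example estimate are hypothetical.

```python
# Category scales, ordered from lowest to highest.
LIKELIHOODS = ["Extremely Remote", "Remote", "Reasonably Probable", "Frequent"]
SEVERITIES = ["Minor", "Significant", "Severe", "Catastrophic"]


def cells_covered(likelihood_range, severity_range):
    """Return every (likelihood, severity) cell spanned by the two ranges."""
    low_l = LIKELIHOODS.index(likelihood_range[0])
    high_l = LIKELIHOODS.index(likelihood_range[1])
    low_s = SEVERITIES.index(severity_range[0])
    high_s = SEVERITIES.index(severity_range[1])
    return [
        (likelihood, severity)
        for likelihood in LIKELIHOODS[low_l:high_l + 1]
        for severity in SEVERITIES[low_s:high_s + 1]
    ]


# Hypothetical estimate: the experts agree on severity but not on likelihood.
cells = cells_covered(("Remote", "Reasonably Probable"), ("Severe", "Severe"))
print(cells)
```

A risk that spans several cells, particularly cells with different risk classes, is a prompt for more analysis before a decision is made.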

Using common matrices for different systems does not necessarily result in risk estimates that can be compared in a meaningful way. The systems may have diverse risk exposure factors (e.g. number of people exposed, usage rate) and different numbers and types of accidents to consider.

(For more on risk management, see the FAQ.)

Do You Use a Risk Matrix in Your Work?


Risk: Averse, Adverse, or Appetite?

You heard me right. Risk: Averse, Adverse, or Appetite? Which would you choose? Do we even have a choice? Read on …

We often hear that we live in a risk-averse society.  By that, I mean that we don’t want to take risks, or that we’re too timid.  I don’t think that’s the whole story.

In reality, we need to deal with several concepts.  Let’s start by looking at risk:

  • Aversion;
  • Adversity;
  • Appetite; and then
  • Perception.

Risk Adverse versus Risk Averse

These terms are often used incorrectly, so here’s a useful comparison:

“Many people are confused when faced with the choice between adverse and averse.  While these two adjectives have many similarities, they are not used interchangeably.
If you want to describe a negative reaction to something (such as a harmful side effect from medication) or dangerous meteorological conditions (such as a snowstorm), adverse is the correct choice. You would not say that you had an ‘averse’ reaction to medication or that there was ‘averse’ weather.
In short, adverse tends to be used to describe effects, conditions, and results; while averse refers to feelings and inclinations.”[1]

Merriam-Webster Dictionary

Risk Adverse

A Formal Definition of Adverse

Again, the Merriam-Webster Dictionary sails to the rescue:

  • “1: acting against or in a contrary direction:
    • HOSTILE,
    • hindered by adverse winds
  • 2a: opposed to one’s interests,
    • an adverse verdict,
    • heard testimony adverse to their position,
    • especially: UNFAVORABLE,
    • adverse criticism
  • b: causing harm: HARMFUL, adverse drug effects
  • 3: archaic: opposite in position”[2]

This is all very well, but we need something that we can use, like a…

…Practical Definition of Risk Adverse

The Law Insider website provides a very useful definition of ‘Risk Adverse’.   

“Adverse Risk means any risk of an adverse effect on the Development, procurement or maintenance of Regulatory Approval, Manufacture or Commercialization of a Product.”[3]

Law Insider

It’s useful because it is so pertinent to safety.  Let me explain. Often, we want to develop a product or service, but there are:

  • Development risks – often called Project Management risks, as a development is often the focus of a project.  Remember that ISO 31000 defines risk as “the effect of uncertainty on objectives”.  By definition, a project has specific objectives (e.g., budget, schedule, and quality).
  • Procurement risks – when acquiring a new product or service, an enterprise may also acquire development risks for the new or upgraded thing.  There are also risks associated with contractual acceptance, fielding the product, etc.
  • Regulatory approval risks – in many industries and domains, regulatory approval may be needed.  This may require qualification, certification, or accreditation (or a combination thereof).
  • Commercialization risks include making a product commercially viable, positioning it in the market, and gaining user and/or public acceptance.     

Each one of these topics is a massive subject, about which countless books have been written.  Law Insider’s definition is very powerful!

Risk Averse

So, risk aversion is about feelings and inclinations.  This is such a familiar topic, that perhaps we don’t bother to explore it. Later on in this post, we will explore Risk Aversion by looking at Risk Perception.

Before we do that, let’s look at the opposite of Risk Aversion.

Risk Appetite

“Risk appetite is the level of risk that an organization is prepared to accept in pursuit of its objectives, before action is deemed necessary to reduce the risk. It represents a balance between the potential benefits of innovation and the threats, that change inevitably brings. The ISO 31000 risk management standard refers to risk appetite as the “Amount and type of risk that an organization is prepared to pursue, retain or take”. This concept helps guide an organization’s approach to risk and risk management.”[4]

Wikipedia

Risk appetite is a really interesting concept.  The definition is that risk appetite is the level of risk that a person or organization is prepared to accept in pursuit of objectives. 

Why is Risk Useful?

Risk is necessary because we need to take risks to do almost anything. Every time we breathe in, every time we eat or drink something, we’re taking a risk.

It’s the same for businesses, enterprises, and nations.  If we keep on doing the same old thing again and again, eventually someone else will come along and outcompete us.  Ironically, the risk is that we fail to adapt and cease to exist – Darwinian selection. 

A great example of this is the Kodak corporation.  For years Kodak dominated the photography market.  However, they failed to see the promise of digital photography and didn’t take advantage of it. They were overtaken by rivals, and in the end, this mighty corporation filed for bankruptcy in 2012.

So to ensure the survival of an entity, we must accept change, we must take risks. This seems to be true of populations, businesses – even software programs seem to illustrate this kind of evolutionary development [5].

Quantifying Risk and Appetite

In some areas of business, it’s easy to define risk appetite.  Financial corporations can easily define how much loss they are prepared to accept.  They can accept that a certain percentage of turnover or profit will be lost to fraud or error. 

A more sophisticated business might quantify the benefit of taking risks.  For example, lending more money might result in greater profits.  If a business understands the relationship between risk and opportunity, it can exploit it.
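As a minimal sketch of what "defining risk appetite" might look like in the financial case, the example below expresses appetite as a fraction of turnover and checks an expected loss against it. The figures and the function name are hypothetical. A real organization would set its own measure and threshold.

```python
def within_appetite(expected_loss: float, turnover: float, appetite_fraction: float) -> bool:
    """True if the expected loss is no more than the accepted fraction of turnover."""
    return expected_loss <= appetite_fraction * turnover


# Hypothetical figures: accept losing up to 0.5% of turnover to fraud or error.
turnover = 10_000_000.0
appetite_fraction = 0.005

print(within_appetite(42_000.0, turnover, appetite_fraction))   # inside the appetite
print(within_appetite(60_000.0, turnover, appetite_fraction))   # outside the appetite
```

The value of writing the appetite down as a number is that "action is deemed necessary" becomes a test that can be applied consistently, rather than a matter of mood.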

Too Big to Fail

A few years ago we saw the downside of that thinking.  Organizations thought they were too big to fail or too clever – they couldn’t go wrong.  Some high-profile failures led to a domino effect, whereby many institutions effectively collapsed.  This was the Global Financial Crisis.

As a result, the regulation of lenders was tightened up.  Banks and similar bodies were forced to keep higher reserves of cash and assets in order to survive miscalculations of risk.

How Much Risk is Enough?

So, how can we determine an appropriate risk appetite, without over-reaching ourselves?

This is a particularly difficult judgment when considering safety. Now we are not trading $ for $, we are trading dollars for injury and even death.  This is a much more difficult ethical problem.  There are various ways of making this judgment; for example, in Australia we can refer to Safe Work Australia’s guidance.

In this article, we will consider what leads us to a distorted perception of risk. 

Risk Perception

Some researchers claim that there are three factors that cause us to look at risk and misunderstand it.

Psychometric research identified a broad domain of characteristics that may be condensed into three high order factors: 1) the degree to which a risk is understood, 2) the degree to which it evokes a feeling of dread, and 3) the number of people exposed to the risk. A dread risk elicits visceral feelings of terror, uncontrollable, catastrophe, inequality, and uncontrolled. An unknown risk is new and unknown to science. The more a person dreads an activity, the higher its perceived risk and the more that person wants the risk reduced.[6]

Wikipedia

I have observed that people are ready to take more risks when they think they are in control.  For example, we’re more willing to take risks when driving, rather than in trains or planes where someone else is in control. 

It’s interesting to recall that our risk of death per journey is the same in a car as it is in a plane.  Moreover, we are three times more likely to be injured in a car crash than in an air crash.  Yet, people worry about flying, but they don’t think about the car journey to get to the airport. 

Therefore, if we are to think rationally about risk, we must address those three factors of risk perception – and control. 

Three Risk Perception Factors

First, we must understand risk.  Risk assessment helps us to do this and can help us make objective decisions.

Second, we must recognize feelings of dread, for example, fear of radiation.  We must strive to understand the mechanisms that give rise to risks so that we can understand how to treat or control them. This should give us confidence, which will counteract dread.

(Also, we might explicitly identify the benefits of the risky activity.  This should help us to deal with dread rationally.) 

Third, we must estimate the number of people exposed to the risk.  Accidents with multiple casualties cause Societal Concern and get a lot of media attention, whereas the constant background of individual casualties in car accidents goes largely unreported.

Let’s Look at Control 

We often have the illusion that we are in control, and that this will prevent accidents.

The night I had my most serious car accident, I was hit by a drug/drunk driver.  I had not lost control of my vehicle and I had done nothing wrong.  However, when the other car turned into my path, I could not avoid the collision.

We need to give people a realistic view of how much they really control. 

If we can give people control, without real adverse effects, then so much the better.  Either that or take away control completely and make sure that users know this.

Many fatalities have resulted from users misunderstanding how much control they had – for example over ‘self-driving’ cars.  

Outrage 

All these factors are challenging to deal with.  Moreover, there are a number of agents using social media to stoke and exploit public outrage. This is done for various purposes, which may have nothing to do with actual levels of risk (i.e. it may not be a genuine societal concern).

Perhaps we can learn from those who manage outrage for enterprises that need it?  

They work to actively and regularly present a rational view of risks and benefits.  This is intended to counter the sensationalist reporting that will arise from time to time.  Think of it as a regular vaccine of rationality against periodic outbreaks of emotional outrage.   

Risk: Averse, Adverse, or Appetite? Conclusion

Of course, there are no guaranteed solutions or magic answers to these questions.

We will always have a subjective and visceral reaction to danger.  This is a good thing, essential even.  It’s a very important survival skill, and we should be afraid of things that can hurt us.

Yet, to live without risk at all is simply not possible – we will all die of something.  Will we achieve something meaningful before that dread day comes?

To do anything requires us to take risks.  As individuals, as a society, we need to take risks to enjoy the benefits that result.  “Great empires are not maintained by timidity” as a Roman historian once said[7].  

As in so many things, we are looking for a balance. 

How much risk-aversion do you need to survive, versus how much risk appetite to thrive?

(For more on risk management, see the FAQ.)


[1] https://www.merriam-webster.com/dictionary/averse#

[2] https://www.merriam-webster.com/dictionary/adverse

[3] https://www.lawinsider.com/dictionary/adverse-risk

[4] https://en.wikipedia.org/wiki/Risk_appetite

[5] Les Hatton & Greg Warr, Conservation of Information in Proteins, Software, Music, Texts, the Universe and Chocolate Boxes, Heiland Lecture, Colorado School of Mines, 06 Mar 2018.

[6] https://en.wikipedia.org/wiki/Risk_perception

[7] https://www.goodreads.com/quotes/313217-great-empires-are-not-maintained-by-timidity


FAQ on Risk Management

In this FAQ on Risk Management, I will point you to some lessons where you will get some answers to basic questions.

Lessons on this Topic

Welcome to Risk Management 101, where we’re going to go through these basic concepts of risk management. We’re going to break it down into the constituent parts and then we’re going to build it up again and show you how it’s done.

So what is this risk analysis stuff all about? What is ‘risk’? How do you define or describe it? How do you measure it? In Risk Basics I explain the basic terms.

Risk Analysis Programs – Design a program for any system in any application. You’ll be able to:

  • Describe fundamental risk concepts;
  • Define what a risk analysis program is;
  • and much more…

If you don’t find what you want in this FAQ on Risk Management, there are plenty more lessons under Start Here and System Safety Analysis topics. Or just enter ‘risk’ into the search function at the bottom of any page.



Safety Concepts Part 1

In this ‘Safety Concepts Part 1’ Blog post, The Safety Artisan looks at the meaning of the term “safe”. I look at an objective definition of safe – objective because it can be demonstrated to have been met.

This fundamental topic provides the foundation for all other safety topics, and it isn’t complex. The basics are simple, but they need to be thoroughly understood and practiced consistently to achieve success.

System Safety Concepts – highlights.

Safety Concepts Part 1: Topics

  • A practical (useful) definition of ‘safe’:
    • What is risk?
    • What is risk reduction?
    • What are safety requirements?
  • Scope:
    • What is the system?
    • What is the application (function)?
    • What is the (operating) environment?

Safety Concepts Part 1: Transcript

Hi everyone and welcome to the Safety Artisan, where you will find professional, pragmatic, and impartial advice. Whether you want to know how safety is done or how to do it, I hope you’ll find today’s session helpful.

It’s the 21st of September 2019 as I record this. Welcome to the show. So, let’s get started. We’re going to talk today about System Safety concepts. What does it all mean?  We need to ask this question because it’s not obvious, as we will see.

If we look at a dictionary definition of the word ‘safe’, it’s an adjective: to be protected from or not exposed to danger or risk. Not likely to be harmed or lost. There are synonyms – protect, shield, shelter, guard, and keep out of harm’s way. They’re all good words, and I think we all know what we’re talking about. However, as a definition, it’s too imprecise. We can’t objectively say whether we have achieved safety or not.

A Practical Definition of ‘Safe’

What we need is a better definition, a more practical definition. I’ve taken something from an old UK Defence Standard. Forget about which standard, that’s not important. It’s just that we’re using a consistent set of definitions to work through basic safety concepts. And it’s important to do that because different standards come from different legal systems and they have different philosophies. So, if you start mixing standards and different concepts together, that doesn’t always work.

OK so whatever you do, be consistent. That’s the key point. We’re going to use this set of definitions from the UK Defence Standard because they are consistent.

In this standard, ‘safe’ means: “Risk has been demonstrated to have been reduced to a level that is ALARP, and broadly acceptable or tolerable. And relevant prescriptive safety requirements have been met. For a system, in a given application, in a given Operating Environment.” OK, so let’s unpack that.

System Safety – Risk

So, we start with risk. We need to manage risk. We need to show that risk has been reduced to an acceptable level. As required perhaps by law, regulation, or a standard. Or just good practice in a particular industry. Whatever it is, we need to show that the risk of harm to people has been reduced. Not just any old reduction, we need to show that it’s been reduced to a particular level. Now in this standard, there are two tests for that.

And they’re both objective tests. The first one says as low as reasonably practicable. Basically, it’s asking whether all reasonably practicable risk reduction measures have been taken. So that’s one test. And the second test is a bit simpler. It’s basically saying reduce the absolute level of risk to something that is tolerable or acceptable. Now don’t worry too much about precisely what these things mean. The purpose of today is to note that we’ve got an objective test to say that we’ve done enough.

System Safety – Requirements

So that’s dealt with risk. Let’s move on to safety requirements. If a requirement is relevant, then we need to apply it. If it’s prescriptive, if it says you must do this, or you must do that, then we need to meet it. There are two separate parts to this ‘Safe’ thing: we’ve got to meet requirements; and, we’ve got to manage risk. We can’t use one as an excuse for not doing the other.

So just because we reduce risk until it’s tolerable or acceptable doesn’t mean that we can ignore safety requirements. Or vice versa. So those are the two key things that we’ve got to do. But that’s not actually quite enough to get us there. Because we’ve got to define what we’re doing, with what, and in what context. Well, we’re reducing the risk of a system. And the system might be a physical thing.

Defining the Scope: The System

It might be a vehicle, an airplane, a ship, or a submarine, it might be a car or a truck. Or it might be something a bit more intangible. It might be a computer program that we’re using to make decisions that affect the safety of human beings, maybe a medical diagnosis system. Or we’re processing some scripts or prescriptions for medicine and we’ve got to get it right. We could poison somebody. So, whether it’s a tangible or an intangible system, we need to define it.

And that’s not as easy as it sounds, because if we’re applying system safety, we’re doing it because we have a complex system. It’s not a toaster. It’s something a bit more challenging. Defining the system carefully and precisely is really important and helpful. So, we define what our system is, our thing, or our service. The system. What are we doing with it? What are we applying it to?

Defining the Scope: The Application

What are we using it for? Now, just to illustrate that no standard is perfect. Whoever wrote that defense standard didn’t bother to define the application. Which is kind of a major stuff-up to be honest, because that’s really important. So, let’s go back to an ordinary dictionary definition just to get an idea of what it means. By the way, I checked through the standard that I was referring to, and it does not explain what it means by the application; otherwise, I would use that by preference.

But if we go back to the dictionary, we see application: the act of putting something into operation. OK, so, we’re putting something to use. We’re implementing it, employing it, or deploying it; maybe we’re utilizing it, applying it, executing it, enacting it. We’re carrying it out, putting it into operation, or putting it into practice. All useful words that help us to understand.

I think we know what we’re talking about. So, we’ve got a thing or a service. Well, what are we using it for? Quite obviously, you know a car is probably going to be quite safe on the road. Put it in water and it probably isn’t safe at all. So, it’s important to use things for their proper application, for the use for which they were designed. And then, kind of harking back to what I just said, there’s the correct operating environment for this system, and for the application to which we will put it.

Defining the Scope: The Operating Environment

So, we’ve got a thing that we want to use for something. What’s the operating environment in which it will be safe? What is it qualified or certified for? What’s the performance envelope that it’s been designed for? Typically, things work pretty well within the operating environment, within the envelope for which they were designed. Take them outside of that envelope and they perform not so well, maybe not at all.

You take an airplane too high and the air is too thin, and it becomes uncontrollable. You take it too low and it smashes into the ground. Neither outcome is particularly good for the occupants of the airplane, or whoever happens to be underneath it when it hits the ground. All three of those things have to be defined: What is the system? What are we doing with it? And where are we doing it? Otherwise, we can’t really say that risk has been dealt with, or that safety requirements have been met.
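To pull the pieces of the definition together, here is a rough sketch that treats ‘safe’ as a checklist: both risk tests and the prescriptive requirements must be met, and the claim only makes sense once the system, application, and operating environment have been stated. The class and field names are my own shorthand for illustration. They are not taken from the standard.

```python
from dataclasses import dataclass


@dataclass
class SafetyClaim:
    system: str                             # what the system is
    application: str                        # what we are using it for
    operating_environment: str              # where, and under what conditions
    risk_is_alarp: bool                     # all reasonably practicable measures taken
    risk_is_tolerable_or_acceptable: bool   # absolute level of risk is low enough
    prescriptive_requirements_met: bool     # relevant "you must" requirements met

    def scope_defined(self) -> bool:
        """The claim is only meaningful if all three scope items are stated."""
        scope = (self.system, self.application, self.operating_environment)
        return all(item.strip() for item in scope)

    def is_safe(self) -> bool:
        """Every part of the definition must hold; none excuses another."""
        return (
            self.scope_defined()
            and self.risk_is_alarp
            and self.risk_is_tolerable_or_acceptable
            and self.prescriptive_requirements_met
        )


car_on_road = SafetyClaim("car", "carrying passengers", "public roads", True, True, True)
car_no_scope = SafetyClaim("car", "carrying passengers", "", True, True, True)
print(car_on_road.is_safe(), car_no_scope.is_safe())
```

The second claim fails even though every risk and requirement box is ticked, because without a stated operating environment there is nothing to demonstrate those boxes against.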

System Safety: Why Bother?

So, we’ve spent several slides just talking about what safe means, which might seem a bit over the top. But I promise you it is not, because having a solid understanding of what we’re trying to do is important in safety. Because safety is intangible. So, we need to understand what it is we’re aiming for. As Seneca, the Roman philosopher, said about two thousand years ago: “If you don’t know to which port you are bound, then no wind is favorable.”

It’s almost impossible to have a satisfactory Safety Program if you don’t know what you’re trying to achieve. Whereas, if you do have a precise understanding of what you’re trying to achieve, you’ve got a reasonably good chance of success. And that’s what it’s all about.

Copyright

Well, I’ve quoted you some information from a UK government website. And I’ve done so in accordance with the terms of its Creative Commons license. More information about the terms of that can be found on this page.

End: Safety Concepts Part 1

If you want more, if you want to unpack all the Major Definitions, all the system safety concepts that we’re talking about, then there’s the second part of this video, which you can see here.

I hope you enjoy it. Well, that’s it for the short video, for now. Please go and have a look at the longer video to get the full picture. OK, everyone, it’s been a pleasure talking to you and I hope you found that useful. I’ll see you again soon. Goodbye.

Back to the Start Here Page.

Meet the Author

Learn safety engineering with me, an industry professional with 25 years of experience. I have:

  • Worked on aircraft, ships, submarines, ATMS, trains, and software;
  • Worked on programs from tiny to some of the biggest (Eurofighter, Future Submarine);
  • Worked in the UK and Australia, on US and European programs;
  • Taught safety to hundreds of people in the classroom, and thousands online;
  • Presented on safety topics at several international conferences.