Common Procedural Execution Failure Modes during Abnormal Situations

Similar documents
Common Procedural Execution Failure Modes during Abnormal Situations

The Personal Profile System 2800 Series Research Report

STPA-based Model of Threat and Error Management in Dual Flight Instruction

Therapy symposium Formal Radiation Therapy Safety Processes

REQUIREMENTS SPECIFICATION

System Theoretic Process Analysis of a Pharmacovigilance Signal Management Process

Risk-based QM for Incorrect Isocenter at Day 1 Setup. TG 100 risk based QM development Process Mapping

Risk Ranking Methodology for Chemical Release Events

Top 5 Pitfalls of Spirometry And How to Avoid Them

Investigating Display-Related Cognitive Fatigue in Oil and Gas Operations

Use of the deterministic and probabilistic approach in the Belgian regulatory context

Prevention of Medical Errors: Concepts and Tools. Perry Johnson, PhD April 30, 2016

Achieving Effective and Reliable NDT in the Context of RBI

Committed to Environment, Health and Safety

International Journal of Computer Engineering and Applications, Volume XI, Issue IX, September 17, ISSN

BETTER TOGETHER 2018 ATSA Conference Thursday October 18 10:30 AM 12:00 PM

September audit deficiency trends. acuitas, inc. concluding thoughts. pcaob s audit deficiencies by audit standard.

Introduction to Applied Research in Economics

Aseptic Process Simulation (APS) Program Risk Assessment

A Comparison of Baseline Hearing Thresholds Between Pilots and Non-Pilots and the Effects of Engine Noise

Using the CFS Infrastructure

Lauren S. Tashman, Ph.D NE Second Avenue, Miami Shores, FL

An Approach to Applying. Goal Model and Fault Tree for Autonomic Control

Adaptation in the development context

17. Utilization of the Physical Therapist Assistant to Assist the Physical Therapist With Patient Screens

Modeling Terrorist Beliefs and Motivations

Getting the Payoff With MDD. Proven Steps to Get Your Return on Investment

Expert judgements in risk and reliability analysis

Formats for Section Safety Messages in Printed Manuals

Lewis & Clark National Estimation and Awareness Study

Report Reference Guide

Overview Internal review

Massachusetts Institute of Technology Engineering Systems Division

FDA issues final guidance on benefit-risk factors to consider in medical device product availability, compliance, and enforcement decisions

TG100 Implementation Guide - FTA

Ambiguous Data Result in Ambiguous Conclusions: A Reply to Charles T. Tart

People-Based Safety TM Embracing Empowerment, Ownership, and Interpersonal Trust

Validating Measures of Self Control via Rasch Measurement. Jonathan Hasford Department of Marketing, University of Kentucky

COGNITIVE ERGONOMICS IN HIGHLY RELIABLE SYSTEMS

How do Categories Work?

John Quigley, Tim Bedford, Lesley Walls Department of Management Science, University of Strathclyde, Glasgow

Chapter 11: Behaviorism: After the Founding

PRIVELEGED, NON-CLASSIFIED CARVER + SHOCK PRIMER

Social Determinants of Health

CHAPTER 3 RESEARCH METHODOLOGY

Modelling Research Productivity Using a Generalization of the Ordered Logistic Regression Model

Operator monitoring in a complex dynamic work environment: a qualitative cognitive model based on field observations

GEX Recommended Procedure Eff. Date: 09/21/10 Rev.: D Pg. 1 of 7

Evaluation of Linguistic Labels Used in Applications

Planning For The FDA s 'Deeming Rule' For E- Cigarettes

The Canadian National System for Incident Reporting in Radiation Treatment (NSIR-RT) Taxonomy March 11, 2015

Midterm #1. 20 points (+1/2 point BONUS) worth 15% of your final grade

Safety and Health Sign Colors and Signal Words for Communicating Information on Likelihood and Imminence of Threat

Slip on Catalyst Spill

A STEP IN THE RIGHT DIRECTION: Presented By. Part I Problems Underfoot EFFECTIVE SIDEWALK MAINTENANCE MANAGEMENT. It Didn t Happen Overnight

Establishing Special and Common Cause Variation

10/26/15 STRESS MANAGEMENT PRESENTER INDIVIDUAL & ORGANIZATIONAL CONSIDERATIONS. James Hunter. Director, Employee Assistance Program

An Agent-Based Diagnostic System Implemented in G2

Idhammar Whitepaper Implementing OEE Systems

The University of Michigan Radiation Oncology Clinic Analysis of Waste in the Radiation Oncology Clinic Patient Flow Process.

ACE Programme: Proactive Approaches to People at High Risk of Lung Cancer

Bill Wilson. Categorizing Cognition: Toward Conceptual Coherence in the Foundations of Psychology

Anticoagulation therapy : improving processes. Hilary Merrett and Fiona Gale, CHKS 18 th October 2010

How Many Sections Is The Cpt Manual Divided Into

National Culture Dimensions and Consumer Digital Piracy: A European Perspective

CARISMA-LMS Workshop on Statistics for Risk Analysis

Managing Major Hazard Process Safety Using Key Performance Indicators (KPIs)

Managing Major Hazard Process Safety Using Key Performance Indicators (KPIs)

COMPARATIVE STUDY ON FEATURE EXTRACTION METHOD FOR BREAST CANCER CLASSIFICATION

Discovering Meaningful Cut-points to Predict High HbA1c Variation

Stooped. Squatting Postures. Workplace. and. in the. July 29 30, 2004 Oakland, California, USA CONFERENCE PROCEEDINGS

Five Mistakes and Omissions That Increase Your Risk of Workplace Violence

March Paul Misasi, MS, NREMT P Sedgwick County EMS. Sabina Braithwaite, MD, MPH, FACEP, NREMT P Wichita Sedgwick County EMS System

Case-based reasoning using electronic health records efficiently identifies eligible patients for clinical trials

IMPACT OF DEPENDENT FAILURE ON SIL ASSESSMENT

International Symposium on Children's Environmental Health

Ecological Interface to Enhance User Performance in Adjusting Computer-Controlled Multihead Weigher

Working Memory Load Affects Device-Specific but Not Task-Specific Error Rates

Laboratory Guideline for Pandemic Influenza Diagnosis at Medical Institutions

DRIVER S SITUATION AWARENESS DURING SUPERVISION OF AUTOMATED CONTROL Comparison between SART and SAGAT measurement techniques

July 3, Re: Proposed Rule: National Bioengineered Food Disclosure Standard. 83 FR (May 4, 2018). Docket No. AMS-TM

SEMINAR ON SERVICE MARKETING

Sex Differences in Depression in Patients with Multiple Sclerosis

The use of Topic Modeling to Analyze Open-Ended Survey Items

RECOMMENDATIONS OF FORENSIC SCIENCE COMMITTEE

ACO Congress Conference Pre Session Clinical Performance Measurement

Job Family: NDT Inspector Level 2 Job Title: NDT Inspector 2 nd Shift (02:30 pm 11:00 pm)

Commentary on: Piaget s stages: the unfinished symphony of cognitive development by D.H. Feldman

Rhythm Categorization in Context. Edward W. Large

Empirical Analysis of Object-Oriented Design Metrics for Predicting High and Low Severity Faults

QUARTERLY REPORT PATIENT SAFETY WORK PRODUCT Q J A N UA RY 1, 2016 MA R C H 31, 2016

MedDRA Coding Quality: How to Avoid Common Pitfalls

EPIDEMIOLOGICAL STUDY

Response to the ASA s statement on p-values: context, process, and purpose

VESTED ELEVENTH ANNUAL ENERGETIC WOMEN CONFERENCE

Key Features. Product Overview

MAHBOUBEH MADADI Industrial Engineering Louisiana Tech University, Ruston, LA Phone:

[EN-A-022] Analysis of Positive and Negative Effects of Salience on the ATC Task Performance

FMEA AND RPN NUMBERS. Failure Mode Severity Occurrence Detection RPN A B

COSTS AND EFFECTIVENESS OF IMMUNIZATION PROGRAMS

Transcription:

Common Procedural Execution Failure Modes during Abnormal Situations Peter T. Bullemer Human Centered Solutions, LLP Independence, MN Liana Kiff and Anand Tharanathan Honeywell Advanced Technology Minneapolis, MN The Abnormal Situation Management Consortium 1 funded a study to investigate procedural execution failures during abnormal situations. The study team analyzed 20 public and 12 private incident reports using the TapRoot methodology to identify root causes associated with procedural operations failures. The main finding from this investigation was the majority of the procedural operations failures (57%) across these 32 incident reports involved execution failures in abnormal situations. Specific recommendations include potential information to capture from plant incident to better understand the sources of procedural operations failures and improve procedural operations in abnormal situations. Introduction Process industry plants involve operations of complex human-machine systems. The processes are large, complex, distributed, and dynamic. The subsystems and equipment are often coupled, much is automated, data has varying levels of reliability, and a significant portion of the human-machine interaction is mediated by computers (Soken, Bullemer, Ramanthan, & Reinhart, 1995; Vicente, 1999). These systems are also social in that many plant operations function with a teamwork culture such that activities are managed by crews, shifts, and heterogeneous functional groups. Team members have to cope with multiple information sources, conflicting information, rapidly changing scenarios, performance pressure and high workload (Laberge & Goknur, 2005). Historically, the reporting of failures has tended to emphasize root causes associated with equipment reliability and less so on human reliability root causes (Bullemer, 2009). Consequently, there is limited information available on the frequency and nature of operations failures pertaining to human reliability. This tendency has limited the ability of process industry operations organizations to identify improvement opportunities associated with their management systems and operations practices. 1 This research study was sponsored by the Abnormal Situation Management (ASM ) Consortium. ASM and Abnormal Situation Management are registered trademarks of Honeywell International, Inc. Paper presented at the Mary Kay O Conner Process Safety Center International Symposium. College Station, TX. October 26-27, 2010.

Procedure Execution Failure Modes Page 2 of 9 In 2007-8, the ASM Consortium funded a study to analyze the impact human reliability in terms of operations practices failures with an ad hoc analysis of 32 public and private incident reports. The results of this research was presented at the 2009 Mary K O Conner Process Safety Symposium with an emphasis on the specific incident analysis methodology that revealed the opportunities for reducing abnormal situations associated with human reliability (Bullemer and Laberge, 2009). This paper presents the results of an analysis of the same 32 incidents from the perspective of procedural execution failures during abnormal situations. The research study objective was to develop an understanding of common failure modes in the application or execution of procedures, either manual or automated, during or in response to an abnormal situation. For purposes of understanding the focus of this research study, the reader should be aware of the ASM Consortium s definition of an abnormal situation. An abnormal situation is defined as a situation where an industrial process is disturbed and the control system cannot cope, requiring the operations team to intervene to supplement the control system. This definition may be narrower than other existing definitions of an abnormal situation. This definition is specifically used to distinguish between normal, abnormal and emergency situations. Typically, in a well rationalized alarm system, a process alarm is an indication of an abnormal situation in which the operations team needs to respond to return the plant to steady-state operations and avoid triggering emergency shutdown systems that are designed to prevent a release or catastrophic failure of process equipment. From a procedural perspective, a procedure developed for an emergency response was outside of the scope of this study. In addition, a procedural execution failure that leads to an abnormal situation or emergency situation was outside the scope of this study. However, a procedure developed for a temporary plant configuration or temporary operation (i.e., avoidance of an emergency response situation) was considered relevant to the scope of this study. Root Cause Analysis The study team examined root causes related to procedural operation failures across a data set from 32 incident reports (Bullemer and Laberge, 2009). Per the prior study definition, a root cause is the most basic cause (or causes) that can reasonably be identified that management has control to fix and, when fixed, will prevent (or significantly reduce the likelihood of the failure s (or factor s) recurrence (Paradies & Unger, 2000, p. 52). A root cause describes Why a failure occurred. In the prior research project, the team used the root cause tree available in the TapRoot methodology. The initial analysis examined all procedure related root causes from the prior analysis to identify those specific failures that were associated with execution during an abnormal situation. That is, for each identified procedure-related root cause, the study team assessed whether the procedural failure occurred prior to or during an abnormal situation. If it was determined that it was prior to an abnormal situation, the failure was classified as not relevant to procedure execution under an abnormal situation. The results of this initial analysis are shown below Table 1.

Procedure Execution Failure Modes Page 3 of 9 Table 1 Root cause analysis results summary of procedure related root causes relevant to execution in response to abnormal situations (Note: NI is abbreviation for Needs Improvement). Procedure Root Cause Summary Table Root Cause Category Subcategory Count ASM Relevance Procedure Followed Format confusing 1 1 of 1; 100% Incorrectly > 1 action / step 0 No Excess references 0 No Mult unit references 0 No Limits NI 0 Yes Details NI 2 0 of 2; 0% Data/computations wrong or 0 incomplete No Graphics NI 0 No No Checkoff 0 No Checkoff Misused 3 1 of 3; 33% Misused second check 0 No Ambiguous instructions 0 Maybe; No examples found Equipment identification NI 0 No Category Subtotal 9% 6 Somewhat: 2 of 6; 33% Procedure Wrong Typo 0 No Sequence wrong 0 No Facts wrong 3 2 of 3; 66% Situation not covered 25 20 of 25; 79% Wrong revision used 0 No Second checker needed 0 No Category Subtotal 40% 28 Mostly: 22 of 28 (79%) Procedure Not Procedure not used 19 9 of 19; 47% Used/Not Followed No procedure 15 7 of 15; 47% Procedure not available or 1 inconvenient for use 0 of 1; 0% Procedure difficult to use 1 0 of 1; 0% Category Subtotal 51% 36 Moderate: 16 of 36 (44%) All Procedures 100% 70 Majority: 40 of 70 (57%) The first two columns of Table 1 show the root cause categories and subcategories used in the analysis based on the TapRoot methodology (Paradies and Unger, 2000). The second column shows the total number of all root causes identified related to procedure failures. The last column shows how many of the identified procedure-related root causes were relevant to the execution during abnormal situations, i.e., relevant to the scope of this study s objectives. For example in the first row, there was one observation of a root cause of Procedure Format Confusing. A closer examination of this failure revealed that the failure to execute due to this root cause occurred during an abnormal situation. Hence, the last

Procedure Execution Failure Modes Page 4 of 9 column, ASM Relevance, shows 1 of 1 or 100% of the occurrences of this root cause subcategory was related to failure to execute during an abnormal situation. The bottom right corner of the table shows the result of classifying each root cause as ASM Relevant or not. The summary of the analysis shows that of the 70 identified root causes 40 are related to procedure execution failures in abnormal situations, i.e., 57%. Hence the surprising finding of this investigation was that the majority of the procedural operations failures across these 32 incident reports involved execution failures in abnormal situations. Common Manifestations To better understand the context of these execution failures, the study team examined the specific occurrence of the ASM relevant root causes in terms of their manifestation. Bullemer and Laberge (2009) argue that the manifestations of the root causes provide better insight into how to make improvements in operations practices than the more generic root cause classifications. A root cause manifestation is the specific expression or indication of a root cause in an incident. The root cause manifestations describe How operational failure modes are expressed in real operations settings. The root cause manifestation characterizes the specific weakness of an operations practice failure mode. Supervisor not accessible to control room personnel to discuss problems is an example manifestation for the No Supervision common root cause. Each root cause manifestation was classified into a set of common manifestations. A common manifestation is an abstraction of the individual root cause manifestations to characterize common element expressed across the incidents in a sample. The following common manifestations represent the common elements across the root cause manifestations identified in the 32 incidents: Fail to detect abnormal condition Fail to detect an abnormal situation Unaware of process or equipment hazard Lack understanding of impact of actions Execute inappropriate action These five common manifestation types were based on the ASM Intervention Model of the operator supervisory control activities shown in Error! Reference source not found. for preventing and responding to abnormal situations (Bullemer, Hajdukiewicz and Burns, 2010).

Procedure Execution Failure Modes Page 5 of 9 Figure 1 ASM Operations Intervention Model of human operator supervisory control activities for preventing and responding to abnormal situations. The ASM Intervention Model has four stages of activities: Orienting, Evaluating, Acting, and Assessing. In the first step of the analysis all root cause manifestations were categorized as a failure of one of the stages of the ASM Intervention Model. In the second step, the root cause manifestations were clustered in terms of specific ways in which the intervention phase was ineffective. The two detection failures, Fail to Detect Abnormal Condition and Fail to Detect Abnormal Situation are both manifestations of breakdowns in Orienting. Evaluation breakdowns are represented by Lack of Understanding Impact and Unaware of Hazards. Finally, the last failure type was Inappropriate Action which represents a breakdown in the Acting stage. The analysis did not identify any failures associated with the Assessing stage. Table 2 Frequency of common manifestation types associated with the root causes related to procedure execution in abnormal situations shown in rank order from most frequent to least frequent. Common Manifestations Inappropriate action Fail to detect abnormal condition Lack understanding of impact Fail to detect abnormal situation Unaware of hazard Frequency Definition 15 Failure to know what the appropriate response should be to the occurrence of an abnormal situation in the execution of the procedure 12 Failure to detect whether equipment or process is abnormal mode; or whether there are any latent abnormal conditions 8 Failure to understand the correct impact or effect of a procedural action or failure to know the impact of not following procedural instruction 4 Failure to know when normal operating range is exceeded; or know the indications of the occurrence of an abnormal situation 1 Failure to know about the existence of a hazard or the potential of a hazardous situation if a step or steps are not followed as specified

Procedure Execution Failure Modes Page 6 of 9 Total 40 Table 2 presents the results of the examination of the root cause manifestations in terms of the five common manifestations with their relative frequency across the 40 identified root causes. The summary of findings show that the most common manifestation was associated with lack of knowledge about appropriate responses to the occurrence of an abnormal situation while executing a procedure. The second most common manifestation was the failure to detect the presence of an abnormal equipment or process mode while executing a procedure. And the third most common manifestation was the lack of understanding the impact or effect of a procedural action or failure to execute a procedural action. In total, these top three common manifestations account for 35 of the 40 (87.5%) procedural execution failures under abnormal situations. These five common manifestations of execution failures are shown in Table 3 associated with the root cause subcategories. Together the information aids in understanding the potential mitigations for these types of procedural execution failures. Table 3 Rank order of root causes for common manifestations of procedure execution failures in abnormal situations. Root Case Subcategory Frequency Common Manifestations (Count) Situation Not Covered 20 of 25 Inappropriate Action (7) Fail to Detect Abnormal Conditions (7) Lack Understanding of Impact (5) Fail to Detect Abnormal Situation (1) Procedure Not Used 9 of 19 Inappropriate Action (4) Fail to Detect Abnormal Conditions (3) Lack Understanding of Impact (2) No Procedure 7 of 15 Inappropriate Action (4) Fail to Detect Abnormal Situation (2) Fail to Detect Abnormal Conditions (1) Facts Wrong 2 of 3 Fail to Detect Abnormal Situation (1) Lack Understanding of Impact (1) Check-off Misused 1 of 3 Fail to Detect Abnormal Conditions (1) Format Confusing 1 of 1 Unaware of Hazard (1) In addition, the team examined what types of procedural operations were most prone to these types of procedure execution failures. Further examination of this data as shown in Table 4 shows three procedure types are most likely associated with procedural execution failures in abnormal situations.

Procedure Execution Failure Modes Page 7 of 9 Table 4 Percent of procedure execution failures as function of procedure type shown in rank order of most to least. % of ASM Procedure Type Failures Startup 19% Operating: Batch 18% Operating: Continuous 12% Emergency Response 6% Transfer 1% LOTO/PTW/Maintenance 1% Shutdown 1% The data in Table 4 suggest that the procedure types most relevant to mitigating procedure execution failures associated with abnormal situations are Startup, Batch Operations and Continuous Operations. Corrective Actions Finally, the study team looked at the corrective action recommendations in the incident reports to understand what types of mitigations were most often recommended. Table 5 shows the frequency of corrective action recommendations in rank order from the most frequent to the least frequent. Table 5 Frequency of corrective action recommendations from incident reports shown in rank order of most frequent to least frequent. Incident Report Corrective Action Recommendations Frequency Improve procedure content by addressing abnormal situations 19 No procedure mitigation strategy recommended in report 13 Improve procedure coverage 10 Improve procedure content 4 Improve policy enforcement 3 Improve procedural training 3 Improve development method 2 Improve procedure format 1 Improve review work process 1 Improve status documentation 1 Of the incident reports with procedure related root causes, 13 did not contain any recommendations for corrective actions on the procedure management system. A total of 19 incident reports had the recommendation to improve procedure content to address

Procedure Execution Failure Modes Page 8 of 9 abnormal situations. In addition, 10 recommendations were made to improve procedure coverage. A take-away message from this finding is that in 59% of these incident reports (19 of 32), the incident investigation team observed a strong need to address abnormal situations in the development of procedures. With the additional analysis of common manifestations, the specific requirements for improving procedure content become more evident. Conclusion The post hoc incident analysis of 32 public and private incident reports indicates a process industry-wide need to improve human reliability in execution of procedures during abnormal situations. The incident investigation found that the majority of the procedural operations failures (57%) across these 32 incident reports involved execution failures in abnormal situations. Moreover, 19 of the 32 incident reports (59%) included a corrective action recommendation to improve procedural content for abnormal situations. As recommended in an earlier study, an examination of root cause manifestations provides additional information to the root cause information to better understand potential corrective actions (Bullemer and Laberge, 2009). In this study, the analysis of root cause manifestations suggests that improvements to procedural content in the following areas should improve operator reliability: Know what the appropriate response should be to the occurrence of an abnormal situation in the execution of the procedure Detect whether equipment or process is abnormal mode; or whether there are any latent abnormal conditions Detect when normal operating range is exceeded; or know the indications of the occurrence of an abnormal situation Understand the correct impact or effect of a procedural action or know the impact of not following the procedural instruction Ultimately, organizations need to improve the incident reporting system to include metrics that enable a better understanding of opportunities to improve their procedure management system. To the end, the study team is now developing a conceptual framework to enable the comprehensive representation of failure modes and root causes associated with breakdowns in procedure execution, in both normal and abnormal situations. References Bullemer, P. (2009). Better metrics for improving human reliability in process safety. Paper presented in the 11 th Process Safety Symposium at the 5 th Global Congress on Process Safety, Tampa, FL, USA. Bullemer, P., Hajdukiewicz, J. and Burns, C. (2010). Effective Procedural Practices. ASM Consortium Guidelines Book. Minneapolis, MN: ASM Consortium.

Procedure Execution Failure Modes Page 9 of 9 Bullemer, P. and Laberge, J. C. (2009). Common operations failure modes in the process industries. Paper presented at the Mary Kay O Conner Process Safety Center International Symposium. College Station, TX. Laberge, J.C., and Goknur, S.C. (2006). Communication and coordination problems in the hydrocarbon process industries. Proceedings of the 50th Annual Meeting of the Human Factors and Ergonomics Society, San Francisco, CA, USA. Paradies, M. and Unger, L. (2000). TapRoot. The system for root cause analysis, problem investigation, and proactive improvement. Knoxville, TN: System Improvement, Inc. Soken, N., Bullemer, P.T., Ramanathan, P., and Reinhart, B. (1995). Human-Computer Interaction Requirements for Managing Abnormal Situations in Chemical Process Industries. Proceedings of the ASME Symposium on Computers in Engineering, Houston, TX. Vicente, K. (1999). Cognitive work analysis. Mahwah, NJ: Lawrence Erlbaum Associates.