During a recent webinar I presented with SMRP, entitled “Root Cause Analysis: It’s a Money Maker, Not a Money Taker”, I was fortunate to have many deep thinkers in my virtual audience.
They floated some questions about conducting RCAs on non-mechanical situations. To my fellow veterans in this RCA space, this is a relatively easy question to address properly, but it did get me thinking. I think those of us who do this for a living take for granted that a properly facilitated RCA can be applied to any undesirable outcome, no matter the industry or the type of bad outcome.
However, to those outside of our circles and looking in, ‘RCA’ is often viewed as something only applied to mechanical failures. I would venture to say that is the prevailing paradigm about what ‘RCA’ is.
I had a client ask me once who our biggest competitor was, and I replied, “Our client’s definition of RCA is our biggest competitor, along with Post-it notes!” While they got a laugh out of it, I really wasn’t kidding. So, what did I mean by that?
To be effective investigators, we must have a grasp of the bigger picture when it comes to understanding how and why failures occur. To that end, we will briefly discuss:
I. Every Process is Part of a Bigger System
II. All Outcomes Are a Result of Cause-and-Effect Relationships
III. We Can Observe Bad Outcomes, But We Can’t Observe Poor Human Reasoning
I. Every Process is Part of a Bigger System
Successful RCA analysts are ‘Systems Thinkers’! In a paper entitled ‘Get to the Root of Accidents’, the noted authors contend that systems thinking is not currently used in RCA.
I never like to make such definitive statements, but in general I would have to agree: in most of what I see around the world that people call ‘RCA’, there is little appreciation for systems thinking.
The paper defines Systems Thinking as:
“Systems thinking is an approach to problem solving that suggests the behavior of a system’s components only can be understood by examining the context in which that behavior occurs. Viewing operator behavior in isolation from the surrounding system prevents full understanding of why an accident occurred — and thus the opportunity to learn from it.”
It’s hard for me to see how an RCA can be effective without such systems thinking. If an RCA analyst is not exploring the environment surrounding a decision-maker to put their decision-making into proper context, I would not consider that an RCA. I would consider it a ‘Shallow Cause Analysis’, because the analyst stopped short of the true systemic root causes that influenced the decision-maker.
For simplicity’s sake, and without getting into a ton of theory from research and academia, just visualize any process as a system. A ‘system’ consists of Inputs of some kind, which are Transformed by the process in some way to produce an Output (Figure 1). Think about wherever you work and hold that image in your mind.
Here are some very simplistic examples of both mechanical and non-mechanical systems:
1. Paper Mill System (Figure 2)
2. Hospital/General System (Figure 3)
3. Purchasing System (Figure 4)
4. Blood Drawing Process in Emergency Room (Figure 5)
The point of this mental exercise is that wherever a failure occurs, it occurs somewhere within a bigger system. As RCA analysts, we need to understand that bigger system to put the failure into proper context.
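For readers who think in code, the idea of a process sitting inside a bigger system can be sketched roughly as follows. This is only an illustration: the `System` class, the `context_chain` helper, and every input and output listed are hypothetical and are not taken from the figures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class System:
    """A process viewed as Inputs -> Transformation -> Outputs."""
    name: str
    inputs: List[str]
    outputs: List[str]
    parent: Optional["System"] = None  # the bigger system this one sits inside


def context_chain(system: Optional[System]) -> List[str]:
    """Walk outward from a process to every bigger system that contains it."""
    chain = []
    while system is not None:
        chain.append(system.name)
        system = system.parent
    return chain


# Hypothetical inputs and outputs, for illustration only.
hospital = System("Hospital", ["patients", "staff", "supplies"], ["treated patients"])
emergency_room = System("Emergency Room", ["walk-in patients"], ["admitted or discharged patients"], parent=hospital)
blood_draw = System("Blood drawing process", ["patient", "lab order"], ["labeled sample"], parent=emergency_room)

print(context_chain(blood_draw))
# ['Blood drawing process', 'Emergency Room', 'Hospital']
```

A failure in the blood draw is then examined in the context of the Emergency Room and the Hospital, not in isolation.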
II. All Outcomes Are a Result of Cause-and-Effect Relationships
“There are few things in life that you can be absolutely sure of, but one of them is that nothing ever just happens. There is always a cause and effect relationship.” – R. Keith Mobley
Undesirable outcomes don’t just happen! This statement is gospel to a veteran RCA analyst, who is relentless in the pursuit of the pattern or sequence of cause-and-effect relationships that lined up on any given day to produce an adverse outcome.
Here are just some cross-sections of logic to demonstrate such cause-and-effect relationships. Level-to-level, we just ask ourselves, ‘HOW COULD the previous block have occurred?’ Our answers are simply hypotheses (possibilities) that must be validated as true or false using sound evidence (not just hearsay).
1. Shaft Failure (Figure 6)
2. Fire (Figure 7)
3. Medication Error in a Hospital (Figure 8)
4. Purchasing Delays (Figure 9)
5. Blood Redraws in an Emergency Room (Figure 10)
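The level-to-level questioning described above can be pictured as a small tree of hypotheses, each marked true, false, or not yet validated. The sketch below is one possible way to represent that, not a prescribed RCA tool. The `Hypothesis` class, both helper functions, and the example blocks and evidence notes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Hypothesis:
    statement: str
    verified: Optional[bool] = None  # True / False once evidence is in, None if still open
    evidence: str = ""
    children: List["Hypothesis"] = field(default_factory=list)  # answers to 'HOW COULD?'


def confirmed_paths(node: Hypothesis, path: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return each cause-and-effect chain in which every block was verified true."""
    if node.verified is not True:
        return []
    path = path + (node.statement,)
    deeper = [p for child in node.children for p in confirmed_paths(child, path)]
    return deeper or [path]


def open_hypotheses(node: Hypothesis) -> List[str]:
    """List every block that still needs evidence."""
    found = [node.statement] if node.verified is None else []
    for child in node.children:
        found += open_hypotheses(child)
    return found


# Hypothetical tree and evidence notes, for illustration only.
event = Hypothesis("Shaft failure", True, "failed part recovered", [
    Hypothesis("Fatigue", True, "lab report", [
        Hypothesis("Excessive vibration", True, "vibration records"),
        Hypothesis("Thermal cycling", False, "temperature logs"),
    ]),
    Hypothesis("Overload", False, "load records"),
    Hypothesis("Corrosion"),  # not yet validated
])

print(confirmed_paths(event))  # [('Shaft failure', 'Fatigue', 'Excessive vibration')]
print(open_hypotheses(event))  # ['Corrosion']
```

Hypotheses proven false stay in the tree as a record of what was ruled out, while open ones show where evidence is still missing.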
Notice how the cause-and-effect levels describe systems that in some fashion contributed to bad outcomes. Now let’s move on to learn how to understand what we cannot see!
III. We Can Observe Bad Outcomes, But We Can’t Observe Poor Human Reasoning
In our previous examples, we have been trying to demonstrate that no matter the bad outcome, the structure of how to analyze it is the same. When we look at the physics of failure (what we can see), HOW CAN questioning will reveal many possibilities (hypotheses) that could have contributed to the last effect in the cause-and-effect sequence. ‘HOW CAN’ questioning is very broad compared to ‘WHY’ questioning. So, what is the difference between HOW CAN and WHY questioning?
It may seem like a frivolous and insignificant point, but it’s really quite an epiphany. Let me try to bring this home with an example.
Is there a difference between asking “How could a crime have occurred?” and asking “Why did a crime occur?” The first invites every possible scenario; the second seeks the reasoning behind one specific act.
At some point during our continual drill-down through the physical aspects of a bad outcome, we will inevitably come to a decision error. This will be either an error of commission (I did something I shouldn’t have) or an error of omission (I should’ve done something I didn’t). From an independent RCA analyst’s standpoint, who made the decision is irrelevant (unless it was found to be sabotage, which is extremely rare). What we are more interested in is WHY they felt the decision they made was the right decision at the time!
At the point in our logic tree reconstruction where we come across a decision-maker, our questioning switches from HOW CAN to WHY. We are not interested in HOW COULD someone have made the decision, because there would be an infinite number of possibilities. By asking WHY, we are seeking to learn what their reasoning was at the time, which made it seem to be the right decision. Almost always, we will learn that their reasoning made perfect sense once the context of their environment is taken into account. That is what understanding the big picture means.
Let’s explore a few examples of when we switch to the WHY questioning:
1. Unexpected Process Shutdown. This is a case where excessive vibration eventually caused a fatigue failure of a critical pump bearing. The excessive vibration occurred because the pump was misaligned during initial installation. Why would someone not align the pump properly? (See Figure 11)
2. Fire in a Patient’s Lung During Surgery to Remove a Tumor. During an endoscopy, a fire started inside the patient’s right bronchus. It was obvious that there was oxygen in the room and a laser (the ignition source), but where did the fuel come from? The RCA determined that purchasing had switched vendors for the antiseptic used to sterilize the instruments, moving to a product that contained alcohol (the fuel). Why would a buyer do that? (See Figure 12)
3. Medication Error. A patient had an adverse drug reaction after receiving the wrong medication. The RCA found that the pharmacist had intentionally reduced the number of meds in the formulary. When the nurse went into the dispensing system and requested the med she wanted, it was not available, and the system defaulted to the least expensive member of the same class. Why would the pharmacist reduce the meds in the formulary? (See Figure 13)
4. Process Upset Due to V-Belt Failure. The RCA found that the failed V-Belts did not meet spec. It was determined that purchasing had switched vendors and the new V-Belts were not like-for-like replacements. Why would a buyer do that? (See Figure 14)
When we dissect each of these examples in hindsight, the pattern is not hard to follow. Flawed organizational systems (latent root causes) contribute to inappropriate decisions. Those decisions (human root causes) tend to trigger a cause-and-effect chain of events (physical root causes). When the chain is unbroken, at some point an undesirable outcome occurs that hits an RCA trigger point in the organization, and an analysis follows (See Figure 15).
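To make the three levels concrete, here is a minimal sketch that tags each block in a chain as physical, human, or latent, and flags an analysis that stops before reaching a latent root cause. The names `RootType`, `Cause`, and `is_shallow` are hypothetical, and the latent cause shown for the V-Belt case is an assumption made for illustration only; it is not a finding from the actual analysis.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class RootType(Enum):
    PHYSICAL = "physical"  # the tangible chain of events
    HUMAN = "human"        # the decision error
    LATENT = "latent"      # the flawed organizational system behind the decision


@dataclass(frozen=True)
class Cause:
    description: str
    root_type: RootType


def is_shallow(chain: List[Cause]) -> bool:
    """An analysis is shallow if it never reaches a latent root cause."""
    return not any(cause.root_type is RootType.LATENT for cause in chain)


# Ordered from the event back toward the roots (V-Belt example).
v_belt_chain = [
    Cause("V-Belt failed in service", RootType.PHYSICAL),
    Cause("Installed V-Belts did not meet spec", RootType.PHYSICAL),
    Cause("Buyer switched to a vendor whose belts were not like-for-like", RootType.HUMAN),
    # Assumed for illustration; the real latent cause would come from asking WHY.
    Cause("Purchasing is measured on price alone (hypothetical)", RootType.LATENT),
]

print(is_shallow(v_belt_chain))      # False
print(is_shallow(v_belt_chain[:3]))  # True: stops at the buyer's decision
```

Stopping at the buyer’s decision is the kind of stopping point that earns the label ‘Shallow Cause Analysis’.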
In Conclusion: RCA has little to do with the industry in which it is applied! The common denominator in the effective application of RCA is the HUMAN BEING. It is our decision-making that triggers consequences. However, decision-makers are often the victims of flawed organizational systems that influence their decisions.
Successful RCA is about understanding human reasoning: why good people thought the decision they made at the time was the right one. The only way to figure this out is to understand the context in which the decision was made, and not to make assumptions about what ‘we would have done’. Hindsight is always 20/20, and hindsight bias needs to be kept in check. This is the juncture where the social sciences meet the physical sciences.
Whether we are analyzing unexpected failures in a manufacturing process, customer complaints, why patients get the wrong medication, or why kids bully each other, the disciplined RCA thought process is basically the same!
If you’re interested in seeing actual examples of this claim, here are some video case studies:
1. Safety Case – Hand Injury
2. Elementary School – Bullying Case
3. Retail Pharmacy – Wrong Med Dispensed
4. Admin Case – Why RCA Effort Not Working
5. Customer Complaints – Black Specks Found in Product