In my travels over the past 35+ years talking to RCA analysts around the world, as well as those outsiders who look into our ‘RCA’ bubble, I find many misconceptions about RCA. This happens in every space; just think about RCM, RBM, APM, CBM and the like. Everyone experiences how other people view their craft.
One of the more popular myths about RCA is that it is obsolete because it promotes linear thinking. I’ll admit this paradigm is more pervasive in the Safety community than in the Reliability community, but it exists nonetheless. I’ve also seen it from leadership in cases where their RCA initiatives were not producing the results they expected. What they often don’t consider is the many reasons why such efforts fall short of expectations, including lack of leadership support and/or lack of clear expectations.
Is this a Fact or a Myth? I will leave the conclusions up to those that are actual RCA veterans who will answer based on their experience, rather than from hearsay.
I believe that much of this linearity belief comes from those who consider all RCA to be the equivalent of the traditional 5-Whys (not modified versions like 5×5).
If all RCA had the same structure as this 5-Whys approach, then our naysayers would be right…but that is not the case. In a traditional 5-Whys approach, one would simply ask WHY five levels deep and expect to arrive at THE root cause. Let’s look at this from a technical standpoint and not an ‘us’ versus ‘them’ perspective.
Asking only ‘Why’ is a very narrow line of questioning that promotes linearity, because it connotes that we want a singular answer (linear) and that we want someone’s opinion. The fact is that undesirable events in complex organizations don’t happen linearly. Cause-and-effect relationships happen in parallel most of the time. Things happen in various combinations, on any given day, and come together to form a unique sequence of factors that results in a bad outcome. So using a traditional 5-Whys to analyze such occurrences will not yield a comprehensive understanding of what actually went wrong.
CONSIDER FIRST ASKING ‘HOW CAN?’
This seems like semantics but really is an epiphany! Most undesirable outcomes are observable (they are not decision reasoning stored in someone’s head that we can’t see). Therefore, there were physics at play that led up to that bad outcome, and that we could see.
To make this point, consider the difference between asking ‘How did a crime occur?’ versus ‘Why did a crime occur?’. Are your answers the same?
This change in initial questioning is the difference between linearity and non-linearity. Consider the use of Boolean Logic gates. I’m not going to make this over-complicated and get into the Boolean Algebra because it’s not necessary. I’m going to use the basic Boolean Logic gates we use for our RCAs (using our PROACT RCA methodology).
Let’s try some practical examples to make our points. If I were in the midst of an RCA (based on how I personally would approach an RCA), I might come across a process that failed due to a fatigued bearing. My next natural question would be ‘How could a bearing fatigue?’ From a logic standpoint, because I won’t know the correct answer until I have adequate evidence, I would use a Boolean AND/OR gate to explain my logic (Figure 2). We use this symbol when exploring what may have happened.
My possible answers (hypotheses) to that question may be Resonance, Misalignment and/or Imbalance. Whichever hypotheses the evidence proves to be true, we continue to drill down on. Oftentimes two paths are true and we follow them both. The evidence leads the analysis, not the analyst.
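For readers who think in code, here is a minimal sketch of that AND/OR step. The Hypothesis class, the function names and the evidence results are all hypothetical, made up for illustration; this is not PROACT software, just one way the logic might be expressed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    """One possible answer to 'How could a bearing fatigue?'."""
    name: str
    # None = no evidence gathered yet; True/False = what the evidence showed.
    verified: Optional[bool] = None


def paths_to_pursue(hypotheses):
    """AND/OR gate: keep every hypothesis the evidence verified as true."""
    return [h.name for h in hypotheses if h.verified is True]


def still_open(hypotheses):
    """Hypotheses that cannot be ruled in or out until evidence is collected."""
    return [h.name for h in hypotheses if h.verified is None]


# Illustrative evidence results only (assumed, not from a real analysis).
bearing_fatigue = [
    Hypothesis("Resonance", verified=False),
    Hypothesis("Misalignment", verified=True),
    Hypothesis("Imbalance", verified=True),
]

print(paths_to_pursue(bearing_fatigue))  # ['Misalignment', 'Imbalance']
print(still_open(bearing_fatigue))       # []
```

Note that nothing in the sketch forces a single answer: any combination of hypotheses can survive, and each survivor becomes its own branch to drill down on.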
Let’s try another logic gate, this time using an AND gate. I may be investigating an Event where a fire was involved. My question remains the same, ‘How could a fire have occurred?’ My logic tree branch may look like the following in Figure 3.
We are all familiar with the fire triangle, where in order to have a fire we need adequate oxygen, fuel AND an ignition source. So this is how our logic may be expressed.
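The AND gate can be sketched the same way. The function name and the enumeration below are assumptions chosen for illustration; the only thing taken from the fire triangle is that all three conditions must be present together.

```python
from itertools import product


def fire_can_occur(oxygen: bool, fuel: bool, ignition: bool) -> bool:
    """AND gate: every input must be true for the outcome to be possible."""
    return oxygen and fuel and ignition


# Enumerate all 8 combinations of (oxygen, fuel, ignition).
combinations = list(product([True, False], repeat=3))
fire_cases = [c for c in combinations if fire_can_occur(*c)]

print(len(combinations))  # 8
print(fire_cases)         # [(True, True, True)]
```

Only one of the eight combinations produces a fire, which is why each of the three branches has to be verified with evidence rather than assumed.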
To make my final point, let’s use another basic Boolean Logic gate, the symbol for OR. When using an OR gate, we are making binary statements. Let’s assume we are investigating an unexpected process shutdown involving a critical valve. Our question may be ‘How could the valve have failed?’ (Figure 4).
At a higher level, until we get into the deeper physics, the valve could have failed either open or closed. We would have to let our evidence tell us which one was verified to be true.
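A minimal sketch of that binary OR step follows. The enum, the function name and the evidence values are hypothetical, and the sketch assumes that at this level exactly one of the two failure modes can be true.

```python
from enum import Enum


class ValveFailure(Enum):
    FAILED_OPEN = "failed open"
    FAILED_CLOSED = "failed closed"


def verified_mode(evidence):
    """OR gate with binary branches: return the one mode the evidence confirms.

    `evidence` maps each ValveFailure to True/False (assumed input shape).
    Raises ValueError if the evidence confirms neither mode or both.
    """
    confirmed = [mode for mode, is_true in evidence.items() if is_true]
    if len(confirmed) != 1:
        raise ValueError("evidence must confirm exactly one failure mode")
    return confirmed[0]


# Illustrative evidence only (assumed, not from a real investigation).
evidence = {
    ValveFailure.FAILED_OPEN: False,
    ValveFailure.FAILED_CLOSED: True,
}

print(verified_mode(evidence).value)  # failed closed
```

The error case is deliberate: if the evidence supports neither branch, or both, the analysis is not ready to move deeper and more evidence is needed.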
In summary, I just wanted to express the differences between logic expressed linearly versus the reality of complex environments where logic happens in parallel and in combinations.
Think about the examples I provided above using the logic gates. If we applied the traditional 5-Whys approach to those cases, would there be a difference in our results? Would we have only 1 conclusion (root cause)? Would there likely have been more root causes had we expanded our questioning and used evidence instead of hearsay to back up our hypotheses?
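One way to picture the difference is to count where each structure ends up. In the sketch below, both structures are plain nested dictionaries and the verified branches are assumed for illustration; the names come from the bearing example, but the outcome is hypothetical.

```python
def leaf_causes(tree):
    """Collect the deepest causes (leaves) of a nested cause-and-effect dict."""
    leaves = []
    for cause, children in tree.items():
        if children:
            leaves.extend(leaf_causes(children))
        else:
            leaves.append(cause)
    return leaves


# A traditional 5-Whys chain: one answer per level, so one path.
five_whys_chain = {
    "Process failed": {"Bearing fatigued": {"Misalignment": {}}}
}

# A logic tree: every hypothesis verified by evidence keeps its own branch.
logic_tree = {
    "Process failed": {
        "Bearing fatigued": {"Misalignment": {}, "Imbalance": {}}
    }
}

print(leaf_causes(five_whys_chain))  # ['Misalignment']
print(leaf_causes(logic_tree))       # ['Misalignment', 'Imbalance']
```

A chain with one answer per level can only ever end in one leaf, however deep it goes; the tree ends in as many leaves as the evidence supports.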
In this article, we did not explore how this type of exploration eventually covers the human and systems contributions to bad outcomes, but there is more reading on that correlation available.
These may seem like semantics to outsiders but they are valid and critical differences between an effective Root Cause Analysis and a Shallow Cause Analysis.
If you’re interested in learning more about our PROACT RCA Approach, we’d love to hear from you…let’s make it happen!