If you had to give a grade to your current Root Cause Analysis (RCA) initiative, what would it be? And how would you arrive at that grade? The paradox many face with such initiatives is drawing the distinction between compliance and actual effectiveness. What would our RCA grade be based on? In this article we will focus on the key elements needed to quantitatively measure your RCA initiative, so the organization can focus on the elements that are lacking.
For those of us on the front lines who do the actual RCA work, we know it can be a thankless job. We understand the grave consequences of the events we are investigating and desperately want to prevent recurrence to ensure the safety of our employees as well as ourselves. However, we are realists, and things don’t always go as planned. We start with the greatest of intentions in our investigative efforts, but there are serious potholes in our road to success.
In this article I want to craft an assessment tool for you to evaluate your current RCA system, based on the individual perceptions of those doing the work (the front line RCA analysts). As we begin to craft this unbiased assessment of reality, keep in mind we want our analyst inputs to be very candid, based on their personal experience. We will ask them to apply a Likert rating scale of one (1) to five (5), with ‘1’ indicating we strongly disagree and ‘5’ indicating we strongly agree. When applied, there are no right or wrong answers, just what people perceive in the real world. We also want these assessments to be anonymous, so we can get the candor we seek. It is not important who filled out the assessment; it is their input we want to focus on.
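To make the scoring concrete, here is a minimal sketch of how such anonymous Likert responses could be rolled up into per-section averages. The section names, statement counts and ratings below are invented for illustration; they are not the actual assessment items.

```python
# Hypothetical sketch: rolling up an anonymous RCA assessment.
# Section names and ratings are illustrative assumptions only.
from statistics import mean

# Each anonymous respondent rates each statement 1 (strongly disagree)
# to 5 (strongly agree); responses are grouped by assessment section.
responses = {
    "Management Support Systems": [[4, 2, 3], [5, 3, 3], [2, 2, 4]],
    "Evidence Collection":        [[1, 2, 2], [2, 1, 3], [1, 2, 2]],
}

def section_scores(responses):
    """Average each section's ratings across all respondents and statements."""
    return {
        section: round(mean(r for respondent in ratings for r in respondent), 2)
        for section, ratings in responses.items()
    }

scores = section_scores(responses)
# The lowest-scoring section flags where the RCA system needs attention.
weakest = min(scores, key=scores.get)
```

Because the inputs are anonymous, only the aggregate per-section score is retained; no individual’s ratings are ever singled out.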
Since true RCA is a system, and not merely a task, we will break down the RCA system into its sequential components (subsystems). It will be more digestible this way. Let’s get started!
1. Management Support Systems – Setup
In this section we will explore the management support systems in place, to aid our analysts in the field. We will make statements relative to these systems and seek input from those who are affected by them.
These should be self-explanatory. They attempt to establish the existence of a solid support foundation for the RCA effort. If the analysts give these elements poor ratings, this indicates the support systems in place are weak for various reasons. They could be weak because they don’t exist, they exist and are inadequate, they are adequate and not followed, they are adequate and not enforced, etc. From an initial assessment standpoint, if this section overall is rated poorly, then we know we must give attention to these support systems and determine why the rank and file feel they are not effective. If the people doing the work feel they have little to no support, they will do the minimal amount of work to get by. They will take on the paradigm, ‘If leadership doesn’t care, why should we?’
2. Proactive RCA – Use of FMEA & Opportunity Analysis
This section focuses on how and where we apply our RCA approaches. Traditionally, we only apply formal RCA per some type of reactive triggers (i.e. – costs, injury, death, regulatory violation, fines, etc.). These are usually set by various regulatory agencies that require the use of RCA to investigate the more severe undesirable outcomes. However, more progressive organizations, especially those on their journeys to being High Reliability Organizations (HRO), will seek to apply RCA proactively. So what does that mean?
FMEA. Most are familiar with the risk assessment tool referred to as Failure Modes and Effects Analysis or FMEA. This tool is designed to quantifiably measure risk using the following universal calculation:
Probability x Severity = Criticality (or Risk Prioritization Number – RPN)
The end result of this type of analysis is a prioritized listing of the highest risks in a given process.
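As a quick illustration of the calculation above, the sketch below scores a few hypothetical failure modes and sorts them by criticality. The failure modes and their 1–10 ratings are assumptions made up for the example.

```python
# Illustrative sketch of the FMEA calculation above; the failure modes
# and their 1-10 probability/severity ratings are invented for the example.
failure_modes = [
    # (failure mode, probability rating, severity rating)
    ("Pump seal leak",        8, 4),
    ("Motor bearing seizure", 3, 9),
    ("Control valve drift",   6, 3),
]

# Criticality (RPN) = Probability x Severity, per the formula above.
ranked = sorted(
    ((name, prob * sev) for name, prob, sev in failure_modes),
    key=lambda item: item[1],
    reverse=True,
)
# 'ranked' is the prioritized listing of highest risks in this process.
```

The top of the `ranked` list is where a proactive RCA would be commissioned, before the risk ever materializes.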
There is absolutely no reason an RCA cannot be applied to an unacceptable, high risk event (or a near miss which can also be a high risk). This is truly a proactive application of RCA because we are analyzing why risks are so high, instead of responding to the consequences (reacting) of risks that have already materialized.
OPPORTUNITY ANALYSIS (OA). Most are not familiar with this tool, but it will be your greatest friend when attempting to communicate with your finance people. The OA tool essentially makes the business case, complete with potential ROI, for applying RCA to chronic failures that do not rise to the level of hitting defined triggers. These chronic failures are the ones that happen every shift (viewed as ‘a cost of doing business’) and create the need to develop workarounds, because we don’t have time to fix the flawed management system problems. These are the failures that on their individual occurrences seem insignificant, but when calculated over a year’s time, they are eating our lunch!
The basic calculation for conducting an OA is:
Total Annual Loss = Frequency/Year x Impact/Occurrence, where Impact/Occurrence = Labor $ + Material $ + Lost Profit Opportunity $ (downtime)
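The sketch below applies this calculation to a single hypothetical chronic failure; the frequency and dollar figures are assumptions chosen only to show how a ‘minor’ recurring event annualizes into a large number.

```python
# Hedged sketch of the Opportunity Analysis (OA) calculation above;
# the chronic-failure data is invented for illustration.
def annual_loss(freq_per_year, labor, material, lost_profit):
    """Frequency/Year x Impact/Occurrence (Labor$ + Material$ + Lost Profit$)."""
    return freq_per_year * (labor + material + lost_profit)

# A 'minor' jam cleared roughly every shift (~1,000 occurrences/year),
# costing $50 labor, $20 material and $400 lost profit each time:
loss = annual_loss(1000, labor=50, material=20, lost_profit=400)
# Individually insignificant, but large when annualized.
```

Numbers like this, framed as potential ROI, are what make the business case to finance for chasing chronic failures that never trip a formal trigger.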
In this section of our RCA assessment, we would want to get input from our analysts on the following:
While there are only two rows in this section, they are important because they make the distinction between proaction and reaction, and capture how analysts feel their organization applies RCA.
3. Preserving Event Data – Evidence Collection
In terms of the most critical steps to any investigation, evidence collection is the most important to me. Yet, it is the most undervalued as well. This is because most RCAs in our working environments are time-pressured. When anyone is time-pressured to do anything, they will likely take a shortcut. In the RCA world, the most time-consuming task in a proper RCA is preserving and collecting evidence. Therefore, under time-pressured conditions, proper collection of evidence is the first thing sacrificed. This will definitely adversely affect the integrity of any RCA.
This is important because we want to know the reality of whether or not the RCA analyst feels they are provided adequate time and flexibility to collect the evidence they need to conduct an effective RCA.
4. Organizing an RCA Team
This section relates to our ability to put together a proper RCA team and ensure the team is focused on the task at hand. All of us who have put such teams together know it is like herding cats. Oftentimes meeting time is wasted because all members do not show up (or show up late), or they show up unprepared. Team organization and dynamics are critical to the overall success of an RCA.
In this section we want to see if our facilitators are unbiased and not put there for political reasons. We want to ensure our facilitators do not have anything to lose or gain by the outcome of the analysis. We need to ensure diversity of team members so that many perspectives are represented. Our teams need to be manageable and in smaller groups. Larger teams tend to be political and have members on them for oversight purposes (sometimes to ensure certain conclusions are reached).
We also want to make sure our teams are focused and know their purposes from the beginning. Such feedback from analysts will reveal whether they feel they are just going through the motions or whether they feel they are being productive.
5. Analyzing the Event (Undesirable Outcome)
This section deals with the actual analysis of the evidence collected earlier. This is the part that deals with the graphical reconstruction of the event. Our RCA methodology capability comes into focus here.
When we talk about ‘latency’ here, we are referring to understanding deficiencies in our management systems. In any investigation we will come across people who made poor decisions (errors of omission and/or commission). We are not as interested in ‘who’ made a poor decision (unless it is known sabotage, which is very rare), but ‘why’ they felt it was the right decision at the time. This drills down into people’s minds and searches for the rationale behind the decision.
When we explore to these depths, we will usually find a person relied on data from inadequate management systems and/or poor practices that had evolved over time (getting used to short cuts). Leadership needs to understand the concept of latency, as it is key to preventing recurrence.
Our RCA approach needs to have breadth and depth when reconstructing a failure. Reconstructions using cause-and-effect expressions of logic are the most effective. Brainstorming and ‘pick list’ RCA approaches do not allow the sequential stringing of logic to demonstrate how a failure progressed (especially from multiple paths at the same time – non-linear).
When coming up with hypotheses to reconstruct our failure, we must use sound evidence to prove or disprove them. Hearsay is NOT a valid form of evidence. When evaluating evidence, using a weighting scale (e.g., 0 to 5) is a good way to validate it. A ‘5’ means that, with the evidence in hand, the hypothesis is absolutely true. Conversely, a ‘0’ means that, with the evidence in hand, the hypothesis is absolutely not true. In between are shades of gray resulting from a lack of conclusive evidence.
Productive RCAs will never stop just at a physical root (i.e. – equipment failure) or at a human root (a decision error for which we simply discipline someone). True RCA seeks to get past the decision-maker and explain the human reasoning behind the decision (latent roots). The most effective RCAs will find their roots in flaws of management systems, human factors and human performance related issues.
Here is a completed video case study if you’re interested.
6. Communicating Findings and Recommendations
At this point our RCA is completed and now we have to develop, sell and implement our solutions. Remember, RCA is a ‘system’ and not a task. This is yet another critical link in the RCA chain, because if we can’t sell the need for our recommendations, all the investigative and analytical work we did was a waste of time (plus we would be less driven to do a great job next time).
So our tasks to evaluate in this section are:
As analysts, we have to ask ourselves ‘What is our definition of success?’ for our analysis. Compliance should NOT be the definition of success for an RCA. In order for an RCA to be successful, there has to be some type of bottom-line improvement. Something has to get better as a result of your RCA; what is that? Simply clicking a checkbox indicating your RCA is complete is not a measure of success. That just means the determination of causes may be complete, but we still have nothing to show for it on the bottom line.
Most RCAs tend to drop off a cliff at this point, because there is a lack of accountability for the recommendations. Each recommendation should have a person assigned to complete it, with a due date. Each recommendation should also have a cost/benefit calculation attached to it, to measure ROI. This will greatly aid in selling the recommendation to finance people.
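The accountability fields described above can be captured in a simple record per recommendation. This is only a sketch of the idea; the field names, the recommendation, the owner and the dollar amounts are all hypothetical, and the ROI formula shown is a simple first-year version, not a prescribed standard.

```python
# Hypothetical sketch: one record per RCA recommendation, carrying the
# owner, due date and cost/benefit data the article calls for.
# All names and figures below are invented for illustration.
from dataclasses import dataclass

@dataclass
class Recommendation:
    description: str
    owner: str             # person accountable for completion
    due_date: str          # deadline for implementation
    cost: float            # cost to implement the recommendation
    annual_benefit: float  # expected annual savings if implemented

    def roi_percent(self) -> float:
        """Simple first-year ROI: (benefit - cost) / cost, as a percentage."""
        return (self.annual_benefit - self.cost) / self.cost * 100

rec = Recommendation(
    description="Install seal flush plan on Pump P-101",  # hypothetical
    owner="J. Smith",
    due_date="2025-09-30",
    cost=15_000,
    annual_benefit=120_000,
)
# rec.roi_percent() gives the figure to bring to the finance people.
```

Attaching an owner, a due date and an ROI figure to every recommendation is what keeps the RCA from dropping off the cliff at this stage.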
7. Tracking for Bottom-line Results
To complete our loop, with RCA being viewed as a ‘system’, closure means that a measurable, demonstrable benefit has been realized. This means that we have to have tracking mechanisms in place to measure the effectiveness of each recommendation and of the RCA overall.
Rounding out our RCA system, these are some tasks that we should be concerned about when it comes to measuring effectiveness:
As part of our RCA management support systems, Leadership should tell us what their expectations are for the RCA initiative. Oftentimes this is correlated to the corporate dashboards and/or KPIs. We should be able to demonstrate that our RCA’s are narrowing the gaps of such corporate metrics.
Are there systems in place that will oversee if assigned tasks are actually being implemented? Without such oversight, if someone is not doing their task and there is no negative consequence, they likely never will. They have other priorities and this one is low, especially when no one is checking to see if the task was done.
Are there systems in place that will ensure the results of the RCA are shared across the organization so that others can learn from them? One of the greatest benefits of an effective RCA system is the creation of a living and growing knowledge management system. This would be a database of RCA experience, or ‘corporate memory’. This would prevent people from repeating RCAs simply because they did not know one had been done in the past. Imagine the costs of rework when we have to do the same RCAs over and over again.
Are we reporting our RCA ROIs back to our Leadership to justify the existence of our RCA initiative? I can assure you as a CEO myself, if I see such initiatives saving my company millions of dollars/year, I will continue to invest in such initiatives. As an FYI, our documented average ROI for our case study database is over 600% (as published in our books). That will raise the brow of any finance person.
Last but not least, are we reporting our RCA results back to those in the field who provided input to the analyses? If not, we should be, because they will see they were part of something successful and will be motivated to help in the future as well.
Conducting such an assessment on an annual basis will allow us to measure our progress. Such an assessment will identify which sections we are strong in, as well as where we could use improvement. In our weaker areas, we can take corrective actions to shore up the RCA system and help our analysts be the best they can be.
After reading this long article, you may be thinking this is a lot of work and you don’t have time to create such an assessment tool. Well, today is your lucky day, because we have done this for you!
Please feel free to take this assessment on your own. See what your RCA Report card looks like and compare it to national averages we have been collecting for years. Here is a video to show you how to use the assessment tool.
WANT TO KNOW WHAT YOUR RCA REPORT CARD LOOKS LIKE? GIVE IT A SHOT…IT’S FREE!