When an undesirable event occurs (a fancy term for an unexpected failure), a Root Cause Analysis (RCA) is triggered. This usually means the event was severe, as triggers are often set pretty high (e.g., reportable injury or fatality, equipment damage in excess of x-thousands of dollars, production losses in excess of x-thousands of dollars, regulatory violation). Since there is urgency and visibility, how do we decide who will lead the investigation/RCA? Our natural tendency is to pick the technical ‘expert’ in whatever kind of undesirable event occurred. But does that typically produce the most effective outcome for the organization and its employees? This article will focus on which skills are needed most under such conditions, and why.
Laying the Groundwork
In this article we will focus only on the leadership concerns appropriate to conducting an effective RCA under triggered conditions. Under such conditions, there is likely an elevated level of stress because of the severity and magnitude of the event. The event has the attention of the ‘suits’. These suits can represent leadership, lawyers, insurance companies and OEMs. If the suits are involved, then culpability and liability could be at play. That can throw a wrench into a logical, well-reasoned and evidence-based analysis, because it opens the door to various biases that could attempt to alter the facts.
Defining Key Terms
There are no universally accepted terms I am aware of for the titles associated with leading RCA teams. Terms I often hear in the field are RCA Team Leader, Lead Analyst, Facilitator, Investigator, Principal Analyst and Subject Matter Expert. The best I can do is express how we define such terms and roles at Reliability Center, Inc. (RCI), and let readers mold that framework into their realities. It may fit, it may not…but you will be the judge.
I will group all these terms into just two (2):
RCA Principal Analyst (PA) – The PA is the one who holds the team members accountable for action items, assigns the data collection tasks and keeps management informed of the RCA progress and barriers. The PA has the connections to leadership and company personnel in general. Such a role is typically internal. We will expand on these responsibilities in a minute.
Subject Matter Expert (SME) – The SME has deep domain knowledge related to the technical issues of the failure being analyzed. We will expand on these responsibilities soon, as well.
What’s the Difference Between an RCA Principal Analyst and a Subject Matter Expert?
What I’m about to share with you are my opinions, based on having been in this business 36+ years now. I don’t expect everyone to share the same opinions, but I do look forward to learning from differing perspectives.
Over this span of time, while the technologies have changed dramatically, the human condition has not been as progressive. We still stress out under emergent conditions, we still have our biases, and we still worry about our job security when it comes to being honest, candid, and factual in our RCA’s.
The technologies involved with RCA’s today typically help to efficiently document and communicate analyses and validate hypotheses (e.g., predictive technologies, lab results and the like). However, while they collect and aggregate content, the technologies themselves often do not judge the appropriateness and accuracy of that content. That judgment is the subjective aspect of leading an RCA team. With this responsibility comes the potential for letting our own biases seep in and cloud our judgment.
I find that oftentimes using the word ‘investigation’ or ‘investigator’ puts out a negative vibe with a legal tone. The language alone automatically puts people on the defensive. I use these terms sparingly and only when needed (such as in the next section).
Side Bar: RCA’s vs. Legal Investigations
Even though the steps of any ‘investigative’ endeavor are very similar, our legal system has a different objective for its ‘investigations’ than we in industry have for our ‘analyses’. In our legal system there is always a plaintiff and a defendant. There is also a resulting winner and loser. While seemingly obvious, this can have a polarizing effect on evidence. Academia calls this ‘confirmation bias’: when evidence comes in that supports our legal position, we welcome it with open arms. However, when evidence is introduced counter to our legal position, we seek to discredit it…hence the polarization of evidence.
In manufacturing, when conducting RCA’s, we are not often held to a legal standard for evidence (e.g., preponderance of evidence, beyond a reasonable doubt). However, that can change if the suits are involved and the RCA is being conducted under the direction of a lawyer, legal department, or law firm. Comparing the final RCA report the team completes with what that same report may look like after legal review is a whole other article. We are PA’s, not lawyers. So, what they do with the RCA we provided is usually beyond our control. They have their objectives and purposes for the company, and we have ours.
To get back on track (after my digression), let’s assume we are undertaking a normal RCA and are not conducting it at the direction of a lawyer.
In the marketplace, the term RCA is often used as a noun, with the different brands of RCA serving as the adjectives describing it. The differing RCA brands must differentiate themselves through their marketing. But at the core, the investigative steps are essentially the same. As an example, our RCA brand is called PROACT®, and Figure 1 shows our PROACT RCA Process Flow Diagram (PFD).
This should be fairly self-explanatory, but the fundamental steps of the investigation/analysis are:
1. Preserving evidence related to the Event being analyzed (just like a crime show on TV)
2. Ordering the analysis team to weed out bias as best they can (or at least minimize it)
3. Analyzing the Event by using graphical cause-and-effect reconstruction tools (capable of depicting parallel paths of failure converging). This will incorporate the evidence collected along with the insight of the RCA team members
4. Communicating our findings for approval and ensuring the corrective actions are assigned and properly implemented in a timely manner
5. Tracking for bottom-line results, making sure that whatever corrective actions were implemented actually improved something that is measurable
So, for us at RCI, this represents the discipline of our RCA process of choice. Others may use different approaches, but the key to effectiveness is sticking to the discipline of the approach and not taking short-cuts (e.g., slacking on collecting evidence due to time pressures).
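For readers who like to see the discipline written down, here is a minimal sketch in Python of the ‘Analyzing’ step: a cause-and-effect tree in which a hypothesis only counts as verified when evidence other than hearsay backs it up. This is not the PROACT software or its data model. The class names, the `open_items` helper and the pump seal example are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    UNVERIFIED = "unverified"
    TRUE = "verified true"
    NOT_TRUE = "verified not true"


@dataclass
class Evidence:
    description: str
    source: str      # hypothetical labels, e.g. "lab report", "inspection", "hearsay"
    supports: bool   # True if it supports the hypothesis, False if it refutes it


@dataclass
class Hypothesis:
    statement: str
    evidence: list[Evidence] = field(default_factory=list)
    children: list["Hypothesis"] = field(default_factory=list)

    def status(self) -> Status:
        # Hearsay is never accepted as a valid form of evidence.
        valid = [e for e in self.evidence if e.source != "hearsay"]
        if not valid:
            return Status.UNVERIFIED
        if all(e.supports for e in valid):
            return Status.TRUE
        if not any(e.supports for e in valid):
            return Status.NOT_TRUE
        # Conflicting evidence: leave open and keep collecting.
        return Status.UNVERIFIED


def open_items(node: Hypothesis) -> list[str]:
    """List every hypothesis in the tree that still needs valid evidence."""
    items = []
    if node.status() is Status.UNVERIFIED:
        items.append(node.statement)
    for child in node.children:
        items.extend(open_items(child))
    return items


# Hypothetical Event with two parallel paths of possible failure.
event = Hypothesis(
    "Pump seal failed",
    evidence=[Evidence("Failed seal retained and photographed", "inspection", True)],
)
misalign = Hypothesis(
    "Shaft misalignment",
    evidence=[Evidence("Someone heard it was misaligned", "hearsay", True)],
)
lube = Hypothesis(
    "Inadequate lubrication",
    evidence=[Evidence("Oil analysis within specification", "lab report", False)],
)
event.children = [misalign, lube]

print(open_items(event))
```

In this made-up case the only open item is the misalignment hypothesis, because the sole support for it is hearsay. A list like this is the kind of thing a PA could turn into data collection assignments for the SME’s.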
Now that we understand the process to be followed, let’s explore the key responsibilities of PA’s and SME’s.
What are the key responsibilities of a PA?
1. Driving the successful completion of the RCA. The PA must deal with:
a. determination of the Objective (purpose and intent) of the RCA and what defines ‘success’
b. potential obstacles given the technical issues, litigation issues and corporate politics surrounding the Event
c. team dynamics (see detailed listing below)
d. sufficiency of evidence
e. comprehensiveness and depth of analysis
f. appropriateness of corrective actions, approvals, implementations and tracking for effectiveness
g. political correctness/diplomacy
2. Facilitating and not participating
a. Asking the right questions
b. Getting the right answers out of the SME’s on the team
c. Listening (not just hearing)
3. Administration of due process in adhering to the discipline of the analysis, ensuring:
a. there are no short cuts taken that could impact accuracy of findings
b. that hearsay is not permitted as a valid form of evidence
4. Communicating goals and objectives to leadership
a. Keeping leadership apprised of the status of the RCA
b. Having 1:1 conversations with appropriate parties when sensitive evidence is uncovered
c. Ensuring the RCA process is consistent with the objectives and expectations of leadership
5. Being accountable for the entire RCA
a. Acknowledge that YOU OWN IT. This RCA is your baby, and you will defend it as such.
b. Don’t pass the buck to team members when challenged. Have confidence in your team’s work and be ready for challenges…because the facts are on your side (when done effectively).
c. Ensure the RCA meets the defined criteria for ‘success’ as outlined by leadership, prior to commissioning of the RCA
Given the above responsibilities, what are the challenges the PA often must face with their teams?
1. Bypassing Steps: There will be a natural inclination for an RCA team to bypass steps and go right to the solution (bypassing the ‘analyze’ phase). This is natural: when time pressure is present, we tend to take short cuts. As depicted in Figure 2 (with some humor), it is the PA’s job to prevent this from happening.
2. Floundering: This usually occurs when there is no disciplined, methodical RCA approach being used. Team members don’t see the end game, so they flounder and wonder ‘when will this be over’.
3. Accepting Opinions as Facts: As stated above, some team members will feel that because they said it, it is true. It is the PA’s role to validate such opinions.
4. Dominating Team Members: We know who these people are, the ones who dominate meetings with their aggressive personalities. The PA should allow them equal time to be heard, but not allow them to deprive others of the same respect.
5. Reluctant Team Members: There are generally two reasons team members tend to be reluctant to participate: 1) they fear persecution for being honest and/or 2) they worry that by being honest, they will get someone else in trouble. As a PA, when we see this situation, we should either 1) take that person aside in confidence for a 1:1 after the meeting or 2) switch our meeting format to a nominal group technique, where we go around the table and ask each person to offer their thoughts.
6. Digression: This usually happens about 10 minutes after floundering! When meetings lack structure and an end game, people tend to see it as another unproductive meeting. At this point they are looking at the clock or their watches…observe the body language of your team for these signals.
7. Going Off on Tangents: Team members will sometimes need to vent about a certain issue before we can get to the ‘meat’ of the topic. As PA’s we should let them vent, but only for a minute or two. Use the agenda as your prop to keep on time but give them a minute or two to state their case. They will feel better when they have been listened to. Ironically, I often find this ‘baggage’ that people want to vent about, occurred years if not decades ago. After someone vents and is feeling relieved, politely ask them ‘how long ago did that happen?’. If it was a long time ago, no more needs to be said, as they will all know why you asked the question. Much could have changed since the injustice was served.
8. Arguing Among Team Members: Constructive debate is expected; arguing is not permitted as it tends to be inter-personal and counter-productive to the team’s charter.
As we can tell, the success or failure of an ‘RCA’ will be dependent on the skill level of the PA. The PA plays a key role in administering the RCA process, dealing with team dynamics, and playing the politics involved with getting things done. Diplomacy is a key trait of an effective PA because they should expect they will uncover sensitive issues that will have to be dealt with delicately.
For example, say an operator or mechanic followed a procedure to the letter, and it resulted in harm or an explosion. Oftentimes this begins the hunt for ‘whodunnit’. What if our thorough RCA determined that the procedure followed was obsolete? Better yet, what if the procedure was written by the PA’s boss’s boss (if I got that grammatically correct😊)? This is where those diplomacy skills come into play.
We must always keep front and center in our minds that our role as a PA is to try to eliminate the risk of recurrence. I have always considered that my primary role as PA is to make leadership look good! It is not about my taking credit for the success, but about the leaders getting credit for supporting the RCA team and the overall RCA effort. This is often a means of feeding egos, but if that’s what it takes to prevent recurrence, so be it. I have found in my career that those I made look good in the past remember it, and know who was really behind the success. Use that hidden ‘beholdenness’ to your advantage down the road, when that fast-track leader is going up the ladder and bringing you along.
As we all know, a good Reliability Engineer’s greatest trait is their humility. Why else would one get into an industry where success is defined by what DIDN’T happen! How many of us often get letters of appreciation for preventing something from occurring?
What are the Responsibilities of the SME’s?
As you can pretty much tell by now, the PA is running the ‘RCA’ show! In terms of administering the discipline of the RCA process, that falls in the lap of the PA. However, the PA is dependent on the qualifications of their SME’s. This is because the SME’s are the team members and critical to the success of the RCA. So, what are the primary responsibilities of the SME’s?
1. SME’s are knowledgeable on the technical issues of the problem. They have deep domain expertise related to the nature of the failure being analyzed.
2. SME’s provide hypotheses as the PA facilitates the reconstruction of the Event. The PA’s role may be to ask ‘How Could?’ and ‘Why?’ questions, but the SME’s role is to provide valid answers to those questions.
3. SME’s assist in identifying, collecting and analyzing evidence to prove or disprove the hypotheses presented.
4. SME’s ensure the comprehensiveness, depth and accuracy of the evidence-based RCA.
5. SME’s understand that hearsay is NOT a valid form of evidence.
What is NOT a responsibility of the typical technical SME? They are not expected to:
1. have deep domain knowledge of the RCA process/discipline
2. have the ability to provide leadership and coordination for the RCA effort
3. have diplomacy skills to deal with the politics associated with some RCAs
I have never been to a facility in my career that didn’t have the internal (and/or corporate) resources to solve its own problems! A good PA’s role is to be skilled at asking the right questions and getting the information/facts from the SME’s in a disciplined manner. A quote from Eli Goldratt (author of The Goal) applies here…“An expert is not someone that gives you the answer, it’s someone that asks you the right questions!” That is the role of the PA.
Can an SME Fulfill the Dual Role as PA? If So, Should They be Internal or External?
The quick answer is Yes, but there is an ‘it depends’ consideration as well!
We must understand the circumstances. Severity of the Event being analyzed will definitely be a factor in decisions made around who will be the PA, as well as the team members.
When we have high visibility failures, there will likely be corporate involvement. Corporate may take charge of the RCA oversight and demand investigative independence (an unbiased analysis). Under such conditions, this would factor into any decision about whether the roles are filled by internal or external candidates.
I would like to draw a distinction between an internal SME and an outsourced SME. High visibility failures may very well raise concerns about liability (legal, insurance, warranty or otherwise). In these cases, ‘independence’ will likely be a critical factor when selecting a PA. An internal PA could be viewed as biased towards the company’s position. Hiring an independent third-party PA helps demonstrate neutrality. In these cases, the PA could also fulfill the role of an independent SME.
In the end, the decision to assign the SME the role of PA should be based on the need for a strong, unbiased RCA that results in effective recommendations addressing the undesired event.
As a rule of thumb where bias is concerned, the PA should have nothing to lose or gain by the outcome of the analysis. The facts (evidence) will be what they will be…whether we like it or not.
While what I have expressed would be ‘ideal’, we all know that oftentimes we must play the hand we are dealt. In those cases, we may very well end up with a PA who is also an internal SME. If that is the case, we simply must be very cognizant of our potential biases and contemplate how our critics will try to discredit our work product.
Conversely, we may have a capable PA and NOT have an internal SME on a specific topic. So, we may have to outsource that talent for the purposes of our RCA. Like I said earlier, the PA must be resourceful and know when certain technologies and talents are needed…and where to find them.
As always, I’m very interested to hear from those in the field that face these situations on a daily basis. I’d like to hear stories about what worked and what didn’t, so we can all learn from others’ successes and failures!
Collectively, we have unity in purpose when it comes to defeating this paradigm:
‘We NEVER seem to have the time and budget to do things right, but we ALWAYS seem to have the time and budget to do them again!’
About the Author
Robert (Bob) J. Latino is currently CEO of Reliability Center, Inc. (RCI). RCI is a 48-year-old Reliability Engineering Consulting firm specializing in Equipment, Process and Human Reliability. Bob has been facilitating RCA’s with his clientele around the world for over 36 years and has taught over 10,000 students in the PROACT® RCA methodology.
Recent books by Bob and his brothers: