When we look at Root Cause Analysis (RCA), a tool that is both widely used and widely misunderstood, we should reflect on how it is interpreted in our own environments. Think about it: when is RCA typically requested and applied? Based on my experience, it is typically requested and applied when:
- Someone is injured
- There is catastrophic damage
- There is an environmental incident
- There is a “near miss” with high severity potential
- There is public scrutiny over an issue that makes the news
- There is a quality issue that a valued customer is complaining about
What do all these issues have in common? They are high-visibility events that require immediate action at the request of authority. Usually in these circumstances, resources, time, and money are not an issue because of the level of management that is requesting the analyses. Typical scenarios tend to focus on the urgent, not the important.
To illustrate with an analysis tool we call Opportunity Analysis (OA), consider a one-time fire at a facility that results in $500,000 worth of damage. Costs such as these are unanticipated and not part of the budget, yet we almost always find the cash to recover. The accountants typically will use creative techniques to soften the blow, such as amortizing the cost of the event over a 20-year period. The resulting impact would be viewed as $25,000/yr ($500,000 / 20 years), which is much more acceptable and digestible. Using the OA format, such a line item might look like Figure 1.
Now consider a chronic event such as conveyor belts that trip in a mining operation. Taken individually, each trip may take 15 minutes to reset. This 15-min period requires the attention of a person, which at a typical standard rate ($40/hr with benefits included) results in a labor cost per event of $10 (0.25 hr x $40/hr labor rate).
Because the event simply requires a person to find and reset the tripped conveyor system, generally no additional parts costs are involved. However, the 15-min delay causes a production loss upstream in the processing area, which equates to $5000/hr. Fifteen minutes now is worth $1250/occurrence (0.25 hr x $5000/hr production loss). So, each 15-min occurrence is now worth $1260 ($10 labor + $1250 lost production). Still considered a relatively low impact, right?
Now consider that on this particular conveying system we experience 40 such stoppages a week, or 2080 for the year (40 x 52 weeks). Now we are looking at an annual impact to the bottom line of $2,620,800 ($1260/occurrence x 2080 occurrences).
On an annual basis, the chronic event is approximately 100 times more costly than the fire ($2,620,800/yr versus the amortized $25,000/yr), yet which event gets the most attention: the one-time fire or the continual tripping of a conveyor system? We all know the answer; the fire gets the attention because it is highly visible and requires an urgent response. The chronic event has been accepted as a cost of doing business and is considered part of the job. Herein lies the problem. Chronic events are rarely aggregated on an annual basis. They are typically viewed on their individual impacts.
Even bad actor lists are typically indicative of a short-term perspective, because what is viewed as important today will not be as important next week, when something ‘more’ important crops up.
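The arithmetic behind this comparison can be checked with a short script. This is only a sketch of the calculation described above, not the actual OA spreadsheet; the function name `annual_cost` and its parameter names are assumptions made for illustration, while the dollar figures are the ones used in the text.

```python
def annual_cost(occurrences_per_year, hours_per_occurrence,
                labor_rate, production_loss_rate, parts_cost=0.0):
    """Annual dollar impact of a recurring event (hypothetical helper)."""
    # Cost of one occurrence: labor + lost production for the downtime, plus parts.
    cost_per_occurrence = (
        hours_per_occurrence * (labor_rate + production_loss_rate) + parts_cost
    )
    return occurrences_per_year * cost_per_occurrence


# Chronic event: 40 conveyor trips a week, 15 minutes each, no parts cost.
chronic_annual = annual_cost(
    occurrences_per_year=40 * 52,
    hours_per_occurrence=0.25,
    labor_rate=40,
    production_loss_rate=5000,
)

# Sporadic event: $500,000 fire amortized over 20 years.
fire_annual = 500_000 / 20

ratio = chronic_annual / fire_annual

print(f"Chronic event: ${chronic_annual:,.0f} per year")
print(f"Fire (amortized): ${fire_annual:,.0f} per year")
print(f"Chronic / fire: {ratio:.1f}x")
```

Note that the roughly 100-to-1 ratio depends on comparing the chronic event with the amortized annual cost of the fire; compared with the full $500,000, one year of conveyor trips is still about five times more costly.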
Consider what would happen if we were to apply this OA format to an operation, a process, or a facility. We would seek out these hidden “nuggets” and determine their annual impact in dollars. This would tell us what the “carrot” was (the business case), and whether or not the events were worth conducting a formal RCA on. Experience bears out the Pareto Principle: when such a list is aggregated, 20 percent or less of the events identified account for 80 percent or more of the dollars lost. This is a good technique to provide focus for a disciplined RCA effort.
So, where does the data come from to populate this type of spreadsheet? There are numerous means by which such lists can be developed, but how confident are we in the data? Think about where such information resides today: our SAP system, APM system, CMMS, etc. How many of us really believe that such systems accurately reflect all the field activity, especially when it comes to the recording of every chronic event?
It has been my experience that when a chronic event occurs, from the perception of the person tasked to fix the undesirable event, it takes more time to input the information into the recording system than it does to fix the problem. Usually the information system carries a negative connotation and is deemed too cumbersome, so we just fix the problem and are on our way. After all, that is what we are pressured to do: fix it and get production going again.
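The Pareto ranking described here can be sketched in a few lines. The function name `pareto_focus` and the 80 percent default are assumptions for illustration. In the sample list, only the conveyor and fire figures come from this article; the other three entries are invented placeholders, not real data.

```python
def pareto_focus(annual_losses, threshold=0.80):
    """Return the top-ranked events whose combined annual loss reaches
    `threshold` of the total (hypothetical helper)."""
    total = sum(annual_losses.values())
    # Rank events from the largest annual loss to the smallest.
    ranked = sorted(annual_losses.items(), key=lambda item: item[1], reverse=True)
    selected, running = [], 0
    for name, loss in ranked:
        if running >= threshold * total:
            break
        selected.append((name, loss))
        running += loss
    return selected


# Annual impact in dollars. Placeholder entries are invented for illustration.
events = {
    "Conveyor belt trips": 2_620_800,      # from the example in the text
    "Facility fire (amortized)": 25_000,   # from the example in the text
    "Placeholder event C": 150_000,        # invented
    "Placeholder event D": 60_000,         # invented
    "Placeholder event E": 40_000,         # invented
}

focus = pareto_focus(events)
total_loss = sum(events.values())
for name, loss in focus:
    print(f"{name}: ${loss:,} ({loss / total_loss:.0%} of total)")
```

With these sample numbers, one event out of five accounts for about 90 percent of the annual loss, so it is the obvious first candidate for a formal RCA.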
While we can get some information from such on-line monitoring systems, we must recognize that they are not all inclusive at this time. Only the people closest to the work will truly have the knowledge of the most chronic events. It is in their heads, not on paper!
Simple Chronic Failure Calculators, like the one shown in Figure 3, make it easy for anyone to instantly calculate the annual cost of chronic failures. People in the field simply enter the failures they are experiencing, how often they occur per year, and the average cost per occurrence. This makes a very quick and compelling business case to a finance-minded person to justify doing RCA on chronic failures that are otherwise hidden in plain sight.
Typically, most information systems are labeled and advertised as some type of asset management system. So, failures that affect the asset are typically what is recorded. However, what may not be recorded are events that produce off-spec product where no mechanical failure occurs, time delays as a result of a crane not showing up on time during a shutdown, time delays due to the wrong parts being delivered to the site, or late deliveries to customers. Most systems that generate work orders relate a work order to a repair. What if the chronic failures do not involve a mechanical repair? What if the ‘repair’ would be related to an organizational system flaw associated with a policy, procedure, purchasing habit, incentive system, etc.?
How do such asset management systems handle these events? Where is it recorded that such occurrences are undesirable and how are proactive recommendations from Root Cause Analyses processed in a timely fashion?
If we conclude in our RCA that procedures are obsolete, specifications are incorrect, or that people were not trained properly to perform a task, how are these situations handled in the asset management system? These questions are food for thought when we consider how well our current environment supports the task of RCA.
We can be the greatest root cause analysts on the planet, but if we are working on the wrong events and our environment does not support the proactive activity, then we are likely to become frustrated ourselves and fall into the paradigm that “if management does not care, then why should I?” Once this attitude sets in, complacency with a reactive culture becomes the norm and overall profitability suffers.
What we need to do today is make management aware, through education, that our cultures unknowingly tolerate these chronic events, which typically end up costing 100 times more than the occasional sporadic event. Unfortunately, the sporadic events get all the attention. When our cultures are enlightened, we will begin to enjoy the fruits of our efforts in the form of return on investment (ROI) figures averaging 600%+. Then the believers will come!