Uptime Insights - 8 - Asset Reliability

Last Verified September 27, 2021

You can wait for something to break, then fix it, or you can be proactive and manage the failure before it causes you problems. Being proactive is all about managing failures and their consequences before they occur. The failure itself, in some cases, is unavoidable, but how you manage consequences is entirely within your control.

You can reduce or eliminate the consequences of failure by forecasting what is likely to happen and deciding in advance what to do about it. Major business impacts are the consequences of risks, and those are manageable.

Reliable operations are no accident. All plants and equipment are designed to be reliable, but not all achieve it. Those that do are managed proactively.

The advantage is that reliable plants are much less expensive to maintain. They run longer and more predictably, produce more, are safer, and are less likely to cause environmental or other compliance violations. Reliable plants are good for business!

Achieving that requires the right sort of effort focused on reliability and a foundation of good practices in the “Essentials” area of the Uptime Pyramid of Excellence (i.e., work management, basic care, materials management, performance management and use of technology). Let’s assume those are in place – not perfect, they don’t need to be, but not a chaotic mess either.

Reliability Centered Maintenance (RCM) is the most proven proactive approach for developing maintenance programs from scratch. It is a logical process that asks seven seemingly easy-to-answer questions. It uses our knowledge of how things fail and how those failures matter in your operation, and it walks us through a decision process to arrive at sound and defensible failure management policies. RCM avoids or minimizes the consequences of failures. One of the great strengths of RCM is that it does not require failures to have occurred in order to generate data for analysis: it anticipates the most likely failure modes and deals with them before you suffer the consequences.

RCM results in a safe minimum amount of appropriate proactive maintenance. It balances cost and risk vs. reliability and is tailored specifically to your operating environment. The tendency to over- or under-maintain, often a result of using other methods or following the manufacturer recommendations, is avoided. RCM should be the cornerstone of your reliability program – at the very least, for your critical assets.

The keys to success in RCM are the careful application of the process itself and follow-up by implementing the results in your maintenance program. Many otherwise well-run RCM programs fail because the outputs of the analysis are not put into practice in the operational environment. The follow-up is critical. Once the maintenance program is in place, optimizing it is a continuous effort. You can use Preventive Maintenance Review/Optimization (PMR/O) and root cause analysis methods for refinement.

Preventive Maintenance Review/Optimization (PMR/O) is a method based on RCM logic that is applied to existing maintenance programs in an attempt to optimize them. PMR/O arose out of the need to improve the performance of existing maintenance programs that failed to meet desired performance expectations. RCM logic is used in analyzing the various maintenance activities of the existing program in order to eliminate or modify them. It attempts to identify failure modes that may have been missed by the original maintenance program but it is not as thorough as RCM.

Root Cause Failure Analysis (RCFA) is entirely reactive to failures that have already occurred. RCFA is a method of performing a sort of “post mortem” to determine what caused any particular failure. The intent is to eliminate the “root cause”, that being the identifiable cause that you can manage in some practical way. It generates excellent results, but because it is done only after a failure has occurred, you don’t want to rely on it for developing a whole new maintenance program.

Decision optimization techniques and tools help maintainers to make fact-based decisions or to improve on decisions already made. RCM can be used before the asset is put into service (as it is in the aircraft and nuclear industries). Decisions about task frequencies and failure modes are then made with some degree of uncertainty, but with an experienced team, results are excellent.

Optimization techniques are used to analyze the in-service data to validate or modify the original decisions. They require failure data that has been accumulated in service, but that data is often flawed. Interpreting that data requires great care.
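One common way in-service data misleads is that many records describe items that are still running, or that were removed before they failed, rather than actual failures. The Python sketch below is only an illustration under stated assumptions: the records are made up, the function names are hypothetical, and a constant failure rate (exponential life) is assumed so that the mean time between failures can be estimated as total operating hours divided by the number of failures. It compares that estimate with a naive average of the failed items only.

```python
# Illustrative sketch only: made-up records, constant failure rate assumed.
# Each record is (operating_hours, failed). failed=False means the item was
# still running, or was removed before failure, when the data was collected.
records = [
    (1200.0, True),
    (800.0, True),
    (2000.0, False),
    (1500.0, False),
    (1000.0, True),
]


def naive_mtbf(records):
    """Average life of failed items only; ignores hours on unfailed items."""
    failed_hours = [hours for hours, failed in records if failed]
    if not failed_hours:
        raise ValueError("no failures observed")
    return sum(failed_hours) / len(failed_hours)


def mtbf_estimate(records):
    """Total operating hours (failed or not) divided by number of failures."""
    total_hours = sum(hours for hours, _ in records)
    failures = sum(1 for _, failed in records if failed)
    if failures == 0:
        raise ValueError("no failures observed")
    return total_hours / failures


print(f"naive MTBF:            {naive_mtbf(records):.0f} h")
print(f"MTBF using all hours:  {mtbf_estimate(records):.0f} h")
```

With these made-up numbers the naive figure is less than half of the estimate that counts all operating hours, which would push task frequencies in the wrong direction. Real equipment often does not have a constant failure rate, so treat this as a prompt for care, not as a recommended method.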

Reliability and simulation modeling are techniques that allow us to mathematically model the behavior of our installed systems. They can reveal where we have process bottlenecks, and if we are working to improve on one bottleneck, where the next ones are likely to arise. These models can also show us the effect of various reliability improvements at different points in the systems and help us focus our engineering efforts more effectively.
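As a rough illustration of how such a model exposes bottlenecks, here is a minimal steady-state sketch in Python. It is not a full simulation. The stage names and numbers are hypothetical, and it assumes that each stage's effective capacity is its rated capacity multiplied by its availability, MTBF / (MTBF + MTTR), with enough buffering between stages that the line is limited by the weakest stage.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Stage:
    name: str
    capacity: float  # rated output while running, units per hour (hypothetical)
    mtbf: float      # mean time between failures, hours
    mttr: float      # mean time to repair, hours

    @property
    def availability(self) -> float:
        # Steady-state availability of a repairable item.
        return self.mtbf / (self.mtbf + self.mttr)

    @property
    def effective_capacity(self) -> float:
        return self.capacity * self.availability


def bottleneck(stages):
    """Return the stage with the lowest effective capacity."""
    return min(stages, key=lambda s: s.effective_capacity)


# Hypothetical three-stage line.
line = [
    Stage("crusher", capacity=120.0, mtbf=200.0, mttr=10.0),
    Stage("mill", capacity=110.0, mtbf=90.0, mttr=10.0),
    Stage("dryer", capacity=105.0, mtbf=400.0, mttr=5.0),
]

current = bottleneck(line)
print(f"current bottleneck: {current.name} ({current.effective_capacity:.1f}/h)")

# Suppose a reliability improvement raises the mill's MTBF to 400 hours.
improved = [replace(s, mtbf=400.0) if s.name == "mill" else s for s in line]
nxt = bottleneck(improved)
print(f"next bottleneck:    {nxt.name} ({nxt.effective_capacity:.1f}/h)")
```

In this made-up line the mill constrains output first, and once its reliability improves the constraint moves to the dryer. A real model would also account for buffer sizes, failure distributions and dependencies between stages, which is where simulation tools earn their keep.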

About James Reyes-Picknell
