Manage All Failures or Else with James Reyes-Picknell

We’re excited to have James Reyes-Picknell, the author of Uptime: Strategies for Excellence in Maintenance Management, back with us. He has also written Reliability Centered Maintenance – Re-engineered and Paying Your Way, which is his latest book. James also trains and consults in maintenance, reliability, and asset management across different sectors.

In this episode, we will be discussing:

What is a failure
Why do you need to manage failures
How do you forecast failures not yet experienced
Which strategies are available for managing failure states

…and so much more!

What is a failure?

One common explanation is that your equipment is trashed. However, a failure is the loss of any function you want your physical assets to perform. Every asset performs multiple functions, and each function has its own standard of performance. If an asset’s performance drops below that standard, it’s in a failed state even though it might still be running.
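As a rough illustration of that definition, the sketch below checks each function of an asset against its performance standard. The pump, the function names, and every number are hypothetical, and the check only handles "at least this much" standards; it is a sketch, not a method described in the episode.

```python
from dataclasses import dataclass


@dataclass
class FunctionStandard:
    """One function of an asset and the minimum performance required of it."""
    name: str
    required: float  # minimum acceptable performance (hypothetical units)


def failed_functions(standards, measured):
    """Return the names of functions whose measured performance is below standard.

    A function with no measurement is treated as delivering 0.0 (an assumption).
    """
    return [s.name for s in standards if measured.get(s.name, 0.0) < s.required]


# Hypothetical pump: still running, but delivering less flow than required.
pump_standards = [
    FunctionStandard("flow_l_per_min", 800.0),
    FunctionStandard("discharge_pressure_bar", 4.0),
]
pump_measured = {"flow_l_per_min": 650.0, "discharge_pressure_bar": 4.2}

print(failed_functions(pump_standards, pump_measured))
```

Here the pump is in a failed state for its flow function even though it is running and its pressure function is fine.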

What is managing failures?

Managing doesn’t mean you’ll prevent all failures from occurring. You can’t. What we manage is the consequences of the failure. In some cases you can prevent failures, namely on assets that fail with age, usage, throughput, or wear. Before the asset gets to that state, you can change out the necessary parts to bring it back to its original performance.

In the majority of cases, failures occur randomly, which makes preventing them tricky. You don’t know what will happen or when. Failures are a natural thing that we have to accept.

Prevention is one way of managing the failure. Predictive maintenance and condition monitoring, by contrast, are used for both random and age- or usage-related failures, to give a sign that the failure has started and is progressively getting worse. So, you’re managing how you act on the failure, as well as the severity of the consequences.

Can you only manage past failures, or potential failures as well?

You can do both. If it’s happened in the past, then you’re sure it could happen again. That brings about the necessity of forecasting based on past events. You can also forecast what could happen in the future by making reasonable assumptions about your operating context and the stressors put on the equipment. That helps you determine how it might fail even if you’ve never experienced such failures. To do this, you’d look at what happens to similar equipment.

Why do you need to manage failures?

When you forecast a failure, you’re unaware of what the consequences might be unless it’s already occurred. When identifying failures, you acknowledge what could happen. That helps you decide whether you should do something about it. In other cases, looking at the potential maintenance approach will help you decide if a proactive maintenance strategy is worth doing.

You have to look at all failures and understand them before making any decisions. That’s not to say something will get done. But decisions have to be made based on the costs versus benefits, as well as other factors.
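A minimal sketch of the cost side of that decision is shown below. It compares the expected yearly cost of running to failure with the yearly cost of a proactive task plus the failures that still get through. The function name, parameters, and figures are all hypothetical, and the comparison deliberately ignores the other factors mentioned above, such as safety or environmental consequences, which money alone does not capture.

```python
def proactive_task_worth_doing(task_cost_per_year,
                               failures_per_year_without_task,
                               failures_per_year_with_task,
                               cost_per_failure):
    """Return True if the proactive task lowers the expected annual cost.

    Purely economic comparison under assumed inputs:
      run to failure = failure rate without the task * cost per failure
      proactive      = task cost + failure rate with the task * cost per failure
    """
    run_to_failure_cost = failures_per_year_without_task * cost_per_failure
    proactive_cost = (task_cost_per_year
                      + failures_per_year_with_task * cost_per_failure)
    return proactive_cost < run_to_failure_cost


# Hypothetical expensive failure: the task pays for itself.
print(proactive_task_worth_doing(5_000, 0.5, 0.1, 40_000))

# Hypothetical cheap failure: the task costs more than it saves.
print(proactive_task_worth_doing(5_000, 0.5, 0.1, 2_000))
```

In the first case the task costs less than the failures it avoids; in the second it does not, so nothing gets done about that failure on cost grounds alone.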

What risks are you exposed to if you don’t manage failures?

The default maintenance strategy is always ‘Run to failure’. So, if you don’t manage the failures and the consequences they lead to, you’re sure to suffer those consequences. It’s important to know that the risks you’re exposed to go beyond physically losing an asset. They could include:

Business losses
Safety impacts
Environmental impacts
Fines that may get levied
Loss of life

How do you forecast failures not yet experienced?

It takes a lot of common sense as well as on-the-job experience. Someone who’s unfamiliar with the equipment and how it could potentially fail will struggle with forecasting failures.

Start by looking at your maintenance history for past failures. You can then check your PM program if you have one since it’s probably already avoiding some failures. These may or may not be dealt with appropriately, so frequently reviewing them helps.

Then look at the things that could potentially happen even though you haven’t seen them. Looking at similar equipment failures could also help you forecast potential failures of your own assets. You can even base it on failures of other equipment on site from the same manufacturer.

Where to find resources to manage these

Resources are the people and funds needed to redesign equipment or to put online systems, redundancy, or design changes in place.

Your workforce is one resource.
Other resources you might need are predictive technologies. It boils down to making intelligent decisions about how to use the available money.
In the design phase, you need to decide whether you need a return on the investment or a return on the asset. Return on investment tends to lead to cost-cutting decisions since you’re initially justifying the project. Meanwhile, return on asset leads to better decisions around costs and revenues.

Which strategies are available for managing failure states?

Design changes are quite obvious. Preventive changes are time-, usage-, or throughput-based interventions done at set intervals, where you restore the condition of the asset or replace it. This tends to be expensive and gets done early in an asset’s life, before it gets to its mean time between failures (the failure ages follow a normal distribution).
Predictive maintenance, condition monitoring, condition-based maintenance, or on-condition maintenance looks at the performance of the equipment against its performance parameters, or at the condition of the various components of the asset. It does that by looking at the signals components generate when they start to fail. You see the early signs of failure in your system, which allows you to act on those warnings.
There’s testing, or failure-finding tests, which deal with backup, safety, or alarm and shutdown devices that get tested periodically to see if they’re working. To avoid having a device not working when you need it, you need short testing intervals.
There’s also running to failure, which applies when the consequences of failure are minimal.
Redesign changes. These are one-time changes that eliminate the causes of failure related to human error or to organizational issues that result in human errors. Procedural, process, and training changes contribute to eliminating the causes of some failures. Rarely is human error the result of an individual. A lot of factors come into play. The better we manage ourselves, the better we manage our assets.
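The strategies above can be sketched as a simple lookup on the characteristics of a failure mode. The field names and, in particular, the order of the checks are illustrative assumptions, not a decision logic given in the episode; a real review would also weigh the costs and consequences discussed earlier.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """Hypothetical description of one failure mode."""
    name: str
    age_related: bool           # fails with age, usage, throughput, or wear
    gives_warning: bool         # generates a detectable signal as it starts to fail
    protective_device: bool     # backup, safety, or alarm/shutdown device
    minimal_consequences: bool  # failure has little impact


def suggest_strategy(fm):
    """Map a failure mode to one of the strategies listed above.

    The order of the checks is an assumption made for this sketch.
    """
    if fm.minimal_consequences:
        return "run to failure"
    if fm.protective_device:
        return "failure-finding test"
    if fm.gives_warning:
        return "predictive / condition-based maintenance"
    if fm.age_related:
        return "preventive restoration or replacement"
    return "one-time change (design, procedure, process, or training)"


# Hypothetical failure modes.
bearing = FailureMode("bearing wear", True, True, False, False)
relief_valve = FailureMode("relief valve stuck shut", False, False, True, False)

print(suggest_strategy(bearing))
print(suggest_strategy(relief_valve))
```

A failure mode that is random, gives no warning, and has significant consequences falls through to the last branch, which is where the one-time changes come in.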

What translates to success with managing failures?

Have a proactive mindset that keeps reliability in focus. With that in place, you’ll start asking what you can do to:

Improve reliability
Reduce downtime
Eliminate failures

You need to focus on the end result, not the activities. The activities are simply the methods and tools that get you there. What’s missing in a lot of companies today is a focus on reliability. You’ll hear talk of the maintenance that can be done rather than the reliability that can be achieved. Remember, what matters are the consequences, not the failures.

240 – Manage All Failures or Else with James Reyes-Picknell, by James Kovacevic

Rooted In Reliability podcast is a proud member of Reliability.fm network. We encourage you to please rate and review this podcast on iTunes and Stitcher. It ensures the podcast stays relevant and is easy to find by like-minded professionals. It is only with your ratings and reviews that the Rooted In Reliability podcast can continue to grow. Thank you for providing the small but critical support for the Rooted In Reliability podcast!
