MTBF as an Indicator
Abstract
Chris and Fred discuss this thing called the MTBF … and how it (perhaps!) can be used in some reliability engineering applications … sometimes!
Key Points
Join Chris and Fred as they discuss whether (and how) the MTBF can be useful for reliability applications. But haven’t we been really, really, really adamant that it is a bad thing? And to be clear … the MTBF stands for Mean Time Between Failures. Wouldn’t we want to measure the MTBF to see if things are failing less often?
Topics include:
- Can the MTBF be helpful? Perhaps. But only as an INDICATOR of system health. That is, if you improve the reliability, availability and maintainability performance of a product or system, you would expect the MTBF to increase. So it becomes an indicator that can help validate whether something has actually improved.
- But the MTBF is not helpful as a PARAMETER. What does this mean? If you want to improve availability by optimizing servicing intervals, then the MTBF will not help you in any way. You instead need to understand the Rate of Occurrence of Failures after Servicing (ROFAS) of your system, model the maintenance-induced failure rate of your servicing activities and so on to find the right servicing interval. And once you do, you will see an improvement in overall system MTBF without the MTBF of any components being used to get this improvement. In fact … trying to use the MTBF in order to improve the MTBF … usually gets in the way of improvement.
- … and the MTBF comes with BAGGAGE! The MTBF is the most over-used, ridiculously simplified metric in the world of reliability engineering. Many people believe that the reliability IS the MTBF. It isn’t. Trying to do reliability and availability improvement using the MTBFs of components and system elements NEVER works. Because it is over-simplified, it hides the information you need to make the right decisions. And so it is very hard for an organization to use the MTBF as an indicator and NOT have its toxic over-simplification seep through the rest of its reliability, availability and maintainability decision-making. Which is why we are VERY CONCERNED WHEN ANY ORGANIZATION USES IT!
- What’s the answer? Start with understanding what decision you are trying to make. What you are trying to improve. What the value of that improvement is. And then truly understand HOW your system will likely fail (the ‘vital few’). Study and understand those ‘vital few’ and what makes them happen. Before you try and characterize them with a number, try and remove the ‘root causes’ that allow them to happen. Quickly. And once you know your ‘vital few’ and have exhausted all the ‘fast, simple and cheap’ corrective actions that will improve reliability, THEN characterize the likelihood of failure over time (which needs more than the MTBF). And keep going!
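To make the point that reliability is not the MTBF a little more concrete, here is a minimal sketch. It is not something Chris and Fred walk through in the episode; it assumes Weibull-distributed times to failure, treats the mean of that distribution as the "MTBF", and uses made-up numbers (a 1,000-hour MTBF and a 100-hour mission). The function names are hypothetical and for illustration only.

```python
import math

# Illustrative assumptions (not from the episode): Weibull times to failure,
# a 1,000-hour mean life treated as the "MTBF", and a 100-hour mission.
MTBF_HOURS = 1000.0
MISSION_HOURS = 100.0

# Three hypothetical failure behaviours, described by the Weibull shape (beta).
SHAPES = {
    "early-life failures (beta = 0.5)": 0.5,
    "constant failure rate (beta = 1.0)": 1.0,
    "wear-out failures (beta = 3.0)": 3.0,
}


def scale_for_mean(mean_life, shape):
    """Weibull scale (eta) that gives the requested mean life."""
    return mean_life / math.gamma(1.0 + 1.0 / shape)


def weibull_mean(shape, scale):
    """Mean life of a Weibull distribution: eta * Gamma(1 + 1/beta)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def weibull_reliability(t, shape, scale):
    """Probability of surviving to time t: exp(-(t/eta)^beta)."""
    return math.exp(-((t / scale) ** shape))


results = {}
for label, shape in SHAPES.items():
    scale = scale_for_mean(MTBF_HOURS, shape)
    mean_life = weibull_mean(shape, scale)
    reliability = weibull_reliability(MISSION_HOURS, shape, scale)
    results[label] = (mean_life, reliability)
    print(f"{label}: MTBF = {mean_life:.0f} h, "
          f"R({MISSION_HOURS:.0f} h) = {reliability:.4f}")
```

Under these assumptions all three cases report the same MTBF, yet the chance of surviving the 100-hour mission ranges from roughly 64% (early-life failures) through about 90% (constant failure rate) to better than 99.9% (wear-out). The single number cannot tell these cases apart, which is why characterizing the likelihood of failure over time needs more than the MTBF.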
Enjoy an episode of Speaking of Reliability, where you can join friends as they discuss reliability topics ranging from design for reliability techniques to field data analysis approaches.