
In practice, no system is failure-free. What often matters most is how quickly and effectively the system can be restored, and whether it is available when actually needed.
[Read more…]
Your Reliability Engineering Professional Development Site
Find all articles across all article series listed in reverse chronological order.

by Semion Gengrinovich

The exponential, normal, and log-normal distributions are widely used today due to their versatility and ability to model a broad range of real-world phenomena. The exponential distribution is favored for its simplicity and memoryless property, making it ideal for systems with constant failure rates. The normal distribution is ubiquitous due to its ability to model variables that are the sum of many small factors, such as measurement errors. The log-normal distribution is preferred for modeling variables that are the product of many factors, like times to repair or failure under stress. Together, these distributions provide powerful tools for analyzing and predicting outcomes in various industries, including quality control, reliability engineering, and predictive maintenance.
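The memoryless property mentioned above can be checked numerically. The following is a minimal sketch, assuming a hypothetical constant failure rate of 0.001 failures per hour. The function names are illustrative and do not come from the article.

```python
import math


def exp_survival(t, failure_rate):
    """Probability that an exponentially distributed lifetime exceeds t."""
    return math.exp(-failure_rate * t)


def conditional_survival(extra, age, failure_rate):
    """P(T > age + extra | T > age) for an exponential lifetime."""
    return exp_survival(age + extra, failure_rate) / exp_survival(age, failure_rate)


failure_rate = 0.001  # hypothetical value, failures per hour

# Chance that a brand-new unit survives its next 100 hours.
fresh = exp_survival(100, failure_rate)

# Chance that a unit already 5000 hours old survives its next 100 hours.
aged = conditional_survival(100, 5000, failure_rate)

print(f"new unit: {fresh:.6f}, aged unit: {aged:.6f}")
```

Both probabilities agree up to floating-point rounding, which is what "memoryless" means: under a constant failure rate, accumulated age does not change the odds of surviving the next interval.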
[Read more…]
by Fred Schenkelberg

It is deceptively easy to calculate MTBF given a count of failures and an estimate of operating hours. Just tally up the total hours the various systems operate and divide by the number of failures. Easy.
This simple calculation is the unbiased estimator of theta (MTBF), which is the inverse of the parameter lambda of the exponential distribution. We use theta to represent 1 / lambda.
What could go wrong with such a simple calculation? [Read more…]
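As a minimal sketch of the tally-and-divide calculation described above, the snippet below uses made-up operating hours and a made-up failure count. All numbers and names are hypothetical, chosen only to show the arithmetic.

```python
def mtbf_point_estimate(operating_hours, failure_count):
    """Total operating hours across all systems divided by the number of failures."""
    if failure_count <= 0:
        raise ValueError("at least one failure is needed for this point estimate")
    return sum(operating_hours) / failure_count


# Hypothetical operating hours for four systems and the failures seen among them.
hours = [1200.0, 950.0, 1430.0, 800.0]
failures = 3

theta = mtbf_point_estimate(hours, failures)  # estimate of theta (MTBF), in hours
lam = 1.0 / theta                             # matching estimate of lambda, per hour

print(f"theta = {theta:.1f} hours, lambda = {lam:.6f} per hour")
```

The function refuses a zero failure count because the simple ratio is undefined in that case. Handling zero-failure data requires a different approach, which this sketch does not attempt.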
by Greg Hutchins

Just because I don’t have a college degree doesn’t mean I’m not smart!
Emma Stone – Actress
What’s your value-add to me, your employer? This is what most employers think. Tough thinking. Well, that’s the basis of a free market economy.
Story: My doc’s brother went to Harvard and got a history degree. The world doesn’t employ too many historians. What to do? Later in life, the Harvard grad did a Brand You Pivot and reinvented himself as an elite coder. The problem is he makes as much money as a community-college-trained coder in the same company. What do these people have in common?
[Read more…]
by Nancy Regan

In Unit 9, we explore the wide-ranging benefits that can be achieved when Reliability Centered Maintenance (RCM) is applied correctly with the right people. Beyond developing a proactive maintenance program, RCM is often used to solve specific, chronic issues within organizations. Whether it’s expecting equipment to perform beyond its capabilities, conducting unnecessary or incorrect maintenance, or lacking proper operating and training procedures, these challenges can lead to safety risks, environmental concerns, increased costs, and lost production. However, when RCM is implemented effectively, it helps organizations gain control by clarifying equipment functions, identifying critical failure modes, and formulating targeted solutions. Through RCM, we create an effective proactive maintenance program, reduce unnecessary downtime, meet production goals, ensure safety and environmental integrity, and preserve the expertise of equipment specialists. Join me in this unit to learn how RCM can transform your organization by addressing these challenges and more.
[Read more…]
by Joe Anderson

In asset-intensive industries, workforce management is just as critical as the physical assets themselves. No matter how advanced an organization’s maintenance strategies, asset monitoring systems, or predictive technologies are, the success of asset management depends on the people who operate, maintain, and optimize those assets.
Workforce management in relation to asset management is about ensuring that the right people, with the right skills, are in the right place at the right time to keep assets running efficiently and reliably. A well-managed workforce leads to higher productivity, lower costs, improved safety, and extended asset lifespan.

Net Present Value is the workhorse of capital budgeting. It distills a project’s future cash flows into a single figure expressed in today’s dollars. For all its usefulness, NPV is not a perfect decision tool. Leaders who rely on it without understanding its constraints risk misallocating capital or overlooking strategic opportunities. NPV should inform decisions, not dictate them.
“We looked at 13 alternatives for the new reservoir,” stated the consultant. “And this one came out the best.”
“By best you mean the one with the best Net Present Value,” I retorted. “And it’s really Net Present Cost, with a 50-year analysis period and an 8 percent discount rate. Does any of that bother you?”
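To see why a 50-year period at an 8 percent discount rate deserves a second look, here is a minimal sketch of the standard discounting arithmetic. The cash flows are hypothetical placeholders, not figures from the reservoir study, and the function names are illustrative.

```python
def discount_factor(rate, year):
    """Present value today of one dollar received or spent `year` years from now."""
    return 1.0 / (1.0 + rate) ** year


def net_present_value(rate, cash_flows):
    """Sum of discounted cash flows, where cash_flows[0] occurs today (year 0)."""
    return sum(cf * discount_factor(rate, year) for year, cf in enumerate(cash_flows))


rate = 0.08   # discount rate quoted in the conversation
horizon = 50  # analysis period in years quoted in the conversation

# Hypothetical cost stream: capital cost today, then equal annual costs.
cash_flows = [-1_000_000.0] + [-20_000.0] * horizon

print(f"discount factor in year {horizon}: {discount_factor(rate, horizon):.4f}")
print(f"net present cost: {net_present_value(rate, cash_flows):,.0f}")
```

At 8 percent, a dollar spent in year 50 counts for roughly 2 cents today, so costs late in the analysis period barely move the result. That is one reason the choice of discount rate and horizon can decide which alternative "comes out best".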
[Read more…]
In many plants, the unspoken goal of an investigation is to find a “who” to blame rather than a “why” to fix.
When an incident occurs, the immediate pressure for accountability leads teams to stop at human error. “Failure to follow procedure.” “Lack of training.” The root cause field gets filled, the corrective action says retrain, and the file gets closed.
This pattern is a hallmark of low-maturity programs. It creates a culture where subject matter experts hide critical data to protect themselves, and leadership is left with corrective actions too shallow to prevent recurrence.
[Read more…]
Root Cause Analysis (RCA) is the step that turns failure into learning.
Corrective and preventive actions are only effective if they address the real causes of failure, not just the symptoms. RCA provides the structured approach needed to understand why the system allowed the failure to occur in the first place.
[Read more…]
by Fred Schenkelberg

The acronym MTBF is commonly known in our field as “Mean Time Between Failures”.
It is also associated with repairable systems in most textbooks.
It is also denoted as the theta parameter for an exponential distribution.
It is also referenced as a metric for reliability. Oh, and it is the inverse of the failure rate.
And, it is misunderstood and misused by many. I digress, as there is plenty already written on the Perils of MTBF.
What is MTBF? And where and how should it be used, if at all? [Read more…]
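One common source of misunderstanding can be shown in a few lines. This is a minimal sketch, assuming exponentially distributed times to failure and a hypothetical MTBF of 1000 hours. It evaluates the reliability function R(t) = exp(-t / theta) at several operating times.

```python
import math


def reliability(t, mtbf):
    """Probability of surviving to time t when failure times are exponential."""
    return math.exp(-t / mtbf)


mtbf = 1000.0  # hypothetical theta, in hours

for t in (100.0, 500.0, 1000.0):
    print(f"R({t:.0f} h) = {reliability(t, mtbf):.3f}")
```

Under this assumption, only about 37 percent of units are expected to survive to the MTBF itself, so MTBF is not a failure-free period.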
by Ray Harkins

This article is adapted from Chapter 10 of my book, Measuring Manufacturing Effectiveness.
The book examines how manufacturing organizations define, calculate, and use effectiveness metrics, and how those choices influence decisions, priorities, and behavior throughout the organization. While the chapters form a coherent framework, each is written to stand on its own for readers entering the series at different points.
By the time effectiveness metrics are calculated, the hardest work is often assumed to be complete. In practice, the opposite is usually true. Numbers that are technically correct can still be misunderstood, misapplied, or used to justify the wrong conclusions.
Chapter 10 focuses on the interpretation of effectiveness metrics rather than their calculation. It examines how context, assumptions, system boundaries, and organizational incentives shape what these metrics actually mean, and how the same number can support very different decisions depending on how it is interpreted.
The goal of this chapter is to help readers move beyond treating effectiveness metrics as objective scores and toward using them as diagnostic signals within a broader manufacturing system.
[Read more…]
by Semion Gengrinovich

Maximum Likelihood Estimation (MLE) is a statistical technique used to estimate the parameters of a model by maximizing the likelihood function, which measures how likely the observed data is under specific parameter values.
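As a minimal sketch of the idea, the snippet below fits the rate of an exponential distribution to a small set of hypothetical failure times by numerically maximizing the log-likelihood, then compares the result with the known closed-form answer. The data, the search interval, and the golden-section search are all illustrative choices, not the method used in the article.

```python
import math


def exp_log_likelihood(rate, times):
    """Log-likelihood of complete (uncensored) exponential data for a given rate."""
    return len(times) * math.log(rate) - rate * sum(times)


def golden_section_max(f, lo, hi, tol=1e-10):
    """Locate the maximum of a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0


# Hypothetical failure times in hours.
times = [120.0, 340.0, 560.0, 90.0, 410.0]

rate_hat = golden_section_max(lambda r: exp_log_likelihood(r, times), 1e-6, 1.0)
closed_form = len(times) / sum(times)  # analytic MLE for the exponential rate

print(f"numerical MLE: {rate_hat:.6f}, closed form: {closed_form:.6f}")
```

For the exponential distribution the maximum can be found with calculus, so the numerical search is only there to show the mechanism. Models without a closed-form solution rely on the same principle with a more capable optimizer.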
[Read more…]
by Miguel Pengel

A component-level Weibull reliability analysis of the AirPods Pro 2, and a hard look at whether the $29 extended warranty is protection or a profit centre.
I normally write a lot about asset management within the heavy industrial space, but a recent “domestic” appliance failure made me think about the economics and failure characteristics of everyday products. A little off subject, but definitely on-topic!
My AirPods Pro 2 failed exactly one month after the standard warranty expired. The left earbud’s battery had degraded to the point where it would die within 45 minutes, rendering a $249 product functionally useless. Rather than just buying a new pair, I did what any reliability engineer would: I modelled the failure.
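The excerpt does not give the fitted parameters, so the following is only a minimal sketch of the kind of calculation a Weibull warranty analysis involves. The shape and scale values and the coverage window are hypothetical placeholders. Only the $29 and $249 prices come from the text above.

```python
import math


def weibull_reliability(t, shape, scale):
    """Two-parameter Weibull survival probability at time t."""
    return math.exp(-((t / scale) ** shape))


def failure_prob_in_window(start, end, shape, scale):
    """P(failure before `end` | survived to `start`)."""
    return 1.0 - weibull_reliability(end, shape, scale) / weibull_reliability(start, shape, scale)


# Hypothetical parameters, NOT taken from the article: wear-out behavior
# (shape > 1) with a characteristic life of 36 months.
shape, scale = 2.0, 36.0

# Hypothetical coverage window: months 12 to 24 after purchase.
p_claim = failure_prob_in_window(12.0, 24.0, shape, scale)

# Break-even claim probability, assuming a claim is worth the full product price.
break_even = 29.0 / 249.0

print(f"P(failure in window) = {p_claim:.3f}, break-even = {break_even:.3f}")
```

Under these assumptions the warranty pays off in expectation only when the failure probability inside the coverage window exceeds the break-even ratio of warranty price to replacement value, a little under 12 percent here. Service fees, partial replacements, and the time value of money are ignored.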
by Mike Sondalini

Series system and parallel system properties provide valuable business and operational implications.
There are three reliability properties of equipment and parts in a series system arrangement. These properties affect business and operational decisions and performance.
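The standard series and parallel reliability formulas can be sketched as follows, assuming independent components. The component reliabilities are hypothetical values chosen for illustration.

```python
import math


def series_reliability(reliabilities):
    """All components must work: multiply the individual reliabilities."""
    return math.prod(reliabilities)


def parallel_reliability(reliabilities):
    """At least one component must work: one minus the chance that all fail."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


components = [0.95, 0.90, 0.99]  # hypothetical component reliabilities

r_series = series_reliability(components)
r_parallel = parallel_reliability(components)

print(f"series: {r_series:.5f}, parallel: {r_parallel:.5f}")
```

With these numbers the series arrangement is less reliable than its weakest component, while the parallel arrangement is more reliable than its strongest one.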
by Greg Hutchins

Just when we wonder what is next for cyber criminals, they attack colleges and universities. Topics include the financial costs of school ransomware, the days lost to downtime, and the number of students impacted as these incidents snowball and become a steady source of criminal income, as well as the history of ransomware, the most damaging ransomware attacks, and the future of this threat.
[Read more…]