
5-Why is popular because it’s simple, and that’s exactly where it can fall short.
Teams ask “why” five times, land on a familiar answer, document it, and move on.
The exercise feels efficient, but the thinking is often shallow.
[Read more…]
Your Reliability Engineering Professional Development Site

by Nancy Regan

In this unit, we delve into the second and third steps of the Reliability Centered Maintenance (RCM) process: identifying Functional Failures and writing Failure Modes. It’s vital to distinguish between Failure Effects and the actual causes of failures. A Failure Mode, often termed a failure cause, pinpoints what specifically causes a Functional Failure. Proper training is essential, as accurately identified Failure Modes are the cornerstone of developing effective failure management strategies. We will introduce the four criteria for including Failure Modes in an RCM analysis. This session uses practical examples, such as bearing failures, to demonstrate how well-defined Failure Modes lead to actionable solutions like vibration analysis or enhanced training programs. Learn to balance the level of detail in your documentation to prevent oversimplification and avoid analysis paralysis, ensuring your RCM process is comprehensive and effective.
[Read more…]
Understanding asset value is one of the most fundamental questions in facility, infrastructure, and asset management. Yet it’s also one of the most misunderstood because different disciplines approach value from different angles. “Asset value” is not a single number—it’s a collection of perspectives formed by accounting, insurance, engineering, and operations. Getting clear on those perspectives is the first step toward making better decisions.
Before covering the many ways to express asset value, it helps to define three basic concepts that support most of the others.
[Read more…]
Reliability engineering rarely happens in isolation. More often, it sits within a project environment shaped by cost, schedule, scope and competing priorities.
In many projects, reliability engineering can be seen primarily as a quantitative exercise that is applied once evidence is needed to validate a design. By then, the opportunity to influence architecture, technology choices, or support concepts may be limited.
The greatest impact of reliability engineering often comes much earlier, through structured questioning and risk-informed thinking. Helping teams recognise that reliability engineering influences design and decision-making throughout the project, not just when evidence is required, is part of the reliability engineer’s contribution within a project environment.
by nomtbf

This one made me scratch my head and wonder. Did I read this right?
A reader sent me an excerpt of a document found on Vicor’s site.
“Reliability is quantified as MTBF (Mean Time Between Failures) for repairable product and MTTF (Mean Time To Failure) for non-repairable product. A correct understanding of MTBF is important. A power supply with an MTBF of 40,000 hours does not mean that the power supply should last for an average of 40,000 hours. According to the theory behind the statistics of confidence intervals, the statistical average becomes the true average as the number of samples increase. An MTBF of 40,000 hours, or 1 year for 1 module, becomes 40,000/2 for two modules and 40,000/4 for four modules…”
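The numbers in the quoted passage can be sanity-checked with a few lines of arithmetic. This is only a minimal sketch, and it rests on assumptions the quote does not state: continuous 24/7 operation, a constant failure rate (exponential life distribution), and reading "40,000/2 for two modules" as the mean time to the first failure among identical, independent modules. The function names are illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days x 24 hours, assuming continuous operation


def hours_to_years(hours):
    """Convert operating hours to years of continuous operation."""
    return hours / HOURS_PER_YEAR


def survival_probability(t_hours, mtbf_hours):
    """R(t) = exp(-t / MTBF), valid only under a constant failure rate."""
    return math.exp(-t_hours / mtbf_hours)


def first_failure_mtbf(mtbf_hours, n_units):
    """Mean time to the first failure among n identical, independent units
    (assumption: exponential lifetimes, so the rates simply add)."""
    return mtbf_hours / n_units


mtbf = 40_000
print(f"{hours_to_years(mtbf):.2f} years")            # 4.57 years
print(f"{survival_probability(mtbf, mtbf):.3f}")      # 0.368
print(first_failure_mtbf(mtbf, 2), first_failure_mtbf(mtbf, 4))  # 20000.0 10000.0
```

Under these assumptions, 40,000 hours is roughly 4.6 years of continuous operation rather than 1 year, and only about 37% of units would be expected to survive to the MTBF.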
by Semion Gengrinovich

How safe is the modern roller coaster? Media attention to amusement park injuries and fatalities has led to concerns about passenger safety and potential brain injuries resulting from faster, more complex rides that may cause greater stress on the rider.
West European ice slides, popular in the 16th and 17th centuries, are the earliest ancestors of the present-day roller coaster. Ice blocks were fashioned into sleds, and sand created friction to slow down the sled at the end of the ride. As popularity increased, wooden sleds were built with iron runners to increase the speed and intensity of the ride.
[Read more…]
by Greg Hutchins

Don’t be intimidated by what you don’t know. That can be your greatest strength and ensure you do things differently from everyone else.
Sara Blakely – Founder of Spanx
“Robots are coming” is a common refrain in Tech Futures. Robots and smart machines are doing a lot of our work and will do a lot more over the next few years. Take a look below:
[Read more…]
Across every mature RCA program we’ve seen, one pattern is unmistakable: leaders who consistently close the loop win.
Not sponsors who attend kickoffs. Not managers who ask for status updates. Leaders who personally ensure that corrective actions are implemented with the same rigor used to identify root causes have RCA programs that thrive.
[Read more…]
by George Williams

Is your CMMS cluttered with spare parts and throwaway items? Not sure what qualifies as an asset—or why it matters?
In this video, we walk through the logic behind building a clean, effective asset portfolio. Learn how to define what belongs in your CMMS or EAM system, how to structure your asset hierarchy, and how to avoid the common mistakes that derail reliability strategies.
✅ What counts as an asset
✅ What to leave out
✅ How to build decision logic that lasts
✅ Why this matters for ISO 55000 and long-term success
If you’re tired of second-guessing your asset list, this is your starting point.
[Read more…]
Message to “Inside FMEA” readers. This is a reprint of a message that went out earlier this week as part of the Accendo Weekly Update. I wanted to be sure to reach any “Inside FMEA” readers who may have missed the original request.
My new eBook is in draft form and ready for review. I am very excited to introduce this project, as it is the culmination of a career in reliability engineering and FMEA. The title is Achieving Effective FMEAs: Simple Principles for Realizing the Full Potential of Failure Mode and Effects Analysis. It summarizes and simplifies articles I’ve written over the years as part of the Inside FMEA Series, with added emphasis on principles, examples, exercises and case studies. Each chapter has a section called “Potential AI Application” which shares where AI is useful, and where human involvement is essential. [Read more…]

Performance monitoring is often where reliability intent meets operational reality and where many well-intentioned reliability programmes quietly lose focus.
Most organisations monitor something, such as failures, availability, response times or costs. The challenge is choosing indicators that genuinely reflect system performance, rather than those that are simply easy to collect or report.
[Read more…]
by Fred Schenkelberg

One of the issues I’ve had with failure modes and effects analysis is the focus on failure modes.
The symptoms that the customer or end user will experience are important. If a customer detects that the product has failed, that is a failure. The FMEA process does help us to identify and focus on the important elements of a design that improve the product reliability. That is all good.
The issue is that the FMEA process doesn’t go far enough to really aid the team in focusing on what action to take when addressing a failure mode. The process does include the discussion of the causes of the failure mode. The causes are often the team members’ educated opinions on what is likely to cause the failure mode. Often the description of a cause is a failed part, faulty code, or faulty assembly.
Generally, the discussion of causes is vague.
by Semion Gengrinovich

On November 24, 2000, PacifiCorp experienced a massive generator failure at its Hunter Power Plant in Castle Dale, Utah. Post-event inspection of the generator revealed a serious failure of the stator core—a cylindrical structure nearly 19 feet long and more than 16 feet across—which had partially melted. At the time, the generator was operating at its maximum capacity of 415 megawatts. Sparks and heavy arcing were observed before the unit tripped automatically, shutting the system down within 55 minutes.
[Read more…]
by Ray Harkins

This article is adapted from Chapter 8 of my book, Measuring Manufacturing Effectiveness.
The book explores how manufacturing organizations define and use performance metrics, and how those definitions influence operational decisions, improvement efforts, and management behavior. While the chapters form a connected framework, each is written to address a specific aspect of manufacturing effectiveness and can be read independently.
Performance loss is often described in overly simple terms—the equipment is running slower than it should. While speed is certainly part of the story, this narrow view hides a much broader set of losses that affect output, flow, and stability.
Chapter 8 expands the discussion of performance loss beyond basic speed shortfalls. It examines how interruptions, minor stops, micro-downtime, variability, and operating practices contribute to lost performance—even when equipment appears to be running continuously.
By broadening how performance loss is defined and observed, this chapter aims to improve how organizations diagnose problems and select effective improvement actions.
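To make the idea concrete, here is a rough sketch of how a performance shortfall might be split into minor-stop time and pure speed loss. It assumes the common OEE-style definition of performance (ideal cycle time times units produced, divided by run time), which may differ from the definitions used in the chapter. The function name and all figures are made up for illustration.

```python
def performance_breakdown(run_time_min, ideal_cycle_min, units, minor_stop_min):
    """Split performance loss into minor stops and speed loss.

    Assumption: performance = (ideal cycle time x units) / run time,
    where run time already excludes recorded downtime.
    """
    ideal_time = ideal_cycle_min * units               # time needed at ideal speed
    performance = ideal_time / run_time_min
    minor_stop_share = minor_stop_min / run_time_min   # brief stops never logged as downtime
    # Whatever loss is left after minor stops is attributed to running slow.
    speed_loss_min = run_time_min - minor_stop_min - ideal_time
    speed_loss_share = speed_loss_min / run_time_min
    return performance, minor_stop_share, speed_loss_share


# Hypothetical shift: 400 min of run time, 0.5 min ideal cycle, 640 units, 50 min of minor stops
perf, stops, speed = performance_breakdown(400, 0.5, 640, 50)
print(f"performance={perf:.1%}, minor stops={stops:.1%}, speed loss={speed:.1%}")
# performance=80.0%, minor stops=12.5%, speed loss=7.5%
```

In this made-up case the equipment "appears" to run all shift, yet most of the 20% performance loss comes from minor stops rather than from reduced speed.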
[Read more…]