Accendo Reliability

Your Reliability Engineering Professional Development Site


by Fred Schenkelberg

The Challenges in Reliability Engineering

What Are the Other Challenges in Reliability?

Creating a product or system that lasts as long as expected, or longer, is a challenge.

It’s a common challenge that reliability engineering and the entire engineering team face on a regular basis. It’s also not our only challenge.

We face and solve a myriad of technical, political, and engineering challenges. Some of our challenges are born and carried forward by our own industry. We have tools suitable for a given purpose altered to ‘fit’ another situation (inappropriately and creating misleading results). We have terms that we, and our peers, struggle to understand.

Sometimes, we, as reliability engineers, have set up challenges that thwart our best efforts to make progress.

Let’s examine a few of the self-made challenges and discuss ways to overcome these obstacles, freeing us to tackle the real hurdles in our path.

MTBF and Prediction are The Two Big Issues

This site has the express goal to ‘eradicate MTBF’. It is the worst four-letter acronym in our world. You already know this, and many readers here have taken steps to see this term relegated to the dust of forgotten history.

Parts count predictions, especially from our favorite military standard, are widely known to be less than useful. Then why do we continue to find requirements to use this method as a basis to estimate actual future field failure rates?

Even 20 years after MIL-HDBK-217’s retirement/obsolescence, it lives. Again, there are teams working on viable and actually useful alternatives. Physics of failure modeling, improved reliability modeling tools that permit (nay, encourage) the use of appropriate lifetime distributions, and other work are slowly weaning our industry from the folly of parts count predictions.

HALT: “Let’s pass HALT”

This one isn’t discussed too often. Yet, have you heard someone wonder if their product could pass HALT?
How about, ‘Of course it failed; you were testing above the specified use level.’

HALT is the second-worst four-letter acronym.

We have a ways to go to make this basic concept clear. We will employ a stress testing process to identify weaknesses in the design. We are going to use elevated stresses to discover problems and margins quickly.

Cost of Failure

Engineers know intuitively that failures are bad. The design effort includes actions to design a robust and reliable product.

One tool that we often avoid employing is the actual or estimated cost of a failure. We tend to focus on failure rates and failure mechanisms, which is fine to a point. Yet, if we do not also include the consequence (safety, warranty, brand loyalty, customer losses, etc.) we only enjoy half the information we need to enable great decisions.

Our team needs to work on the potential and actual failures that make a difference when solved. Not all failure modes are the same. Let’s solve the ones that save the most lives, anguish, and money.

Get the information you need for your product to determine the cost per failure. This information, along with an expected shipping volume and estimated failure rates, enables the calculation of the cost of failure.

If you calculate the cost of failure per unit shipped, you have a value that is directly comparable to the bill of materials cost of the components in a product. In my experience, the cost of failure per unit shipped is the most expensive or within the top 5 most expensive components in a product.
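As a minimal sketch of that calculation, the Python below multiplies shipping volume, failure rate, and cost per failure, then spreads the result over every unit shipped. The function names and all three input values are hypothetical and chosen only for illustration; substitute figures from your own product.

```python
def cost_of_failure(cost_per_failure, units_shipped, failure_rate):
    """Total expected cost of failure for the period the failure rate covers."""
    expected_failures = units_shipped * failure_rate
    return expected_failures * cost_per_failure


def cost_of_failure_per_unit(cost_per_failure, failure_rate):
    """Expected cost of failure carried by each unit shipped."""
    return cost_per_failure * failure_rate


# Hypothetical inputs, for illustration only.
cost_per_failure = 250.0   # dollars per failure (warranty, service, shipping, ...)
units_shipped = 100_000    # expected shipping volume
failure_rate = 0.03        # estimated fraction of units that fail in the period

total = cost_of_failure(cost_per_failure, units_shipped, failure_rate)
per_unit = cost_of_failure_per_unit(cost_per_failure, failure_rate)

print(f"Total cost of failure: ${total:,.0f}")
print(f"Cost of failure per unit shipped: ${per_unit:.2f}")
```

With these made-up inputs the sketch reports $750,000 in total and $7.50 per unit shipped. The per-unit figure is the one to set beside the line items in the bill of materials.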

We employ teams of engineers to develop a single critical component, or to cost reduce an expensive component, yet our ignorance of the cost of failure allows wonderful opportunities for savings to remain hidden.

Determine the cost of failure and make that information widely available to your team. Show them how to use the information to weigh the everyday decisions they make during design and development.

Mixed Priorities

I’ve been told product reliability is critical, then asked to use less than half the sample size necessary for an accelerated life test.

Critical, important, and top priority are great terms. They sound great. If they do not come with resources, personnel, budgets, and support, those terms are hollow platitudes suggesting our work on reliability is critical, important, or a top priority.

I’m not suggesting, although I often really do believe, that reliability performance is a top priority. Organizations have many priorities and I get that. The challenge is in the mixed signals. The unclear priorities. The many top priorities.

The remedy is, again, to quantify the cost of failure. Management, mostly, talks in terms of money. So, we need to convert a 1% failure rate into dollars lost to warranty per year. We need to quantify the cost of uncertainty, especially when the uncertainty ranges from none to billions in potential losses. A 10% chance that we have a major safety issue for a $100 million product line suggests an expected loss of $10 million unless we reduce the risk. Few other product risks involve such threats to profit and business viability.
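Both conversions can be sketched in a few lines of Python. The 1% failure rate, the 10% probability, and the $100 million figure come from the paragraph above; the annual volume and the cost per warranty claim are hypothetical placeholders, and the risk calculation assumes, as the text does, that the entire value of the product line is at stake.

```python
def annual_warranty_cost(units_per_year, failure_rate, cost_per_claim):
    """Dollars lost to warranty per year for a given failure rate."""
    return units_per_year * failure_rate * cost_per_claim


def expected_loss(probability, loss_if_it_happens):
    """Probability-weighted loss for a single risk."""
    return probability * loss_if_it_happens


# 1% failure rate from the text; volume and claim cost are hypothetical.
warranty_dollars = annual_warranty_cost(
    units_per_year=500_000,
    failure_rate=0.01,
    cost_per_claim=120.0,
)

# 10% chance of a major safety issue on a $100 million product line.
safety_risk_dollars = expected_loss(0.10, 100_000_000)

print(f"Warranty cost per year: ${warranty_dollars:,.0f}")
print(f"Expected loss from safety risk: ${safety_risk_dollars:,.0f}")
```

The second figure reproduces the $10 million in the example above. The first shows how a bare percentage becomes a budget line that management can weigh against other priorities.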

Part of why reliability isn’t well positioned in the pantheon of priorities is that it is difficult to quantify. At least that is my observation. Difficult doesn’t mean impossible.

Reliability is one of the most important priorities for most organizations to get right. Let’s help our teams align the ability to deliver the expected reliability to achieve the goals, while properly balancing with other priorities.

Summary

There are challenges in the world of reliability engineering. MTBF and predictions are well known and many are working to help us and our peers move forward.
HALT, Cost of Failure, and Mixed Priorities are 3 of the many challenges you face on a regular basis. What would you add to this list? What can we, as a community of reliability engineers, do to solve them? Add your suggestions and recommendations in the comments section below.

Filed Under: Articles, NoMTBF

About Fred Schenkelberg

I am the reliability expert at FMS Reliability, a reliability engineering and management consulting firm I founded in 2004. I left Hewlett Packard (HP)’s Reliability Team, where I helped create a culture of reliability across the corporation, to assist other organizations.


Comments

  1. Rick Kossik says

    April 20, 2017 at 9:12 AM

    I believe your most important point here is that proper design requires not just modeling failure (and repairs), but modeling the consequences of failures. This is a much more difficult task than traditional reliability modeling, as it requires a “total system model” that not only simulates the components that can fail (and perhaps be repaired), but also models (in detail) the consequences of different types of failures. Only then is it possible to focus on the failures that are important.

    A simple example of this is a water resource system. If a pump fails, how does it affect the rest of the system? Is the failure simply an inconvenience or does it lead to catastrophe (e.g., a dam failure)? Perhaps usually it is just the former, but if it fails during a storm event, it could be the latter. Moreover, although storms may be rare, the pump may in fact be more likely to fail during a storm (i.e., failure rates may increase during storm events), and this should be quantitatively represented in the model. So to properly understand the consequences of failure requires that you model the total system (dynamically and probabilistically), representing, for example, storm events, as well as the actual feedback loops that exist in the system.

    A few of our customers have done this, including NASA, Sandia National Laboratories, and Los Alamos National Laboratory, but it is the exception, not the rule. I think the primary reason is that doing so requires a team approach. Most reliability engineers lack the background to model the “total system”, and those with the background typically lack the required reliability engineering skills. Hence, modeling such a system properly requires a team of individuals who together possess the necessary skills. This can be time-consuming and expensive, and hence is not typically done (of course, the ultimate cost of failure may be much more expensive, but this is rarely taken into account).

    • Fred Schenkelberg says

      April 20, 2017 at 10:53 AM

      Thanks Rick for the comment and story. As you suggest these models can become rather complex, yet even considering the consequences will go a long way to help sort out priorities. cheers, Fred


© 2025 FMS Reliability