Reliability Risk Reduction Tools

Last Verified April 16, 2023

The process of creating a new product carries risk: safety, technical, schedule, and financial.
The risk of a product failing more often than the customer or the business deems acceptable falls to reliability engineering to address.

We can provide insight into what is expected to fail and when. Working with the entire team, we can influence the design and assembly process to minimize the risk of field failures.

Below is a short list of the most common risk reduction tools in our arsenal.

Design documentation

This is not a specific reliability engineering tool, yet design documentation is often poorly used to document reliability goals. If you have the opportunity, add fully specified reliability goals, including function, environment, duration, and probability.
See my paper Establishing Product Reliability Goals for details.
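As a minimal sketch of what "fully specified" can look like in practice, the four elements can be captured in one record so that none is left out. The class name, field names, and example values below are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReliabilityGoal:
    """One reliability goal with all four elements (names are hypothetical)."""
    function: str           # what the product must do
    environment: str        # conditions under which it must do it
    duration_years: float   # how long it must keep doing it
    probability: float      # fraction of units expected to survive the duration

    def __post_init__(self):
        # Reject goals that are missing a meaningful probability or duration.
        if not 0.0 < self.probability < 1.0:
            raise ValueError("probability must be between 0 and 1")
        if self.duration_years <= 0:
            raise ValueError("duration must be positive")

    def statement(self) -> str:
        """Return the goal as a single sentence for a design document."""
        return (
            f"{self.probability:.0%} of units will {self.function} "
            f"in {self.environment} for {self.duration_years:g} years."
        )


# Made-up example values for illustration only.
goal = ReliabilityGoal(
    function="deliver rated output power",
    environment="outdoor use from -20 C to 45 C",
    duration_years=5,
    probability=0.95,
)
print(goal.statement())
```

A goal that omits any one element, such as a probability with no duration, cannot be tested or estimated against, which is why the sketch refuses to build one.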

Design review

Design reviews are another common activity during the product development life cycle, and they should include significant reliability attention.

For peer-to-peer and technical reviews, include questions and discussion about the highest reliability risks. Record the responses in the FMEA, reliability planning documents, or issue tracking system.
For management reviews, include the current reliability status: the reliability goal, current estimates, and outstanding known issues. Be clear on what is known and assumed.

FMEA

Failure Modes and Effects Analysis is an organized brainstorming activity to capture the potential risks to a design meeting its reliability goals. The FMEA should help the team prioritize and focus attention on resolving the highest risk items.

See 10 Steps to FMEA for a basic overview of the FMEA process.
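One common way to prioritize FMEA items, though not the only one, is the risk priority number (RPN): the product of severity, occurrence, and detection ratings, each typically scored from 1 to 10. The sketch below assumes that scheme; the failure modes and ratings are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1 (minor effect) to 10 (hazardous effect)
    occurrence: int  # 1 (remote) to 10 (almost certain)
    detection: int   # 1 (almost certain to detect) to 10 (cannot detect)

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def prioritize(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical ratings from a team brainstorm.
modes = [
    FailureMode("Connector corrosion", severity=6, occurrence=4, detection=5),
    FailureMode("Solder joint fatigue", severity=8, occurrence=5, detection=7),
    FailureMode("Fan bearing wear", severity=5, occurrence=6, detection=3),
]

for mode in prioritize(modes):
    print(f"{mode.rpn:4d}  {mode.description}")
```

The ranking is only a starting point for discussion; a high severity item may deserve attention even when its RPN is modest.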

HALT

Highly Accelerated Life Testing (second worst four-letter acronym in reliability engineering) is a tool to discover the likely failure mechanisms and modes of a design. The intent of HALT is to cause prototypes to fail by increasing stress and/or using multiple stresses.

See Discovery Testing for ideas and approaches to expand the known failures and see Estimating the Value of HALT.

Margin determination

Like discovery testing (HALT and the like), this approach allows the engineers to check their assumptions, models, and calculations with empirical evidence.

The one caution is that a single sample is rarely sufficient to understand the impact of process and material variation on performance.
Margin determination is often done under the guise of environmental testing, for example by operating the product at the expected temperature extremes.

Rather than testing only to a fixed temperature limit or a percentage beyond the limits, test to failure and learn something about the magnitude of the margin.
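To make "the magnitude of the margin" concrete, here is a rough sketch that summarizes test-to-failure results from several units against a specification limit. The function name, the temperatures, and the choice of summary statistics are assumptions for illustration, not a prescribed method.

```python
from statistics import mean, stdev


def margin_summary(failure_points, spec_limit):
    """Summarize the margin between a spec limit and observed failure points.

    failure_points: stress level at which each unit failed (e.g. degrees C).
    spec_limit: the upper stress level the product is specified to survive.
    """
    if len(failure_points) < 2:
        # A single sample says nothing about unit-to-unit variation.
        raise ValueError("need at least two units to estimate variation")
    avg = mean(failure_points)
    spread = stdev(failure_points)  # sample standard deviation
    return {
        "mean_margin": avg - spec_limit,
        "worst_margin": min(failure_points) - spec_limit,
        # Margin expressed in standard deviations of the failure points.
        "margin_in_std_devs": (avg - spec_limit) / spread if spread else float("inf"),
    }


# Hypothetical failure temperatures (degrees C) for five prototypes,
# with a specified upper operating limit of 70 C.
summary = margin_summary([92, 97, 88, 101, 95], spec_limit=70)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A large mean margin with a wide spread can be riskier than a smaller margin with tight unit-to-unit agreement, which is why the sketch reports both.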

Process capability

Material and manufacturing variability is a common source of poorly performing products (i.e., premature failures). By understanding both the areas of highest risk within a design and the areas of highest expected variability, we can set up monitoring of variation where it matters most.

If needed, conduct life testing at the extremes of the variability to estimate the impact on field reliability.

Monitoring with an eye toward process stability may help reduce the risk of unchecked variability leading to failures.
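One widely used summary of how well a process fits within its specification limits is the capability index Cpk, the distance from the process mean to the nearer limit divided by three standard deviations. The sketch below assumes roughly normal, stable data; the measurements and limits are invented for illustration.

```python
from statistics import mean, stdev


def cpk(measurements, lower_spec, upper_spec):
    """Process capability index Cpk from a sample of measurements.

    Cpk = min(USL - mean, mean - LSL) / (3 * sample standard deviation)
    """
    mu = mean(measurements)
    sigma = stdev(measurements)
    if sigma == 0:
        raise ValueError("no variation in sample; Cpk is undefined")
    return min(upper_spec - mu, mu - lower_spec) / (3 * sigma)


# Hypothetical shaft diameters in mm, specified as 10.00 +/- 0.10 mm.
diameters = [10.02, 9.98, 10.05, 9.97, 10.01, 10.03, 9.99, 10.00]
print(f"Cpk = {cpk(diameters, lower_spec=9.90, upper_spec=10.10):.2f}")
```

The index is only meaningful if the process is stable, so it complements control charts and similar stability monitoring rather than replacing them.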

Customer calls and returns

While learning from customer calls and returns is not technically a proactive approach, we are rarely able to detect or mitigate every possible failure mechanism.

Any field return, if the organization is prepared to learn from it, is a gift. A gift that may prevent future similar failures.

Before shipping products, list what you expect will fail and why. Then, as products come back, determine whether your reliability assessment was accurate or whether you are surprised by new failure modes or by failures that occur earlier than expected.
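The comparison can be as simple as checking each return against the pre-launch list. In this sketch, the data layout (a mapping of expected failure modes to the earliest month each is expected, and a list of returns) and all of the names and numbers are assumptions for illustration.

```python
def review_returns(expected, returns):
    """Compare field returns against the pre-launch expectations.

    expected: dict of failure mode -> earliest expected time to failure (months)
    returns:  list of (failure mode, months in service at failure)
    Returns (surprises, early), where surprises are failure modes nobody
    listed and early are returns that failed sooner than expected.
    """
    surprises = sorted({mode for mode, _ in returns if mode not in expected})
    early = sorted(
        (mode, months)
        for mode, months in returns
        if mode in expected and months < expected[mode]
    )
    return surprises, early


# Hypothetical expectations written down before shipping.
expected = {"battery fade": 24, "hinge wear": 36}

# Hypothetical returns logged after launch.
returns = [
    ("battery fade", 30),
    ("hinge wear", 14),
    ("screen delamination", 9),
    ("screen delamination", 11),
]

surprises, early = review_returns(expected, returns)
print("Unexpected failure modes:", surprises)
print("Earlier than expected:", early)
```

Either list being non-empty is a prompt to revisit the reliability assessment, not just the individual unit.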

Summary

This is just a short list of reliability-related risk reduction tools. There are others.

Please comment and let’s create a full list of the tools we use to minimize reliability risk.