Purpose of a Reliability Program
The reliable performance of a system is important. It is important to the customer, to our business and to us.
Very few argue that we should ignore the reliability characteristics of a product. We also deem cost, time to market, and feature set important. The trouble is that we can measure the latter directly every day, whereas reliability performance is often difficult to measure.
Without feedback on progress toward our goals, we tend to base decisions less on reliability and more on the priorities we can measure. The purpose of your reliability program is to ensure that decision making includes suitable reliability performance information, enabling informed decisions across your organization.
To provide sufficient reliability information to decision makers, we need to include three elements:
- What will fail?
- When will it fail?
- Cost (impact) of failure.
Let’s briefly explore each of these in the context of a reliability program and how this information influences decisions.
What will fail
Design engineers naturally consider the potential use of a device and the likely weaknesses of the design.
The intent is to provide the desired functions without failure. Even for relatively simple products, there are hundreds of potential failure mechanisms competing to cause the device to fail.
The reliability engineering tools of risk prioritization and discovery combine with engineering knowledge to identify the most serious and likely failure mechanisms. The reliability program should include a culture that enjoys and rewards finding failures. It is with knowledge of failures that the decisions to make improvements can occur.
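As a rough illustration of risk prioritization, the sketch below ranks candidate failure mechanisms by a simple severity times likelihood score. The scoring scheme, the 1 to 10 scales, the `FailureMechanism` class, and the example entries are all assumptions made for illustration; your program may use a different prioritization method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMechanism:
    """A candidate failure mechanism with hypothetical 1-10 ratings."""
    name: str
    severity: int    # 1 = negligible impact, 10 = catastrophic (assumed scale)
    likelihood: int  # 1 = remote, 10 = almost certain (assumed scale)

    @property
    def risk_score(self) -> int:
        # Simple product of the two ratings; higher means more urgent.
        return self.severity * self.likelihood


def prioritize(mechanisms):
    """Return the mechanisms sorted from highest to lowest risk score."""
    return sorted(mechanisms, key=lambda m: m.risk_score, reverse=True)


# Made-up entries, for illustration only.
candidates = [
    FailureMechanism("solder joint fatigue", severity=7, likelihood=6),
    FailureMechanism("connector corrosion", severity=4, likelihood=3),
    FailureMechanism("capacitor wear-out", severity=8, likelihood=4),
]

ranked = prioritize(candidates)
for mechanism in ranked:
    print(f"{mechanism.name}: {mechanism.risk_score}")
```

The ratings come from engineering judgment at first; test results and failure analysis should replace them with evidence as it becomes available.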
Product testing should focus on testing to failure and thorough failure analysis. Detailed information on the root cause of failures provides what the team needs to consider appropriate remedies. Without failures, the design team is left to speculate about which design elements may require improvement. In some cases, modeling and simulation replace prototype testing; these should be supplemented with material- and component-level reliability characterization.
It is the awareness of what may fail, along with tangible evidence of what actually fails, that starts the discussion on reliability improvements.
The reliability program supports creating detailed knowledge of the failure mechanisms with infrastructure (tools and methods). The work to find failures requires a culture where finding failures is a desired and celebrated activity.
When will it fail
Knowing what will fail may reveal more opportunities to improve the reliability performance of the design than the team can address.
One key element of information is the expected time before a failure mechanism leads to failure. A faulty assembly or inadequate design may lead to early-life failures in every device, or a selected component may degrade slowly, leading to failure after many years of operation.
Knowing when a failure will occur provides information enabling prioritization of which failure mechanisms require attention. Some may require changes in materials or components, some may require adjustments to the assembly process, and others may require addressing diagnostic and repair capabilities.
The reliability program should support the exploration of each failure mechanism's time-to-failure behavior.
This includes modeling, simulation, environmental and use characterization, along with accelerated life testing for materials, components, and assemblies. The program infrastructure includes a mix of modeling and laboratory resources. The program enables the team to estimate the time to failure distribution for the most likely failure mechanisms.
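One way to estimate a time-to-failure distribution from test data is to fit a Weibull distribution. The sketch below uses median rank regression with Benard's approximation, and assumes a complete data set (every unit ran to failure, no censoring). The choice of Weibull, the function names, and the failure times are assumptions for illustration, not a method prescribed here.

```python
import math


def fit_weibull_median_rank(failure_times):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression.

    Assumes complete data: every unit on test ran to failure.
    """
    times = sorted(failure_times)
    n = len(times)
    xs, ys = [], []
    for rank, t in enumerate(times, start=1):
        # Benard's approximation of the median rank for the i-th failure.
        f = (rank - 0.3) / (n + 0.4)
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - f)))

    # Least-squares line y = beta * x + intercept on the linearized Weibull CDF.
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = y_mean - beta * x_mean
    eta = math.exp(-intercept / beta)  # because intercept = -beta * ln(eta)
    return beta, eta


def fraction_failed(t, beta, eta):
    """Weibull CDF: expected fraction of units failed by time t."""
    return 1.0 - math.exp(-((t / eta) ** beta))


# Made-up hours to failure from a hypothetical life test.
hours_to_failure = [410, 620, 790, 980, 1150, 1400]
beta, eta = fit_weibull_median_rank(hours_to_failure)
print(f"shape beta = {beta:.2f}, scale eta = {eta:.0f} hours")
print(f"fraction failed by 500 hours = {fraction_failed(500, beta, eta):.3f}")
```

A shape parameter below 1 suggests early-life failures, while a value above 1 suggests wear-out, which ties the fitted distribution back to the kind of remedy worth considering. With censored data or small samples, a maximum likelihood fit from a statistics package is usually a better choice than this regression.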
The reliability work reduces uncertainty concerning when failures may occur or under what scenarios failures may occur.
Cost of failure
Not all failures are the same.
Some cause catastrophic damage; others lead to no perceptible change in product performance. When a failure occurs may also change the resulting cost of the failure. Early failures often damage brand reception and may lead to reduced market acceptance. Failures during the warranty period increase financial obligations. Failures at any time increase the cost of ownership for the customer.
Ideally, the reliability program includes the infrastructure and support to estimate the total cost of a failure.
The cost information should range from the direct cost of the failure (reset time of a system, component replacement, system replacement) to the impact on the customer (lost sales, for example) and the impact on customer satisfaction. In some markets, the increased risk of liability or recalls is also a factor.
The reliability program creates the ability both to estimate the costs and to place the risks in an appropriate context and perspective.
If there is a finite chance of an expensive failure, the information should be clear and useful for the decision makers to evaluate.
It is the combined knowledge of failure mechanisms, time-to-failure probabilities, and financial impact estimates that creates clear information for decision makers across the organization.
Knowing only one element is not sufficient to properly balance the risks and options to make the right decision.
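To make the combination concrete, the sketch below joins the three elements for each failure mechanism: what fails, a time-to-failure model, and a cost per failure. It assumes each mechanism follows a Weibull distribution and counts only the first failure per unit; the `MechanismEstimate` class, the parameters, and the costs are hypothetical values chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MechanismEstimate:
    """What fails, when (Weibull parameters), and what it costs. All values assumed."""
    name: str
    beta: float              # Weibull shape
    eta_hours: float         # Weibull scale, in operating hours
    cost_per_failure: float  # total cost of one failure, in currency units

    def probability_of_failure_by(self, hours: float) -> float:
        # Weibull CDF evaluated at the given operating time.
        return 1.0 - math.exp(-((hours / self.eta_hours) ** self.beta))

    def expected_cost_by(self, hours: float) -> float:
        # Expected cost per unit, counting only the first failure.
        return self.probability_of_failure_by(hours) * self.cost_per_failure


def rank_by_expected_cost(estimates, hours):
    """Sort mechanisms from highest to lowest expected cost per unit."""
    return sorted(estimates, key=lambda e: e.expected_cost_by(hours), reverse=True)


# Made-up estimates for a hypothetical product.
estimates = [
    MechanismEstimate("early-life assembly defect", 0.7, 200_000, 150.0),
    MechanismEstimate("slow component wear-out", 3.0, 40_000, 150.0),
    MechanismEstimate("rare safety-related failure", 1.0, 5_000_000, 50_000.0),
]

warranty_hours = 8_760  # one year of continuous operation
ranking = rank_by_expected_cost(estimates, warranty_hours)
for e in ranking:
    print(f"{e.name}: {e.expected_cost_by(warranty_hours):.2f} per unit")
```

With these made-up numbers, the rare but expensive failure ranks first even though it is the least likely within the warranty period, which is the kind of result that is invisible when only one of the three elements is known.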
Each product and market is different, yet the need to make informed decisions that consider the reliability performance of your product does not change. The reliability tools, methods, and culture employed are a direct result of the reliability program.
Keep in mind that each element of your program has a role to play in addressing one or more of the three elements discussed above.
When all three elements are regularly discussed and addressed by teams across the organization, your program will be successful.