Why do we talk about reliability?
- To make decisions
- To estimate reliability
- To understand risk
We talk about reliability because it matters. The ability to estimate reliability allows us to make design and development decisions. The ability to monitor reliability allows us to adjust the design, suppliers or expectations about a product.
As consumers, we talk about reliability because we want a sufficient return on our purchase or investment. We want the product to function over time as expected. We talk about reliability because we know failure can and will occur at some point. We are willing to accept the risk of failure if it is sufficiently small relative to the investment.
We talk about reliability to make decisions. During the selection of materials, design engineers judge the durability of different materials for use in the application and environment. Before the launch of a consumer product, we ask whether the probability of failure during the warranty and useful life period will be low enough to begin production and shipments (high failure rates may erase any potential profits).
In many development projects we consider the cost, function and time. Teams often prioritize these three considerations to assist in the many design decisions made across the team. Reliability is part of the decision process and when known (estimated or measured) it has significant influence.
The best time to design an accelerated life test is after all of the units have operated and failed. We then know the product life duration and failure mechanisms. Yet we cannot wait that long, and it is generally not viable to start producing a product with unknown reliability performance.
Design engineers tend to design away from failure, yet they need to know the target, expected failure mechanism, and the use environment. Reliability as a measure provides this information. We use a range of estimation tools to provide the value for the reliability measure.
Estimates may range from engineering judgment to detailed physics-of-failure models. The ability to describe the estimated reliability enables meaningful decision making and provides guidance to the team designing or using a product.
Business decisions often involve risk: the risk of too many product failures, the risk of competitors creating better products, and the risk of harming customer satisfaction. There are other risks, yet reliability poses a significant one, as it is both hard to measure accurately and largely determined by the customer’s perception of failure.
Reliability isn’t just whether the product will work over time. It includes customer-defined failures such as:
- Wrong color
- Difficult to use or maintain
- Makes an annoying sound when operating
- Consumes too much energy
- Fails to function
Any of these failures incurs expense for the organization. A call to customer service, a returned product, or lost word-of-mouth recommendations all cost the organization money. All are product reliability failures.
The risk is not confined to just the eventual wear out of the device.
Measuring and discussing reliability includes discussing the risk of product failure.
MTBF and Reliability
We use MTBF to talk about reliability. We see it on data sheets and use the term in discussion when evaluating supplied components. We use MTBF when calculating availability. And, we hear others request, “What is the MTBF?” on a regular basis.
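As a brief illustration of the availability calculation mentioned above, here is a minimal sketch using the common steady-state formula, availability = MTBF / (MTBF + MTTR). The mean time to repair (MTTR) and the numbers below are assumptions for illustration only; they do not come from any particular data sheet.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return steady-state availability as MTBF / (MTBF + MTTR).

    mtbf_hours: mean time between failures (average uptime per cycle).
    mttr_hours: mean time to repair (average downtime per cycle), assumed known.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: 1,000 hours between failures, 10 hours to repair.
availability = steady_state_availability(mtbf_hours=1000.0, mttr_hours=10.0)
print(f"Availability: {availability:.4f}")
```

Note that availability describes the long-run fraction of uptime. It does not tell us the chance that a unit survives a given period without failure.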
We use MTBF because it’s common and available.
Is it the best choice?
Use reliability directly as it is what we need to know to make decisions, estimate performance over time and evaluate risk. Use the full definition of reliability and not just the inverse of the failure rate.
Use function, environment, probability and duration when talking about reliability. Use the best available tools to estimate and describe risks. MTBF by itself is difficult to understand, whereas saying 98% of units will survive 2 years implies that 2% will fail over that period. Add the cost per failure and we’ve just enabled the team to make meaningful decisions.
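To make the arithmetic concrete, here is a minimal sketch of how a reliability statement plus a cost per failure turns into a number a team can weigh. The shipment volume and cost per failure are hypothetical values chosen for illustration; only the 98% over 2 years figure comes from the text above.

```python
def expected_failure_cost(units_shipped: int,
                          reliability: float,
                          cost_per_failure: float) -> tuple[float, float]:
    """Return (expected failures, expected cost) over the stated duration.

    reliability is the probability a unit survives the duration of interest
    (for example, 0.98 for "98% of units survive 2 years").
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be a probability between 0 and 1")
    expected_failures = units_shipped * (1.0 - reliability)
    return expected_failures, expected_failures * cost_per_failure


# Hypothetical inputs: 10,000 units shipped, 150 (currency units) per failure.
failures, cost = expected_failure_cost(
    units_shipped=10_000, reliability=0.98, cost_per_failure=150.0
)
print(f"Expected failures over 2 years: {failures:.0f}")
print(f"Expected failure cost: {cost:,.0f}")
```

With these assumed inputs, about 200 units fail within 2 years, which the team can compare directly against margin, warranty budget, or the cost of a design change.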
MTBF is not reliability. It is only one element (probability), and without a duration it is meaningless on its own.
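A short sketch shows why the duration matters. It assumes a constant failure rate (the exponential distribution), which is the assumption usually implied when MTBF is quoted as the inverse of the failure rate; real products often do not follow it. The MTBF value and durations are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # assumes continuous operation, 24 hours a day


def reliability_exponential(mtbf_hours: float, duration_hours: float) -> float:
    """Probability of surviving duration_hours, assuming a constant
    failure rate of 1 / mtbf_hours (exponential distribution)."""
    if mtbf_hours <= 0 or duration_hours < 0:
        raise ValueError("MTBF must be positive and duration non-negative")
    return math.exp(-duration_hours / mtbf_hours)


# Hypothetical data-sheet value: the same MTBF gives a different
# reliability for every duration we ask about.
mtbf = 50_000.0
for years in (1, 2, 5):
    r = reliability_exponential(mtbf, years * HOURS_PER_YEAR)
    print(f"{years} year(s): {r:.1%} of units expected to survive")
```

Under this assumption, only about 37% of units (exp(-1)) are expected to survive to a duration equal to the MTBF itself, which is one reason an MTBF quoted alone is easy to misread.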
We talk about reliability to improve products and performance, and to make decisions along the way. Using the best available tools helps us and others make decisions accurately.