The manager of a life-support-equipment company requested a reliability program assessment. The company was experiencing roughly a 50% per year failure rate, and the Director of Quality, at least, thought it should do better. One of the findings concerned reliability goal setting and how the goal was used within the organization.
Nearly everyone knew that the product had a 5,000-h Mean Time Between Failures (MTBF) reliability goal, but very few knew what that actually meant. How the teams used the product goal was even more surprising. The product had five elements, with five different teams designing them: a circuit board, a case, and three others. Within each team, members designed to, and attempted to achieve, the reliability goal of the whole product, the 5,000-h MTBF goal. Analysis of the field-failure data showed they actually did achieve it: each element performed just a little better than 5,000-h MTBF.
Reliability statistics stipulates that in a series system each element must have higher reliability than the whole-system goal. For example, if each of a product's five elements achieves 99% reliability over one year, the system reliability is the product of the element reliabilities, 0.99^5, or approximately 95% at one year. Dividing the system goal among the various subsystems or elements within a product is called apportionment.
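The series-system rule above can be checked with a short sketch. The function name and the 99%/five-element numbers are just the worked example from the text:

```python
# Series-system reliability: the system works only if every element works,
# so the element reliabilities multiply.
def series_reliability(element_reliabilities):
    r = 1.0
    for x in element_reliabilities:
        r *= x
    return r

# Five elements, each at 99% reliability over one year:
r_system = series_reliability([0.99] * 5)
print(round(r_system, 3))  # -> 0.951, i.e. about 95% at the system level
```

Running the rule in reverse gives the apportionment: to hit a 95% system goal with five equal elements, each element needs roughly 0.95 ** (1/5) ≈ 99% reliability.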
This team skipped that step and designed each element to the same goal intended for the system.
Compounding the issue was the simplistic attempt to measure the reliability of the various elements and the total lack of measurement at the system level. For each subsystem, the team relied on the weakest component to estimate the subsystem's reliability. For example, the circuit board had about 100 parts, one of which the vendor claimed had about a 5,000-h MTBF. That team surmised that, because this was the weakest part, nothing would fail before 5,000 h, and that this was all the information it needed to consider. The team considered neither the cumulative effect of all the other components nor the uncertainty of the vendor's estimate within its own design and use environment.
This logic was repeated for each subsystem.
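The cumulative effect the teams ignored is easy to illustrate. Under a constant-failure-rate (exponential) model, the failure rates of parts in series add. The specific numbers below are hypothetical, chosen only to show the shape of the error, not taken from the case:

```python
# Hypothetical sketch: suppose the board's weakest part has a 5,000-h MTBF
# and the other 99 parts are each ten times better (50,000-h MTBF each).
# Failure rates (1/MTBF) in a series system simply add.
lam_board = 1 / 5000 + 99 * (1 / 50000)
board_mtbf = 1 / lam_board
print(round(board_mtbf))  # -> 459 h, far below the 5,000-h "weakest part" figure
```

Even when every other part is individually much better than the weakest one, their combined failure rate can dominate, so the weakest-part shortcut badly overstates subsystem MTBF.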
The result was a product whose field performance matched what it had been designed to achieve. Estimated use of the product was about 750 h per year, so each element would achieve about 85% reliability over a year, which seemed adequate. However, this is a series system, meaning that a failure in any one element causes the system to fail. The math works out as follows:
$$ \displaystyle\large R\left( 750\,\text{h} \right)={{\left( {{e}^{-750/5000}} \right)}^{5}}\approx 0.47$$
Because the product of the reliabilities of the individual five elements was overlooked, the system reliability turned out to be less than 50%, not the expected 85%. The field performance was the result of how the product was designed to meet the reliability goal for each subsystem. The team got what it designed. Its members had forgotten or ignored a basic, yet critical element of reliability engineering knowledge.
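The arithmetic behind the equation above can be reproduced in a couple of lines, using the exponential reliability model R(t) = e^(-t/MTBF) with the 750-h year and 5,000-h MTBF from the text:

```python
import math

# Per-element reliability over one year (~750 h of use) at a 5,000-h MTBF:
r_element = math.exp(-750 / 5000)
# Five such elements in series (equivalently exp(-5 * 750 / 5000)):
r_system = r_element ** 5
print(round(r_element, 2), round(r_system, 2))  # -> 0.86 0.47
```

Each element looks comfortably reliable on its own, yet the series product lands at about 47%, matching the roughly 50%-per-year failure rate seen in the field.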