Reliability activities serve one purpose: to support better decision making.
That is all they do. Reliability work may reveal design weaknesses, which we can decide to address. Reliability work may estimate the longevity of a device, which supports decisions when compared against reliability objectives.
Creating a report that no one reads is not the purpose of reliability. Running a test or analysis to simply ‘do reliability’ is not helpful to anyone. Anything with MTBF involved … well, you know how I feel about that.
The Type III Error
A common problem in engineering work is the desire to solve the wrong problem. I know I am guilty of working on the issues that interest me rather than the challenges requiring action. A Type III error is solving the wrong problem.
We only have so much time and resources for reliability work. Sorting out a system’s reliability presents plenty of challenges and interesting aspects. However, the focus on solving the right problems matters. Solving the right issues provides value to the team and organization.
For each action you plan or execute in a reliability program, ask if this is something you are doing because you want to or because it will add value. One trigger to ask this question is the ‘we always do this test’ concept. If you are tempted to build a plan based on previous plans or add activities and tests because, well, you always do those activities and tests, then stop. Stop and think through how those activities and tests will be used, by whom, by when, and to what effect.
Pending Decisions Drive Action
The key is to connect every activity and test to a decision. If you are interested in conducting an ALT on a new technology, is there someone looking for the results of that ALT to inform a decision? In some cases, the decision may be to abandon the new technology if it isn’t reliable enough, or it may mean selecting a different technology.
Once you find the pending decision, then you know who needs the information, the expected quality of the information, and when it is due.
From FMEA (prioritizing work assignments across the team) to field data analysis (do we need design improvements?), each and every action we propose or take must connect to a decision.
MTBF and Decisions
Let’s say we’re asked to create a reliability goal for a new product. Let’s explore who the stakeholders are in this case.
Customers want a reliable product and on occasion may ask for it via MTBF. When they do, they actually want something else yet do not know how to articulate it. Providing them a goal stated in MTBF simply misleads them and invites poor decision making.
Management wants a product reliable enough to both please customers and avoid undue warranty costs. Again, they want a product that lasts a long time with a low failure rate, not MTBF. They may ask for MTBF thinking it is the term used to request reliability information. As you know, it's not, and providing them with MTBF values further confuses their understanding of product performance.
Engineers want to create a product that meets customer and business expectations, including being reliable. We often break down reliability problems into two groups: early life failures and wear out failures. Neither is well described by MTBF, so don't use it. Use reliability at the salient time points in your product's life.
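To see why a single MTBF value hides the difference between early life and wear out failures, here is a minimal sketch. It assumes a Weibull life distribution, and the 50,000 hour mean life, the three shape values, and the time points are all hypothetical numbers picked for illustration. Each case is scaled to have exactly the same mean life, then the reliability is evaluated at a few time points.

```python
import math


def weibull_eta_for_mean(mean_life, beta):
    """Scale parameter eta that gives the requested mean life (the MTBF)."""
    return mean_life / math.gamma(1.0 + 1.0 / beta)


def weibull_reliability(t, beta, eta):
    """Probability of surviving to time t for a Weibull life distribution."""
    return math.exp(-((t / eta) ** beta))


# Hypothetical values for illustration only.
MEAN_LIFE_HOURS = 50_000.0
CASES = {
    "early life (beta=0.5)": 0.5,     # decreasing failure rate
    "constant rate (beta=1.0)": 1.0,  # exponential distribution
    "wear out (beta=3.0)": 3.0,       # increasing failure rate
}
TIME_POINTS = [1_000.0, 8_760.0, 50_000.0]


def reliability_table(mean_life=MEAN_LIFE_HOURS, cases=CASES, times=TIME_POINTS):
    """Reliability at each time point for each case, all sharing one mean life."""
    table = {}
    for label, beta in cases.items():
        eta = weibull_eta_for_mean(mean_life, beta)
        table[label] = [weibull_reliability(t, beta, eta) for t in times]
    return table


if __name__ == "__main__":
    for label, values in reliability_table().items():
        row = "  ".join(
            f"R({t:,.0f} h)={r:.3f}" for t, r in zip(TIME_POINTS, values)
        )
        print(f"{label:<26} {row}")
```

All three cases report the same MTBF, yet the fraction surviving the first 1,000 hours differs widely between them. That difference is the information a decision maker needs, and it only appears when reliability is stated at specific time points.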
Vendors want to provide the right components to meet the design's intent. They want to accommodate requests for reliability and often provide MTBF, as that seems to satisfy most requests. By asking for reliability, i.e., not MTBF, you can learn more about how the component may actually perform in your application.
In each of these cases, and in others, we're talking about reliability, so use the probability of successful operation over a duration for a given function and environment. Help those interested in creating or using a reliable product actually make useful and meaningful decisions with a clear measure.
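When someone hands you an MTBF, one way to turn it into a probability over a duration is sketched below. It assumes a constant failure rate (an exponential life distribution), which is the assumption usually buried in an MTBF value and which may not hold for your product. The function names and the 50,000 hour figure are hypothetical.

```python
import math


def reliability_from_mtbf(mtbf_hours, duration_hours):
    """Reliability over a duration, ASSUMING a constant failure rate."""
    if mtbf_hours <= 0 or duration_hours < 0:
        raise ValueError("mtbf_hours must be > 0 and duration_hours >= 0")
    return math.exp(-duration_hours / mtbf_hours)


def mtbf_from_reliability(reliability, duration_hours):
    """MTBF implied by a reliability goal, under the same assumption."""
    if not 0.0 < reliability < 1.0:
        raise ValueError("reliability must be strictly between 0 and 1")
    return -duration_hours / math.log(reliability)


if __name__ == "__main__":
    mtbf = 50_000.0        # hypothetical vendor claim, in hours
    one_year = 8_760.0     # hours of continuous operation in one year

    # Chance a unit is still working when it reaches the MTBF itself.
    print(f"{reliability_from_mtbf(mtbf, mtbf):.3f}")  # prints 0.368

    # Chance a unit survives one year of continuous use.
    print(f"{reliability_from_mtbf(mtbf, one_year):.3f}")

    # MTBF needed to support a goal of 95% surviving one year.
    print(f"{mtbf_from_reliability(0.95, one_year):,.0f}")
```

Under this assumption only about 37% of units are still operating at the MTBF, which is rarely what the person asking had in mind. Stating the goal directly as a probability over a duration avoids that misreading.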
Summary
The same logic holds for any reliability activity: first think about the decisions that depend on the results of the activity. Then craft activities that fit within the constraints and deliver suitable results to assist in better decision making.
From goal setting to FMEAs, to HALT, to field data analysis – if no one is looking for the results, then don’t do it. Help your team by improving the information they have to make decisions.
Larry George says
Thanks for the good advice. Now I don’t have to write about that. Ask, “Why are we doing this?” “Where could we get the same or better information?” “How much would that cost?” “How much could we save if we had that information?”
Don’t ask vendors for the field reliability of their products; they probably don’t know. (My experience) Offer to compare the field reliability of their products in your products vs. the population field reliability of their products in all their customers. Ask vendors for their periodic product ships (sales, installed base, etc.) and returns (complaints, failures, even spares sales) from all their customers. That’s data required by GAAP to compute revenue (sales*price) and service costs. Compute nonparametric reliability estimates for vendor products from population data and from your own data. Compare.
Believe it or not, I did this for an HP client.
I also did it for another company and found their parts reliability was somewhat worse than the parts vendor’s population reliability. I asked the company’s chief engineer why. He said their products could be sold for a cheaper price and that they also make revenue and profit off of field service to replace the abused parts. Have it both ways.
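As a rough illustration of the ships-and-returns comparison described in the comment above, here is a naive sketch. It is not necessarily the estimator the commenter used. It assumes that expected returns in each period equal the sum of earlier shipments times an age-specific failure probability, that returned units are not put back into service, and that the first period has nonzero shipments. The function names and all numbers are hypothetical. With noisy real data this simple deconvolution can produce negative or erratic estimates, so treat it as a starting point only.

```python
def failure_probabilities(ships, returns):
    """Estimate p[k], the probability a unit is returned at age k periods.

    Solves returns[t] = sum(ships[t - k] * p[k] for k in 0..t) one period
    at a time. Assumes no replacements re-enter service and ships[0] > 0.
    """
    if len(ships) != len(returns):
        raise ValueError("ships and returns must cover the same periods")
    if not ships or ships[0] <= 0:
        raise ValueError("the first period must have shipments")
    p = []
    for t in range(len(returns)):
        # Returns already explained by younger failure ages of later cohorts.
        explained = sum(ships[t - k] * p[k] for k in range(t))
        p.append((returns[t] - explained) / ships[0])
    return p


def reliability_by_age(p):
    """Fraction of units surviving through each age, from the p[k] estimates."""
    survivors, remaining = [], 1.0
    for pk in p:
        remaining -= pk
        survivors.append(remaining)
    return survivors


if __name__ == "__main__":
    # Hypothetical periodic counts: vendor-wide population vs. our own units.
    vendor_ships, vendor_returns = [100, 200, 150, 300], [1.0, 4.0, 8.5, 16.0]
    our_ships, our_returns = [50, 50, 50, 50], [1.0, 3.0, 6.0, 10.0]

    vendor_r = reliability_by_age(failure_probabilities(vendor_ships, vendor_returns))
    our_r = reliability_by_age(failure_probabilities(our_ships, our_returns))
    for age, (rv, ro) in enumerate(zip(vendor_r, our_r)):
        print(f"age {age}: population R={rv:.3f}  our units R={ro:.3f}")
```

Comparing the two reliability columns age by age shows whether the vendor's part performs worse in your product than in the vendor's overall population, which is the question the comparison is meant to answer.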