We rely on data to make decisions, to reveal patterns or trends, and to learn about our systems and world. Data has many forms and sources. Reliability data may tell us what will fail and/or when a device will fail.
Data is rarely complete, accurate, or without errors. Get used to it. Finding the sources of errors and creating systems to gather data helps. I find that the first time tracking down the data and doing a little analysis is often the hardest part.
Ireson et al. (Handbook of Reliability Engineering and Management) list 10 sources for reliability data.
1. Research tests — often done on new materials or processes to characterize potential reliability failure mechanisms. Some of the results are published in technical conference proceedings or journals. Encourage your research and development teams to conduct these reliability tests, as they provide a potential failure mechanism model that will save time later when working toward product launch.
2. Prototype tests — this includes tests on prototypes that are not specifically part of a reliability-related evaluation. Any failure is useful information when assessing a design’s reliability. Prototypes are expensive and available in limited quantities, so focusing the testing on the highest-risk areas may generate the most value. Another use is to discover failure mechanisms (HALT is an example of a discovery test).
3. Environmental tests — products have to work in the environment and under the use conditions expected by the customer. Weather (temperature, humidity, rain, salt fog, etc.) and use (cycles/day, load profile, user interface visibility, etc.) can lead to a long list of specific stress tests. Often these tests are done in sequence on prototypes to minimize the number of units needed for testing. Early tests in the sequence are non-destructive and cause minimal damage that could affect later stress tests. For example, use UV exposure to check for color changes over time, then do power line testing to check the power supply’s response to expected variations in the supply of electricity.
4. Development tests — these can take many forms. Generally, these are to answer a question for the development team. For reliability, the question is often: will this work long enough? These tests should focus the stresses to excite specific failure mechanisms. When done well, these are well-designed accelerated life tests. Yet, any testing done during development reveals information about what may fail.
5. Qualification tests — once development is nearing completion, it’s time to make sure the product really works. These tests generally use adequate sample sizes and/or a full range of stresses. Generally, though not always, the test is designed to show a minimum reliability using a no-failure test (minimized sample size and test duration). Tests with failures provide more statistical power and more information, yet are not common.
6. Reliability growth tests — useful for software and for complex systems (those that fail relatively often). These generally involve using the product as intended, as close to use conditions as possible (or under very controlled and well-understood accelerated conditions), then plotting the cumulative number of faults over time to determine if sufficient improvement in the design has occurred.
7. Production assessment tests — other than individually crafted products, a design is destined to be realized in a production facility. As a young manufacturing engineer in a plant, I quickly realized that manufacturing can only make a product less reliable, and never (or very rarely) as good as the design. The supply chain and production processes introduce variability and potential component and material defects. Again, these tests are best done to evaluate specific failure mechanisms (or to discover failure mechanisms), along with evaluating long-term reliability performance.
8. Acceptance tests — some customers specify a set of evaluations to be performed before they agree to receive the product. These tests can range from a demonstration to long-term reliability performance. It is best to avoid surprises at this stage by fully understanding the range of potential failures earlier in the program.
9. Tests on purchased items — this is a check on the supply chain. Rarely done, in my experience, unless there is a reason to perform the check, as it is often expensive and may cause component damage. Sometimes a necessary evil. Another situation is buying a complete system that you are going to resell under your brand; testing might be due diligence in this case. A third situation is sampling items for testing to keep a modicum of monitoring in place.
10. Tests on failed items — at all stages, doing a complete failure analysis (FA) and testing to validate the FA is essential. Determine the root cause and prove the solution.
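The no-failure qualification test in item 5 is often sized with the success-run (zero-failure binomial) formula. The following is a minimal sketch, assuming each unit is an independent pass/fail trial over the full test duration; the function name and the reliability/confidence pairs are illustrative choices, not taken from Ireson et al.

```python
import math


def zero_failure_sample_size(reliability, confidence):
    """Units that must all pass to claim `reliability` at `confidence`.

    Assumes independent pass/fail units (binomial model) and zero failures.
    """
    if not (0 < reliability < 1 and 0 < confidence < 1):
        raise ValueError("reliability and confidence must be between 0 and 1")
    # With n units and no failures, confidence = 1 - reliability**n.
    # Solve for n and round up to a whole unit.
    return math.ceil(math.log(1 - confidence) / math.log(reliability))


# Illustrative (hypothetical) targets: (reliability, confidence)
for r, c in [(0.90, 0.90), (0.95, 0.90), (0.99, 0.95)]:
    n = zero_failure_sample_size(r, c)
    print(f"R={r:.2f}, C={c:.2f}: {n} units with zero failures")
```

The sample size climbs quickly as the reliability target rises, which is one reason these tests reveal little beyond pass/fail compared with tests that produce failures.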
And I would add feedback on failures gathered along the product lifecycle, including:
11. Defect tracking during development — the database of discovered failures/faults provides a rich source of reliability data. This speaks more to what could fail and to potential failure mechanisms than to the ‘when will it fail?’ question.
12. Defect tracking during manufacturing — as in development, here the focus is on supply chain and manufacturing variability and the resulting impact on product performance.
13. Call center data — very noisy information, as the goal of the call center team is to assist customers, not to start the failure analysis process. With a little work, the call center data can provide insights into the failure symptoms customers are experiencing. Plus, there is the ability to ask the customer to return the product for further study. The data alone would take careful screening to understand.
14. Product returns — this is by far the best data on how your product reliability is doing. Even no-trouble-found returns provide information if you also consider the overall customer experience. Time to failure, symptoms, and root cause improve this data. It arrives too late for this program unless continuous improvement is of value. Very useful for similar products in development.
15. Customer satisfaction surveys — not done in all organizations or industries, yet feedback from your customers is often a true indicator of satisfaction with product reliability. Work with the customer-facing part of your organization to get better reliability-related information.
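When returns include time to failure (item 14), one common first look is a Weibull fit. The following is a rough sketch using median rank regression with Benard's approximation. It assumes every unit in the sample has failed; real return data usually includes units still working in the field (censored data), which this sketch ignores. The function name and the days-to-failure values are hypothetical.

```python
import math


def weibull_mrr(times):
    """Fit a 2-parameter Weibull by median rank regression.

    Returns (beta, eta): shape and scale. Assumes complete (uncensored) data.
    """
    t = sorted(times)
    n = len(t)
    if n < 2 or t[0] <= 0:
        raise ValueError("need at least two positive failure times")
    xs = [math.log(v) for v in t]
    # Benard's approximation to the median rank of the i-th ordered failure,
    # transformed so a Weibull distribution plots as a straight line.
    ys = [math.log(-math.log(1 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("failure times must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                          # slope of the fitted line
    eta = math.exp(mean_x - mean_y / beta)    # from intercept = -beta * ln(eta)
    return beta, eta


# Hypothetical days-to-failure taken from returned units
days_to_failure = [45, 120, 210, 260, 340, 410, 520]
beta, eta = weibull_mrr(days_to_failure)
print(f"shape beta = {beta:.2f}, scale eta = {eta:.0f} days")
```

A shape value below 1 suggests early-life failures, near 1 roughly random failures, and above 1 wear-out, which helps connect the return data to the 'when will it fail?' question.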
Summary
There is potentially a lot of data. Not all of it is easy to retrieve or use, yet all of it holds information useful to guide the creation and maintenance of reliable products.