Reliability Engineering is About Answering Questions
Engineers solve problems. We optimize solutions.
Engineering starts with a question, and the work of engineering is answering it. Can we create an antenna with enough range? How can we make a safe self-driving car? How much can a delivery drone carry if it has a range of 100 miles?
Reliability engineers are no different. We ask questions and work to answer them, solving problems in pursuit of providing our customers with reliable solutions.
In general, a reliability engineer addresses only a few types of questions: What will fail? When will it fail? What is the impact of a failure?
The answers are used to design reliable products, optimize supply chains and assembly processes, refine warranty accruals, and identify significant business risks.
1 – What Will Fail?
My favorite way to understand a product is to break it. What can it do, and what can it not do? What happens when it gets wet? How well does it perform when I wear gloves? And so on.
Engineers can solve problems when they know the details of the problem. Knowing how a product fails permits creating a more robust product if desired. Without that knowledge, working to create a ‘more robust or more reliable’ product rests on a collection of opinions and educated guesses about what will achieve the goal.
Knowing what fails and under what conditions provides the necessary details to craft solutions.
Knowing how a product fails includes knowing whether the failure leads to a safety problem. It also includes knowing whether a change to the design, component variability, process control, or something else is likely to provide a solution.
To answer the question of ‘what will fail’ we use tools like FMEA, HALT, and environmental testing. We capture information on weaknesses found in simulation, prototype failures, assembly line and component variation, the results of vendor, internal, and customer testing, and customer site testing and use.
Potential and actual failures reveal what will fail.
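One common way to organize the output of an FMEA is to score each failure mode and rank the list. The sketch below assumes the conventional 1 to 10 scales for severity, occurrence, and detection and uses their product, the risk priority number (RPN), as the ranking key. The failure modes and scores are hypothetical and only illustrate the bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (negligible effect) to 10 (hazardous)
    occurrence: int  # 1 (remote) to 10 (almost certain)
    detection: int   # 1 (almost always caught) to 10 (almost never caught)

    @property
    def rpn(self) -> int:
        # Risk priority number: the product of the three scores.
        return self.severity * self.occurrence * self.detection


def rank_failure_modes(modes):
    """Return the failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical entries, not taken from any real product.
modes = [
    FailureMode("button wear", 3, 6, 2),
    FailureMode("solder joint crack", 7, 4, 6),
    FailureMode("support bracket corrosion", 8, 3, 3),
]

ranked = rank_failure_modes(modes)
for m in ranked:
    print(f"{m.name}: RPN = {m.rpn}")
```

A ranking like this is only a starting point for discussion. A failure mode with a high severity score, especially a safety hazard, usually deserves attention even when its RPN is low.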
2 – When Will It Fail?
Once we understand what will fail, the next question involves when that failure will occur. How long before these solder joints crack? How long before corrosion leads to failure of this support? How long before this material wears away to the point defined as a failure?
Customers may expect consumer products to last 5 years, cars 10 years, and solar panels 25 years. If too many product failures occur within the warranty period, or sooner than customers expect, that is not good for your organization or brand.
If you have reliability goals, and you should, then it is natural to compare the estimates of time to failure for your product with those goals. Will the field failure rate be low enough to meet our projected warranty expenses? Will customers experience the expected low cost of ownership due to few if any system failures?
During the concept and early design stages of product development, estimates of when the product will fail draw on:
- Engineering judgment
- Similar product/subsystem comparisons
- Physics of failure models and simulations
During lifecycle stages with prototypes, the estimates may include the use of:
- Reliability Life Testing
- Environment/Stress Testing
- Accelerated Life Testing
During the production phase, we use field data analysis to finalize our estimates of when the product will fail.
Knowing when something will fail helps us know whether it is reliable enough yet.
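As an illustration of comparing a time-to-failure estimate with a goal, the sketch below assumes failure times follow a two-parameter Weibull distribution, a common choice in life data analysis, with reliability R(t) = exp(-(t/eta)^beta). The shape (beta), scale (eta), warranty period, and goal are hypothetical values; in practice they would come from life testing or field data and from your own reliability goals.

```python
import math


def reliability(t, beta, eta):
    """Probability of surviving to time t under a Weibull(beta, eta) model."""
    return math.exp(-((t / eta) ** beta))


def time_to_reliability(target, beta, eta):
    """Time at which reliability has dropped to the target value."""
    return eta * (-math.log(target)) ** (1.0 / beta)


# Hypothetical inputs: time is measured in years.
beta = 1.5           # shape > 1 suggests wear-out behavior
eta = 20.0           # characteristic life
warranty_years = 5.0
goal = 0.90          # desired fraction surviving the warranty period

r_warranty = reliability(warranty_years, beta, eta)
meets_goal = r_warranty >= goal
t_goal = time_to_reliability(goal, beta, eta)

print(f"Reliability at {warranty_years} years: {r_warranty:.3f}")
print(f"Meets goal of {goal:.2f}: {meets_goal}")
print(f"Reliability falls to {goal:.2f} at about {t_goal:.1f} years")
```

With these assumed numbers the estimated reliability at 5 years is about 0.88, short of the 0.90 goal, so the answer to "is it reliable enough yet?" would be no.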
3 – What is the Impact of a Failure?
Not all failures are the same. Sometimes, what is technically a failure goes unnoticed or is so minor that it does not warrant any action. Sometimes a failure causes catastrophic damage and loss of life.
Knowing what will fail and when it will fail often leads us to the ‘so what’ question. What happens if this fails? What is the consequence to us, our organization, the customer, their organization, and society?
The cost of a failure is not simply the cost to replace or repair the faulty item. Your customers use your product to provide a function, to solve a problem, to accomplish their business or home tasks. Consider the potential impact of a smoke alarm not functioning when it should during the initial stages of a house fire.
Consider what happens when your product doesn’t work as expected, for as long as expected, or reliably. At some point, the accumulation of failures erodes customer confidence in the solution and in your brand’s ability to create a valuable product. Sales become more difficult and expensive, market share erodes, and so on.
Another consideration around the impact is safety. Even one failure that is a serious safety hazard is one too many.
Also, consider the magnitude of the potential failures. Is the failure likely to occur in just a few isolated cases or situations, or will it occur in most or every product? A relatively minor annoyance that occurs on every product every day could be a major problem. Product returns that occur twice as often as expected may destroy profitability.
Consider the impact of a failure including the financial and customer satisfaction impacts.
This information, along with what will fail and when it will fail, permits you and your team to address the critical few problems and make significant strides toward delivering a reliable product to the market.
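To make the financial side of impact concrete, here is a minimal sketch of how a doubled return rate can erase profit. Every number is hypothetical (units shipped, margin per unit, cost per return, and return rates), and the model deliberately ignores indirect costs such as lost sales and brand damage, which the discussion above suggests can be larger still.

```python
def profit_after_returns(units, margin_per_unit, return_rate, cost_per_return):
    """Gross margin minus the direct cost of handling returned units."""
    gross_margin = units * margin_per_unit
    return_cost = units * return_rate * cost_per_return
    return gross_margin - return_cost


# Hypothetical product line.
units = 100_000
margin_per_unit = 2.00   # profit per unit before any returns
cost_per_return = 50.00  # shipping, replacement, and support per return

expected = profit_after_returns(units, margin_per_unit, 0.02, cost_per_return)
doubled = profit_after_returns(units, margin_per_unit, 0.04, cost_per_return)

print(f"Profit at the expected 2% return rate: {expected:,.0f}")
print(f"Profit at a doubled 4% return rate: {doubled:,.0f}")
```

In this made-up case, a 2% return rate leaves half of the gross margin, while a 4% return rate leaves essentially none.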
Focus on Answering the Question
Everything we do as reliability engineers should add value. By focusing on the questions and answering them well, we add value.
Our work enables the entire team to address and solve problems that affect product reliability. By addressing the what, when, and impact questions, and helping our teams ask these questions, we instill a culture of reliability across the organization.
How does your organization ask these fundamental reliability questions? How do they get the information necessary to answer them?
Also published on Medium.