Reliability engineering is a blend of disciplines from material science to asset management. We use problem-solving, design, maintenance, and statistical tools on a regular basis, yet that is not the only thing we do.
I have met a few engineers who define their role as a reliability engineer as conducting only HALT or FMEA, which makes me wonder what most people believe we do, or should do, as reliability engineers. It is true that someone may specialize, by choice or chance, in one tool, yet even then is that all they do?
I don’t think so.
A Typical Day
Let’s follow a couple of imaginary engineers through their typical day. Meet Bill and Sue; they work at different companies that each make a range of similar products. Bill and Sue work with multiple design and manufacturing teams as the assigned reliability engineer.
Bill has a HALT focus
Bill found that HALT (Highly Accelerated Life Test) testing early in a program has been very successful at finding design defects and providing a focus for manufacturing improvements, so he regularly recommends and conducts HALT. At the start of his typical day, he checks his messages and finds three requests from the three teams he supports.
- Provide an update on Shasta testing – is it ready to start shipping?
- Draft a reliability and environmental test plan for Whitney – how many samples?
- Any input on the reliability goals for the new program Hood?
For the first request, he reviews the three rounds of HALT testing and the plans for HASS (Highly Accelerated Stress Screening), creates a short summary and replies that the product is ready. When asked what the expected field failure rate will be, Bill says HALT does not provide that information, yet he is confident that the product will be as reliable or better than existing products given the team’s response to HALT findings.
Later in the morning, Bill attends the Whitney team meeting with an outline of the testing plan. He recommends multiple rounds of HALT and a few extra units for specific environmental testing done in a HALT fashion. He requests a total of ten prototypes. The Whitney team lead wants to be sure the testing includes any new anticipated stresses since this is the first outdoor environment for one of their products. After some discussion, Bill has the assignment to investigate methods to conduct margin testing for rain, snow, and ice.
After lunch, the program manager for the Hood program calls Bill for input on the product goals. Since Bill tends to spend most of his time in the HALT lab, he hasn’t read the marketing or financial reports, nor the draft reliability goal statements. He doesn’t need those values for his work, so he tends to ignore them. He says the Hood goals are fine by him.
The rest of the day is spent conducting HALT and drafting reports on the results.
Sue has a customer focus
Sue found that helping her teams make good decisions while considering reliability related information helped the team create reliable products. She found that a range of tools provided a range of solutions, and fitting the right tool for a given situation often yielded the desired result.
Like Bill, Sue started her day checking messages. She too received three requests.
- Provide an update on Columbia testing – is it ready to start shipping?
- Draft a reliability and environmental test plan for Rogue – how many samples?
- Any input on the reliability goals for the new program Klamath?
For the first request, she reviews the progress the team has made identifying risks using FMEA and HALT, then the progress in collecting and creating the time-to-failure information needed for the system reliability block diagram (RBD), and finally the shortlist of critical-to-reliability items the supply chain and manufacturing teams will need to monitor.
The team already knew the product’s latest reliability estimates, and Sue took the action item to review the failure rate projections with the finance team as they established warranty accruals.
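As a minimal sketch of the arithmetic behind a review like this, assume a series reliability block diagram with constant (exponential) failure rates; every failure rate, unit count, and cost below is a hypothetical placeholder, not data from the story:

```python
import math

def series_reliability(failure_rates, hours):
    """Reliability of a series RBD: every block must survive.
    Assumes constant failure rates (exponential model), so rates add."""
    total_rate = sum(failure_rates)
    return math.exp(-total_rate * hours)

# Hypothetical per-block failure rates (failures per hour)
rates = [2e-6, 5e-7, 1e-6]
warranty_hours = 8760  # one year of continuous operation

r = series_reliability(rates, warranty_hours)
claim_probability = 1 - r  # chance a unit fails within the warranty period

units_shipped = 10_000
cost_per_claim = 120.0  # hypothetical service/replacement cost
accrual = units_shipped * claim_probability * cost_per_claim
print(f"Warranty-period reliability: {r:.4f}")
print(f"Suggested warranty accrual: ${accrual:,.0f}")
```

The point of the exercise is the handoff Sue makes: the same failure rate projection that drives the RBD also sizes the warranty reserve for finance.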
Later in the morning, at the Rogue team meeting, Sue led the discussion on the expected test plan and the major decision points that may alter the plan based on the results of the risk analysis tools (FMEA and HALT). Given a new technology being used for the first time on Rogue, they discussed the need to characterize the time-to-failure distribution or model for the new style of solder joint attachment. The team needed to know by April 1st whether the new idea was reliable enough for use in the product, so they had time to pursue an alternative route just in case. Sue spent the rest of the morning drafting an accelerated life test plan that would meet the timeline, budget, and accuracy required for the team to make the right decision.
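One building block of an accelerated life test plan like Sue’s is the acceleration factor. As an illustration only, here is the Arrhenius temperature-acceleration model; the activation energy and temperatures are hypothetical, and a real solder-joint plan might instead use a thermal-cycling model such as Coffin–Manson:

```python
import math

K_B = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_af(ea_ev, use_temp_c, test_temp_c):
    """Acceleration factor between use and test temperatures
    under the Arrhenius model for a thermally activated mechanism."""
    t_use = use_temp_c + 273.15   # convert to kelvin
    t_test = test_temp_c + 273.15
    return math.exp((ea_ev / K_B) * (1 / t_use - 1 / t_test))

# Hypothetical inputs: 0.7 eV activation energy, 40 C use, 100 C test
af = arrhenius_af(0.7, 40.0, 100.0)
test_hours = 8760 / af  # test time equivalent to one year of use
print(f"Acceleration factor: {af:.1f}")
print(f"Test hours to cover one year of use: {test_hours:.0f}")
```

A calculation like this is what lets a test plan trade lab time against the April 1st deadline.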
Just after lunch, the program manager for the Klamath project asked for input on the goal statements for the new project. Because the goals and their apportionment guide the entire team, she spent the next half hour with the program manager reviewing the four elements of the goal in detail. They reviewed the functional requirements in the document and agreed on the primary (most important) functions to provide a focus for reliability evaluations. They discussed what was known and unknown about the expected use environment and conditions, including the expected rate of use. This section would require more information, so Sue took the action item to work with the customer service and marketing teams to find additional information.
Then they reviewed the three couplets of probability of success and duration. Senior management wanted fewer early life failures (failures in the first three months), as a recent product had a higher early failure rate and a noticeable impact on sales due to word-of-mouth recommendations.
They set an aggressive goal and quickly outlined steps to achieve it. Then they discussed the project’s overall business plan, including pricing and the cost of service calls and replacements. Given the overall business objectives and customer expectations concerning warranty, they set a couplet of reliability and duration related to the warranty period.
Finally, they discussed how long customers generally expect to use their products and the relationship between long-term reliability and brand loyalty. Also considering the technology planned for the new product they set a reliability target and duration. They noted that elements of the new design may require long-term reliability analysis given their current uncertainty.
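To make the couplet idea concrete, each (reliability, duration) pair implies a failure rate that the design must meet. A minimal sketch, assuming an exponential time-to-failure model; the three couplets below are invented for illustration, not the Klamath goals:

```python
import math

def rate_from_couplet(reliability, hours):
    """Constant failure rate implied by a (reliability, duration) couplet,
    assuming an exponential time-to-failure model: R(t) = exp(-lambda*t)."""
    return -math.log(reliability) / hours

# Hypothetical goal couplets: (reliability, duration in hours)
couplets = {
    "early life (3 months)": (0.995, 2190),
    "warranty (1 year)":     (0.98, 8760),
    "design life (5 years)": (0.90, 43800),
}

for name, (r, t) in couplets.items():
    lam = rate_from_couplet(r, t)
    print(f"{name}: implied rate {lam:.2e}/hr, MTBF {1/lam:,.0f} hr")
```

Comparing the implied rates shows which couplet is the binding constraint, which is exactly the kind of check a goal-setting discussion needs.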
Summary and Results
In both cases, Bill and Sue served their project teams well, and all teams delivered reliable products to the field. Bill was sought after for HALT testing advice and gladly helped promote this useful tool for improving reliability performance. Sue found she was asked about service plans, design aspects related to product ease of use, and the use of reliability information in financial modeling. They both spent plenty of time in the lab evaluating products, finding opportunities for improvement, and influencing their teams.
Working with the team
In both cases, whether focusing on one tool or a range of tools, the key element is they worked with teams to help them make decisions. They presented clear results and regularly asked the ‘what if’ and ‘what is the risk’ type questions. They both guided their teams to create reliable products.
Just using tools, whether HALT or RBD modeling, limits the value of a reliability engineer. They need to be part of the team, engaged with decision making, and sought after for advice.
Sure, I prefer the range of tools approach as illustrated by Sue, yet Bill’s approach works just fine. Sue enjoyed a broader scope and additional influence as she used tools from failure analysis to financial modeling.
There is no one way to run a reliability program or be a reliability engineer. There are many paths to creating a reliable product. To a large degree, it’s not the tools we use, it is the influence that we wield that makes the difference.