How to Assess Your Reliability Program

“How do you know so much about our program?” was a question the quality manager asked after reading the assessment report. The assessment took one day with eight interviews.

The resulting reliability is going to happen whether or not the team designing the product or production line deliberately uses reliability engineering tools. The elements of a product or system will respond to the environment and either work or fail.

While working at Hewlett-Packard, I had the opportunity to conduct reliability program assessments of about 50 product divisions. One hypothesis was that the number of reliability tasks a team actively used would correlate with its warranty expenses.

That worked to a point.

Use the right tool at the right time

The teams that did not understand basic tools and did no overt or organized reliability engineering had high warranty expenses (as a percentage of revenue). The teams that did a large number of tasks (FMEA, HALT, ALT, predictions, etc.) did have lower warranty expenses.

The surprise was that the teams with the lowest warranty expenses also did very few reliability activities. The difference was that the best-performing teams understood the range of available reliability engineering activities and used only the tools that would provide value in a given circumstance.

A less mature organization would attempt to do as many reliability-related activities as possible, including a long list of product tests, many of which provided little actual value.

It was the application of the right tool at the right time that made the difference.

Reliability maturity matrix from FMS Reliability

You may also download a PDF of the matrix if you are a member of the site (membership is free).

Maturity and Activity

Hiring a reliability engineer or running a lot of life tests does not improve your product’s reliability performance. It is not the organization or the activities you call a reliability program that determine your reliability performance; rather, performance relates to the culture concerning reliability.

Reliability occurs at the point of decision. Therefore, during interviews, the intent is to understand how decisions are currently made.

To what extent do reliability considerations influence decisions and what tools or methods are used to form decisions?

Understanding reliability application

For example, suppose we ask, “To what extent do you do HALT?” (HALT being Highly Accelerated Life Testing). In two different circumstances the answer may be the same: “We rarely use HALT.”

In one case, the engineer doesn’t know what HALT is and isn’t sure whether the testing they do is similar to HALT. Or they don’t do HALT because they are unfamiliar with that type of testing.

In another case, the engineer knows what HALT is and how and why it is used, and says they have rarely used HALT because they have not had appropriate situations for it. They understand that it is a useful tool for specific applications and recently have not needed to conduct HALT.

Some respond that they do HALT. Again, there are two common responses. In one case, the team does HALT all the time because it is required, regardless of whether it is useful. In the other case, they do HALT because it is the right tool for the current situation.

One team didn’t know what HALT was, and the other fully understood it and chose not to do HALT. The difference is the understanding and application, or maturity.
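The four HALT answers described above can be laid out as a small lookup. This is only a sketch to make the distinction concrete: the function name `interpret` and its two yes/no inputs are assumptions made for illustration, and a real interview yields far more nuance than two flags.

```python
def interpret(uses_tool: bool, decides_by_situation: bool) -> str:
    """Classify an answer about a tool such as HALT (hypothetical helper).

    uses_tool: the interviewee says the team does use the tool.
    decides_by_situation: the interviewee understands the tool and chooses
        to use it, or not, based on whether it fits the circumstances.
    """
    if not uses_tool and not decides_by_situation:
        return "unfamiliar: rarely used because the tool is not understood"
    if not uses_tool and decides_by_situation:
        return "deliberate: understood, but no recent situation called for it"
    if uses_tool and not decides_by_situation:
        return "by rote: used all the time because it is required"
    return "deliberate: used because it is the right tool for the situation"


# The same surface answer ("We rarely use HALT") leads to different readings.
print(interpret(uses_tool=False, decides_by_situation=False))
print(interpret(uses_tool=False, decides_by_situation=True))
```

The point of the sketch is that whether a tool is used says little on its own; the second input, how the decision is made, is what separates the more mature teams.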

Assessment Process

To understand an organization’s reliability maturity, use the following assessment process.

1. Select survey topics. This is a list of activities and tools common to reliability practices in your industry. It may include items rarely used. It should include the breadth of topics related to reliability in your field. See the DFR Methods Survey for one possible list of topics.

DFR Methods Survey 2014 from FMS Reliability

Some topics are broad, such as ownership of and responsibility for product reliability, or whether management takes a reactive or proactive approach. Other topics are very specific, such as individual tools like FMEA or HALT.

2. Establish the interview format. Options include one-to-one, small group, phone, an invited survey with follow-up conversations, or some other method. I have found one-to-one discussions the most useful, as they permit immediate follow-up and exploration of the rationale or motivation behind specific behaviors or responses.

3. Conduct the interviews (collect information). Arrange to interview or survey a cross-section of people in the organization. Select individuals with experience of the organization and of how its products are typically designed and manufactured. My recommended list of positions comprises:

Design & development engineers (electrical, mechanical, software)
Design & development managers (electrical, mechanical, software)
Reliability or Quality engineers and/or managers
Procurement engineering (someone that works with suppliers)
Manufacturing engineering and/or managers (design for manufacturing, sustaining and/or production engineering)

Select about 8 individuals for interviews, more or fewer depending on the specific situation, size, complexity, etc. of the program.

In general, each interview question is based on the phrase “to what extent.” For example, you might ask, “To what extent do you use HALT?” Depending on the response, you may explore the motivations or rationale behind the decision to conduct HALT, and how the HALT results are used within the organization.

4. Document the business environment: volume, cost, brand position, revenue, the cost of unreliability as a percentage of net revenue, etc. Document any regulatory or customer-imposed restrictions or requirements. Summarize to convey the atmosphere around the reliability program.

5. Document the collected information. Sending a summary back to each participant and asking for additional input or corrections helps with acceptance of the assessment results, and may help you avoid a mistake in your understanding.

6. Analyze the data. This is not done during the interviews; let them talk. Afterward, review the notes and information provided and map them to the maturity matrix. Look for consistent approaches to making reliability-related decisions. Look for patterns of behavior and underlying motivations or causes.

7. Report on assessment findings. Document and explain what you heard and how it relates to the organization’s overall maturity. The report may include the interview summary, strengths, weaknesses, and recommendations for improvement.
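One way to organize the analysis in step 6 is to give each interviewee’s answer on a topic a maturity score, then look at the typical score and the spread per topic. The sketch below is only an illustration: the 1-to-5 scale, the `summarize` function, the `spread_limit` threshold, and the sample scores are all assumptions made for the example, not part of the maturity matrix or taken from any real assessment.

```python
from statistics import median


def summarize(scores_by_topic: dict[str, list[int]], spread_limit: int = 2) -> dict[str, dict]:
    """Summarize per-topic maturity scores gathered from interview notes.

    scores_by_topic maps a survey topic (e.g. "HALT") to one score per
    interviewee on an assumed 1 (least mature) to 5 (most mature) scale.
    A topic is flagged as inconsistent when the answers differ by
    spread_limit or more, which suggests the approach varies by person.
    """
    summary = {}
    for topic, scores in scores_by_topic.items():
        if not scores:
            continue  # nobody was asked about this topic
        spread = max(scores) - min(scores)
        summary[topic] = {
            "median": median(scores),
            "spread": spread,
            "consistent": spread < spread_limit,
        }
    return summary


# Made-up scores for illustration only.
example = {
    "HALT": [2, 2, 3, 2],
    "FMEA": [1, 4, 2, 5],
}
result = summarize(example)
for topic, stats in result.items():
    print(topic, stats)
```

A wide spread on a topic is a prompt to reread the notes for underlying motivations, not a finding in itself. The scores only help point to where the patterns might be.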

Summary

The assessment process should provide a view of the organization’s overall approach to making decisions, and of how and to what extent its reliability program influences those decisions.

With that basic understanding, you can identify strengths to build upon and weaknesses that need attention, and make recommendations to improve the maturity of the reliability program.

Comments

Peter de Place Rimmen says
January 8, 2016 at 12:08 AM
Hi Fred
It’s good information for inspiration in your articles.
I have just a small comment about HALT. I think it makes more sense if the abbreviation stands for Highly Accelerated LIMIT Test. HALT tests robustness, not life!
Best regards
Peter
- Fred Schenkelberg says
  January 8, 2016 at 10:44 AM
  Hi Peter, thanks for the comment. I like your idea and agree that HALT does explore limits, not life. Cheers, Fred