Understanding RPN Limitations – Problems and Solutions

In this week’s FMEA problems and solutions article, the intermediate problem challenges readers to prioritize a series of RPNs (with their corresponding S, O, and D). In the advanced problem, readers are asked to weigh in on a fictitious debate between advocates of traditional RPN and advocates of criticality assessment, which uses only severity and occurrence.

If you haven’t yet read the article titled “Prioritizing risk for corrective actions in an FMEA – Know before you go!”, you can access it by clicking on the link.

Beginner’s Problem

In an FMEA, which of the following is true about “Risk Priority Number (RPN)”? (Select the best answer.)

1. An “RPN” is the sum of Severity, Occurrence, and Detection rankings.
2. An “RPN” is the product of Severity and Occurrence rankings.
3. An “RPN” is the product of Severity, Occurrence, and Detection rankings.
4. None of the above.

Beginner’s Solution

1. An “RPN” is the sum of Severity, Occurrence, and Detection rankings. (False. An “RPN” is the product of Severity, Occurrence, and Detection rankings, not the sum.)
2. An “RPN” is the product of Severity and Occurrence rankings. (False. An “RPN” is the product of Severity, Occurrence, and Detection rankings.)
3. An “RPN” is the product of Severity, Occurrence, and Detection rankings. (True)
4. None of the above. (False)
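
To make the difference between the product and the sum concrete, here is a minimal Python sketch. The function name rpn and the 1-to-10 range check are assumptions made for this illustration; a company's own ranking scales may differ.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Return the Risk Priority Number: the product S x O x D."""
    rankings = {
        "severity": severity,
        "occurrence": occurrence,
        "detection": detection,
    }
    for name, value in rankings.items():
        # Assumption: each ranking is an integer on a 1-to-10 scale.
        if not 1 <= value <= 10:
            raise ValueError(f"{name} ranking must be between 1 and 10, got {value}")
    return severity * occurrence * detection


s, o, d = 7, 3, 2
print("Product (RPN):", rpn(s, o, d))   # 42
print("Sum (not an RPN):", s + o + d)   # 12
```

The same three rankings give 42 as a product and 12 as a sum, which is why answer 1 is false.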

Intermediate Problem

You are performing an FMEA on a bicycle brake cable. The team has identified two failure modes for one of the primary functions, with two causes for each of the two failure modes (see illustration).

This results in four RPNs:
A) RPN 42 (S = 7, O = 3, D = 2)
B) RPN 140 (S = 7, O = 5, D = 4)
C) RPN 200 (S = 10, O = 5, D = 4)
D) RPN 40 (S = 10, O = 2, D = 2)

Using the letters A, B, C, D, what is the priority sequence for addressing issues in this FMEA excerpt?

Intermediate Solution

C, D, B, A
Recall the rule that FMEA teams should always address high severity first, regardless of RPN value. Therefore, the first priority is severity 10, RPN 200, which is letter C. The next priority is severity 10, RPN 40, which is letter D. Once the high severity issues are addressed, the team can take up high RPNs. The next priority is RPN 140, which is letter B. If the team wishes, it can finally address RPN 42, which is letter A.
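
The ordering rule above can be sketched as a two-part sort key: high-severity items first, then descending RPN. This is only an illustration. The HIGH_SEVERITY threshold of 9 is an assumption (the article does not define where "high severity" begins), and the names Item and priority_key are hypothetical.

```python
from typing import NamedTuple

# Assumption: severity rankings of 9 or 10 count as "high severity".
HIGH_SEVERITY = 9


class Item(NamedTuple):
    label: str
    s: int  # severity ranking
    o: int  # occurrence ranking
    d: int  # detection ranking

    @property
    def rpn(self) -> int:
        return self.s * self.o * self.d


items = [
    Item("A", 7, 3, 2),
    Item("B", 7, 5, 4),
    Item("C", 10, 5, 4),
    Item("D", 10, 2, 2),
]


def priority_key(item: Item) -> tuple:
    # False sorts before True, so high-severity items come first;
    # within each group, the larger RPN comes first.
    return (item.s < HIGH_SEVERITY, -item.rpn)


sequence = [item.label for item in sorted(items, key=priority_key)]
print(", ".join(sequence))  # C, D, B, A
```

Sorting by RPN alone would give C, B, A, D, which pushes the severity 10 item D to the end of the list.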

Advanced Problem

A debate is raging in your company about the use of RPN. One side wants to adhere to the AIAG/SAE standards that recognize three risks: severity, occurrence and detection, and the resulting RPN value. The other side wants to limit risk characterization to severity and occurrence, instead using “criticality” (product of severity and occurrence).

Summarize briefly the pros and cons for both approaches.

Advanced Solution

There is no perfect answer to this problem. This debate has been going on for many years in the FMEA community. Use of RPN is predicated on the assumption that detection risk is sufficiently important that it needs to be addressed in FMEAs, and characterized in the risk priority number.

The argument in favor of including detection risk within FMEA risk prioritization goes something like this. Detection-type controls should be able to detect the failure mode/cause in order to ensure problems are not discovered by users, especially for higher-risk issues. Where detection-type controls are not able to detect the failure mode/cause, there is a potential for detection risk. FMEA teams can improve detection-type controls through the recommended actions column, in order to reduce detection risk and ensure anticipated problems are discovered through testing and analysis during product development.

The argument against including detection risk within FMEA risk prioritization focuses on the classical definition of risk. The classical characterization of risk from ISO standards says, “Risk is often expressed in terms of a combination of the consequences of an event and the associated likelihood of occurrence.” The argument continues with the limitations of the detection scales published by AIAG or SAE. Many consider the detection scales to be confusing and difficult to apply. Therefore, according to this argument, the use of “criticality” (SxO) is a better way to identify risk.

If RPN will be used, the company needs to understand the limitations of RPN and detection risk and act accordingly. For example, high severity must always be addressed regardless of RPN value. The company will need to develop a detection scale that makes sense, and detection-scale criteria verbiage that is clear and represents the varying risk due to likelihood of detection of failure modes and associated causes.

If “criticality” (SxO) is used, the company needs to understand the risk from lack of detection of failure modes/causes during product development, and find a way to address this risk. It is not acceptable for the end user to be the one to discover problems that were missed by testing during the product development timeframe.
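
To see where the two approaches agree and where they part ways, the short Python sketch below ranks items by RPN and by criticality. The brake cable values come from the intermediate problem; items X and Y are hypothetical and were invented only to show a case where detection changes the order. The function names are likewise assumptions for this illustration.

```python
def rpn(s: int, o: int, d: int) -> int:
    """Traditional RPN: severity x occurrence x detection."""
    return s * o * d


def criticality(s: int, o: int, d: int) -> int:
    """Criticality: severity x occurrence (detection is ignored)."""
    return s * o


def rank(items: dict, score) -> list:
    """Return item labels ordered from highest to lowest score."""
    return sorted(items, key=lambda label: score(*items[label]), reverse=True)


# (S, O, D) values from the intermediate problem.
brake_cable = {"A": (7, 3, 2), "B": (7, 5, 4), "C": (10, 5, 4), "D": (10, 2, 2)}

# Hypothetical items: same severity, but X is rarely caught by controls.
hypothetical = {"X": (8, 2, 9), "Y": (8, 4, 2)}

print(rank(brake_cable, rpn))           # ['C', 'B', 'A', 'D']
print(rank(brake_cable, criticality))   # ['C', 'B', 'A', 'D']
print(rank(hypothetical, rpn))          # ['X', 'Y']
print(rank(hypothetical, criticality))  # ['Y', 'X']
```

For the brake cable excerpt both metrics give the same order, and neither puts the severity 10 item D near the top, which is why the severity-first rule still applies. In the hypothetical pair, RPN ranks X first because of its poor detection ranking, while criticality ranks Y first because it occurs more often. That difference is the heart of the debate.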

Can you take into account reliability or durability functions in an FMEA? How can this be done? A reader asks this question, and it is discussed and answered in the next FMEA Q and A article.

About Carl S. Carlson

Carl S. Carlson is a consultant and instructor in the areas of FMEA, reliability program planning and other reliability engineering disciplines, supporting over one hundred clients from a wide cross-section of industries. He has 35 years of experience in reliability testing, engineering, and management positions, including senior consultant with ReliaSoft Corporation, and senior manager for the Advanced Reliability Group at General Motors.

Comments

apisak sri-amorntham says
March 29, 2020 at 9:28 PM
I have seen a website that describes RPN by considering only consequences, and it refers to this website.
What is your advice if we use only consequences to define criticality?
https://www.fiixsoftware.com/blog/criticality-analysis-what-is-it-and-how-is-it-done/
- Carl Carlson says
  March 30, 2020 at 11:40 AM
  Hello, and thanks for your question.
  The article you reference is written for maintenance applications. My initial comment is that Maintenance FMEAs are not the same as Design or Process FMEAs. Maintenance FMEAs share many of the fundamentals with Design and Process FMEAs, but there are significant differences. I cover the similarities and differences in chapter 15 of my book, Effective FMEAs.
  The author discusses two ways to carry out a criticality analysis.
  The first approach the author mentions is a visual grid approach where severity of a given consequence (on the X axis) is plotted against the probability of that consequence occurring (Y axis). This approach is used in many FMEA applications and is a good way to visually show risk on a matrix diagram. Two comments: FMEA does not calculate the probability of a consequence occurring. To get the probability of harm, it usually takes a Fault Tree Analysis or Probabilistic Risk Assessment. And, if you only focus on severity and occurrence, there is still risk related to detection.
  The second approach the author mentions is to separate the consequence categories by type (for example, health and safety, environmental, and operational), assess the severity of each type, and multiply the resulting ratings together. My personal view is this approach does not take into account the occurrence risk nor the detection risk, so it may be useful to prioritize equipment, but may miss higher risk issues inside the FMEA.
  As you can see from my article that you reference, I have concerns about RPN. Fred Schenkelberg and I podcasted on this subject: SOR 499 “Is There a Better Way than RPN?” https://accendoreliability.com/podcast/sor/sor-499-better-way-rpn/
  Let me know if this answers your question.
  Carl
Thomas says
May 27, 2020 at 10:21 PM
Thanks, Carl, for this very informative article. From my personal perspective, an RPN that takes detectability into account is not very useful for assessing the importance of a risk. Detectability can easily change by identifying the right indicators or combination of indicators. I would weight the impact and the probability much higher.
- Carl Carlson says
  May 28, 2020 at 5:30 AM
  Hello Thomas,
  I understand your comments about the efficacy of detection risk in prioritizing risk within an FMEA. I’ll begin my reply with two additional references and then make a couple of comments.
  Reference my article “Understanding how to prioritize risk for corrective actions in an FMEA,” specifically the portion at the end of the article about Action Priority Table.
  https://accendoreliability.com/understanding-prioritize-risk-corrective-actions-fmea/
  Reference also the podcast “Is there a better way than RPN?”
  https://accendoreliability.com/podcast/sor/sor-499-better-way-rpn/
  Both the referenced article and podcast were published after the article “Understanding RPN Limitations – Problems and Solutions,” and offer more perspective.
  My personal view is I prefer the approach that uses an Action Priority table, rather than RPN. This method allows a company to weight severity, occurrence and detection in any way that makes sense. In your case, you may wish to de-emphasize detection risk. I would not suggest eliminating the use of detection rating, as it is important to be able to detect an issue (failure mode and associated cause) before the problem gets to the user/customer. I typically place more emphasis on severity and occurrence, but still consider detection risk.
  Thanks again for your comments. Hope this added information is helpful.
  Carl
Chris says
October 25, 2020 at 11:03 AM
Hi Carl,
Considering different milestones of product development, e.g., NPI > NPD > design frozen > engineering build > mass production.
Since FMEA is a live document, a new risk that is highlighted during engineering build will have a very high detection rating, such as 8. Let’s say RPN = S*O*D = 7*3*8 = 168.
Despite being able to complete the testing quickly, this detection rating will still remain at 8. This risk, if it had been highlighted and tested by design frozen, would have had a much lower detection rating.
The essence of my question is that the detection rating seems to be very dependent on the project timeline, despite pointing to the same risk. And there is no way to lower the detection rating (unlike making a design change to reduce severity or likelihood).
Could you kindly share your thoughts on this?