Common mode, or common cause, failures occur in redundant systems when a single cause leads to the failure of otherwise redundant elements, resulting in system failure.
Elements which should fail independently are under some circumstances dependent.
When considering the probability of failure for individual paths in a complex redundant system, take care to include the common mode failures, which may have a higher probability than the failure of any single path in the system.
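To put rough numbers on this, here is a minimal sketch using a simple beta-factor style approximation. The model choice, the function names, and the values for `p` and `beta` are assumptions for illustration only and are not taken from any specific system.

```python
def parallel_failure_independent(p_path: float, n_paths: int) -> float:
    """Probability that all n redundant paths fail, assuming full independence."""
    return p_path ** n_paths


def parallel_failure_with_common_cause(p_path: float, n_paths: int, beta: float) -> float:
    """Beta-factor style approximation (an assumed, simplified model).

    A fraction `beta` of each path's failure probability is attributed to a
    shared cause that takes out every path at once; the remainder is treated
    as independent between paths.
    """
    p_common = beta * p_path
    p_independent = (1.0 - beta) * p_path
    # The system fails if the common cause occurs, or if it does not occur
    # and every path fails on its own.
    return p_common + (1.0 - p_common) * p_independent ** n_paths


if __name__ == "__main__":
    p = 0.01     # hypothetical failure probability of one path
    beta = 0.10  # hypothetical share of failures due to a common cause
    for n in (1, 2, 3):
        ind = parallel_failure_independent(p, n)
        ccf = parallel_failure_with_common_cause(p, n, beta)
        print(f"{n} path(s): independent={ind:.2e}, with common cause={ccf:.2e}")
```

In this sketch, adding paths keeps shrinking the independent term, while the common cause term stays the same and soon dominates the system failure probability.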
Let’s consider a couple of examples to illustrate.
Change Over System
In a warm or cold standby system, such as a power backup generator, the change over system has to sense the failure of the primary power and energize the starting sequence for the generator. If the sensor or starting system becomes inoperable due to the same event that caused the main power to fail (a power surge tripping protective breakers, for example), the backup system will fail to come online.
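As a rough illustration of the standby case, the sketch below treats the system as working if the primary works, or if the primary fails and both the change over system and the backup work. The function name and all probabilities are hypothetical. The key assumption is that the change over probability is conditional on the event that caused the primary to fail.

```python
def standby_success(p_primary_ok: float,
                    p_switch_ok_given_primary_failed: float,
                    p_backup_ok: float) -> float:
    """Probability that a standby system delivers power (simplified model).

    The system works if the primary works, or if the primary fails and the
    change over system senses it and the backup then starts and runs.
    """
    p_primary_failed = 1.0 - p_primary_ok
    return p_primary_ok + p_primary_failed * p_switch_ok_given_primary_failed * p_backup_ok


if __name__ == "__main__":
    p_primary = 0.95  # hypothetical
    p_backup = 0.98   # hypothetical

    # Change over system assumed unaffected by whatever failed the primary.
    independent = standby_success(p_primary, 0.99, p_backup)

    # Change over system often disabled by the same event (e.g. a surge).
    shared_cause = standby_success(p_primary, 0.60, p_backup)

    print(f"independent change over: {independent:.4f}")
    print(f"shared cause change over: {shared_cause:.4f}")
```

The only difference between the two cases is the conditional probability for the change over system, which is where the common cause enters the model.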
Indicator Failure
Multiple sensors may watch a pressure vessel for dangerous increases in pressure and trip alarms to warn plant operators to take corrective action. If the operator is inattentive or otherwise unaware of the alarm (ignoring it because of many false alarms, for example), then the pressure buildup may lead to tank rupture, because the multiple redundant sensor and alarm systems are all rendered ineffective.
Repeated Errors
Let’s say a maintenance action to check and adjust a truck’s brakes has an error. Maybe the mechanic is using a procedure for an older model that misses one step required for the brake systems on newer models.
Even though there may be multiple redundant brakes on the vehicle, they can all fail due to a common misstep in their maintenance.
Common Paths
Some medical devices have surgical probes or sensors with electronics, fiber optics, and mechanical controls routed through a single bundled cable.
The multiple functions of the system use different technologies, yet are collocated in a vulnerable structure. The bending or kinking of the bundle may be the single cause of fracture for copper wire and fiber optic cables. Another example is the routing of power and phone lines in one conduit, such that a single action that severs the bundle causes both systems to fail.
When analyzing redundant systems, consider the many sources of common mode failures and include them in the analysis.
Human error, maintenance processes, and accidents are frequent common mode failure causes.
Related:
Common Cause Failures (article)
Fault Tolerance Basics (article)
Benefits of Fault Tree Analysis (article)
Mark says
Fred,
Thanks for the informative post.
I see this kind of misstep by reliability engineers far too often. The reliability of a particular component is too low, causing the estimated system reliability to fall below the goal. The component is relatively cheap, so the recommendation is to simply add another identical component to the system. Consider the new component as redundant to the original, and suddenly the estimated system reliability is acceptable.
As you indicate in your post, redundancy is failure mode specific. That fact complicates system models but it has to be considered.
Mark
Fred Schenkelberg says
Hi Mark,
Thanks for the comment and example. You’re right, it can become a pain to model, yet avoiding that line of reasoning may result in unwanted system failure.
Cheers,
Fred
Stuart Walker says
A question of semantics, but according to ISO 12100, common cause and common mode are not the same thing and should not be confused.
It defines common cause as failures of different items resulting from a single event, where these failures are not consequences of each other.
It defines common mode as failures of items characterised by the same fault mode, but the causes could be different.
Fred Schenkelberg says
Thanks, Stuart, I didn’t know that. Thanks for the reference too.
Cheers,
Fred