Human Factors in Reliability Centred Maintenance

In Part 2 of Beyond the Numbers, I explored how human reliability principles can be integrated into traditional reliability artefacts such as Reliability Block Diagrams (RBDs), Fault Tree Analysis (FTA) and Failure Modes, Effects and Criticality Analysis (FMECA). Those tools help us understand how systems fail and how different failure paths interact.

But reliability engineering does not end with failure analysis.

In practice, the outputs of FMECA flow directly into the Reliability Centred Maintenance (RCM) process, where failure modes are translated into maintenance strategies such as inspections, condition monitoring, restorations or replacements.

If assumptions about human performance are simplified during reliability analysis, those assumptions do not disappear. They are carried forward, and often amplified, when maintenance requirements are defined.

Maintenance is where reliability modelling becomes an operational commitment.

Human Error Within Established RCM Thinking

Established RCM methodology recognises that human actions can contribute to system failure. If a specific human action is considered a credible reason why a functional failure could occur, it can legitimately appear as a failure mode within the analysis.

At the same time, RCM makes an important distinction. Where errors arise because of poor access, physical limitations, unclear interfaces or other design constraints, the underlying cause is often a system weakness rather than individual behaviour. In these cases, the human action may be part of the failure sequence rather than the root cause itself.

This distinction is significant because it recognises that:

Not all human failures are behavioural
Some are design-induced
Some arise from latent system conditions

RCM acknowledges these principles, but its primary purpose is to guide maintenance decision logic rather than to provide a detailed framework for analysing human performance variability.

Human Factors builds on that foundation by examining how task design, workload and operational context influence whether those assumptions hold true in practice.

RCM as the Bridge Between Reliability and Maintenance

RCM provides the structured bridge between failure analysis and maintenance policy. Its familiar logic asks questions such as:

What are the functions and functional failures of the system?
What causes each failure?
What happens when the failure occurs?
What are the consequences?
What can be done to predict or prevent it?

The first stages of this process closely align with the FMECA activities discussed in Part 2. Once failure consequences are understood, the RCM logic moves toward selecting maintenance strategies.

However, each step in the process implicitly contains assumptions about human performance.

When selecting a condition-based task, for example, we implicitly assume that:

Degradation will be detectable
The detection method will be applied correctly
The task will be performed at the intended interval
Results will be interpreted accurately

None of these are purely technical assumptions.

If human performance is treated generically during failure analysis, it is likely to remain generic when maintenance policies are defined. The result may be a technically sound RCM output that rests on optimistic execution assumptions.
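As a rough illustration of why those execution assumptions matter, the four conditions listed above can be treated as a chain in which every link has to hold. The sketch below is a simplification, not part of the RCM methodology: it assumes the four conditions are independent, and the function name and the probability values are hypothetical, chosen only to show how quickly individually reasonable figures combine into a weaker overall result.

```python
from math import prod


def effective_task_reliability(step_probabilities):
    """Probability that every step in the chain succeeds.

    Assumes the steps are independent, which is a simplification:
    in practice workload or time pressure may affect several at once.
    """
    for name, p in step_probabilities.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name}: probability must be in [0, 1], got {p}")
    return prod(step_probabilities.values())


# Hypothetical values for illustration only; not taken from any data set.
condition_based_task = {
    "degradation_detectable": 0.95,
    "method_applied_correctly": 0.90,
    "performed_at_intended_interval": 0.90,
    "results_interpreted_accurately": 0.90,
}

overall = effective_task_reliability(condition_based_task)
print(f"Effective task reliability: {overall:.2f}")  # prints 0.69
```

Under these assumed figures, a task whose individual steps each look dependable succeeds end to end only about 69% of the time. The exact numbers are not the point; the point is that an optimistic assumption at any one step reduces the value of the whole task.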

One way to visualise where these assumptions appear is to look at the RCM decision logic itself.

Simplified RCM decision logic with Human Factors considerations (adapted from Maintenance and Reliability Best Practices, Ramesh Gulati)

The diagram above highlights that many RCM decisions depend on human performance conditions that are rarely examined explicitly. Detectability, recovery actions and task execution are not purely technical properties; they depend on how operators and maintainers interact with the system under real operating conditions.

Human Factors does not replace the RCM decision process. Instead, it provides additional questions that help ensure those assumptions are realistic.

Maintenance as a Human-Dependent Risk Control

Maintenance tasks derived from RCM are intended to act as risk controls. These may include:

Scheduled inspections
Functional tests
Condition monitoring activities
Planned restorations or replacements

Each task is intended to reduce either the probability or the consequence of failure.

However, each task ultimately depends on human execution under operational conditions.

Two clarifications are important.

Task Existence Does Not Equal Task Effectiveness

The presence of a maintenance task in a maintenance plan does not guarantee that it reliably controls risk.

A task may exist on paper but be vulnerable in practice if it is:

Difficult to access
Time-pressured
Ambiguous in its diagnostic criteria
Dependent on high levels of judgement
Competing with other tasks during maintenance windows

Human Factors helps reveal where these conditions may reduce task effectiveness.

Maintenance Can Introduce Failure

Reliability discussions often focus on preventing failure. Less attention is given to the fact that maintenance activities can themselves introduce new failure modes.

Examples include:

Incorrect reassembly
Configuration errors
Disturbance of latent defects
Misinterpretation of condition monitoring data
Incomplete reinstatement following maintenance

Recognising these possibilities does not undermine maintenance policy; it simply acknowledges that maintenance activities must themselves be designed for reliability.
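One way to make this concrete is to compare the failure probability a task removes with the failure probability it may add. The sketch below is illustrative only: it assumes the residual failure mode and the maintenance-induced failure are independent, and the function names and numbers are hypothetical rather than drawn from any standard or data set.

```python
def failure_probability_with_task(p_residual, p_induced):
    """Probability of failure over the period when the task is performed.

    Failure occurs if the original failure mode still occurs despite
    the task (p_residual) or if the task itself introduces a failure
    (p_induced). Independence between the two is assumed.
    """
    return 1.0 - (1.0 - p_residual) * (1.0 - p_induced)


def net_risk_reduction(p_without_task, p_residual, p_induced):
    """Positive when the task reduces overall failure probability."""
    return p_without_task - failure_probability_with_task(p_residual, p_induced)


# Hypothetical values for illustration only.
well_designed = net_risk_reduction(p_without_task=0.10, p_residual=0.02, p_induced=0.01)
error_prone = net_risk_reduction(p_without_task=0.10, p_residual=0.02, p_induced=0.09)

print(f"Well-designed task: {well_designed:+.4f}")
print(f"Error-prone task:   {error_prone:+.4f}")
```

With these assumed values the first task delivers a clear benefit, while the second has a negative net effect: the same task on paper leaves the system worse off once the likelihood of reassembly or reinstatement errors is high enough.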

Detection, Degradation and the P–F Interval

Many RCM decisions rely on the assumption that degradation can be detected before functional failure occurs.

The well-known P–F curve illustrates the interval between the first detectable indication of degradation and the point at which functional failure occurs. Predictive maintenance techniques aim to detect deterioration within this interval so that corrective action can be taken before failure.

However, detection is not purely a property of sensors or monitoring technologies.

The effectiveness of predictive maintenance also depends on how people interact with those systems and on how signals are measured, interpreted and acted upon.

P–F curve illustrating Human Factors influences on degradation, detection and response

The diagram above highlights several stages where Human Factors influence the effectiveness of condition monitoring.

Detection may depend on:

Consistent measurement practices
Correct sensor placement
Reliable data capture

Once signals are detected, operators or analysts must interpret them correctly and recognise emerging trends. In environments where many alerts are generated, factors such as alarm fatigue can reduce the likelihood that meaningful signals are recognised and acted upon promptly.

Finally, even when degradation is recognised, corrective action still depends on timely decision-making and operational priorities.

The existence of detection technology therefore does not guarantee effective predictive maintenance. Human interpretation and organisational response remain critical links in the chain between degradation and intervention.
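A simple calculation can show how sensitive this chain is to human performance. The sketch below assumes a fixed inspection interval, counts the minimum number of inspections guaranteed to fall inside the P–F interval, and treats each inspection as an independent chance of detection. The function names, intervals and probabilities are hypothetical. Independence is also optimistic where the same person repeats the same measurement under the same conditions.

```python
import math


def inspections_within_pf(pf_interval_days, inspection_interval_days):
    """Minimum number of inspections guaranteed to fall inside the P-F interval."""
    if pf_interval_days <= 0 or inspection_interval_days <= 0:
        raise ValueError("intervals must be positive")
    return math.floor(pf_interval_days / inspection_interval_days)


def probability_of_detection(pf_interval_days, inspection_interval_days, p_single):
    """Chance that at least one inspection in the P-F interval detects degradation.

    p_single is the probability that one inspection detects, interprets
    and reports the degradation correctly. Inspections are assumed independent.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be in [0, 1]")
    n = inspections_within_pf(pf_interval_days, inspection_interval_days)
    return 1.0 - (1.0 - p_single) ** n


# Hypothetical scenario: 90-day P-F interval.
for interval in (30, 45):
    for p_single in (0.9, 0.6):
        p = probability_of_detection(90, interval, p_single)
        print(f"interval={interval}d, p_single={p_single}: detection={p:.3f}")
```

Under these assumptions, a 30-day interval with a per-inspection detection probability of 0.9 gives roughly 0.999, but the same interval with 0.6 gives about 0.936, and stretching the interval to 45 days at 0.6 gives about 0.84. The technology and the P–F interval are unchanged in each case; only the human-dependent parts of the chain differ.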

Human Factors Across the RCM Process

While Human Factors influence individual RCM decisions, they also shape the broader process through which maintenance policy is developed.

From defining system functions to selecting default actions, assumptions about human behaviour and organisational conditions influence the outcomes of the RCM analysis.

RCM process flow with Human Factors inputs

The diagram above illustrates how Human Factors considerations arise throughout the RCM process.

Examples include:

How operators actually use equipment during normal operations
Whether failures are detectable by operators or monitoring systems
Whether recovery actions are realistic under stress
Whether maintenance tasks can be performed consistently in practice

These considerations do not alter the RCM methodology itself. Rather, they provide additional context that helps ensure the resulting maintenance policies reflect operational reality.

What This Means for Reliability Practitioners

Integrating Human Factors into RCM:

• Does not replace the methodology
• Does not undermine its logic
• Does not complicate the process unnecessarily

Instead, it improves the realism of maintenance policy by examining the assumptions that sit beneath maintenance decisions.

Human Factors helps reliability practitioners ask questions such as:

Can this task be performed reliably under real conditions?
How sensitive is detection to interpretation or workload?
Could maintenance activities themselves introduce new risks?

In this sense, RCM answers the question:

“What maintenance should be performed to control failure risk?”

Human Factors complements this by asking:

“Can that maintenance be performed reliably by real people in real conditions?”

Together, these perspectives strengthen the link between reliability analysis and the reality of maintenance execution.

Looking Ahead

If RCM defines what maintenance should be performed, the next question is how those tasks are structured and executed in practice.

In Part 4 of this series, we will move from maintenance policy to maintenance task design, exploring how Human Factors influence job structure, cognitive demand and the reliability of maintenance execution.

Later in the series, we will also examine how organisations interpret human failure in service, and how root cause analysis can either reinforce hidden vulnerabilities or genuinely resolve them.