Fault tree analysis (FTA) uses Boolean logic to map, in graphical form, the chain of events and equipment failures that can lead to a part or system failure. It is a deductive approach that applies to different systems or facilities at either the product design or the operational stage. FTA fosters system reliability by:
- enabling engineers or maintenance teams to understand how failures happen
- identifying suitable alternatives for minimizing risks
- estimating safety levels in the event of failures
As a standard top-down failure analysis model, FTA uses standard classifications, symbols and structures. A combination of events and logic gates creates a fault tree that highlights the dependencies between events. The goal of FTA is to minimize system vulnerabilities, lower the probability of failure and address the prevalent causes of failures.
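As a rough illustration of how events and logic gates combine, the sketch below evaluates the probability of a top event for a small, made-up fault tree. The pump system, the event names and the probabilities are all hypothetical, and the arithmetic is only exact under the assumption that basic events are independent and that no basic event appears in more than one branch.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class BasicEvent:
    name: str
    probability: float  # assumed probability of the event over the mission time


@dataclass
class Gate:
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]]


def probability(node: Union[Gate, BasicEvent]) -> float:
    """Probability of a node, assuming independent, non-repeated basic events."""
    if isinstance(node, BasicEvent):
        return node.probability
    probs = [probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must occur: multiply the probabilities.
        result = 1.0
        for p in probs:
            result *= p
        return result
    if node.kind == "OR":
        # At least one input occurs: 1 minus the chance that none occur.
        none_occur = 1.0
        for p in probs:
            none_occur *= 1.0 - p
        return 1.0 - none_occur
    raise ValueError(f"Unsupported gate type: {node.kind}")


# Hypothetical tree: cooling is lost if power is lost OR both pumps fail.
both_pumps_fail = Gate("AND", [
    BasicEvent("primary_pump_fails", 0.05),
    BasicEvent("backup_pump_fails", 0.10),
])
top_event = Gate("OR", [BasicEvent("power_loss", 0.01), both_pumps_fail])

top_p = probability(top_event)
print(f"P(top event) = {top_p:.5f}")
```

With these illustrative numbers the AND gate contributes 0.05 × 0.10 = 0.005, and the OR gate combines it with the 0.01 power-loss probability. Trees with shared or dependent basic events need a cut-set-based or simulation-based evaluation instead.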
Fault tree analysis follows similar procedures across industries. Whether in chemical processing, software engineering or healthcare, it needs a team of multi-skilled experts. They identify the causes of process or equipment failure, study the equipment to understand its operating standards and generate fault tree diagrams. Fault trees enable them to pinpoint common cause failures (CCF), minimal path sets (MPS) and minimal cut sets (MCS): a minimal cut set is a smallest combination of basic events that is sufficient to cause the top event, and a minimal path set is a smallest combination of events whose non-occurrence guarantees that the top event does not occur. They then evaluate the possible mitigation measures and incorporate them into operations and maintenance (O&M) strategies.
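To make the idea of minimal cut sets concrete, here is a minimal sketch that expands a small fault tree from the top down and discards any cut set that contains a smaller one. The tree, the gate names (TOP, G1, G2) and the event names are hypothetical, and the approach is a simple illustration for AND/OR trees rather than a production algorithm.

```python
from itertools import product

# Hypothetical tree: gate name -> (gate type, children).
# Any name that is not a key in the dict is treated as a basic event.
tree = {
    "TOP": ("OR", ["power_loss", "G1"]),
    "G1": ("AND", ["primary_pump_fails", "G2"]),
    "G2": ("OR", ["backup_pump_fails", "power_loss"]),
}


def cut_sets(node, tree):
    """Return all (not necessarily minimal) cut sets below a node."""
    if node not in tree:
        return [frozenset([node])]  # a basic event is its own cut set
    kind, children = tree[node]
    child_sets = [cut_sets(child, tree) for child in children]
    if kind == "OR":
        # Any cut set of any child triggers the gate.
        return [cs for sets in child_sets for cs in sets]
    if kind == "AND":
        # One cut set from every child is needed; merge each combination.
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"Unsupported gate type: {kind}")


def minimal_cut_sets(top, tree):
    """Drop every cut set that strictly contains another cut set."""
    candidates = set(cut_sets(top, tree))
    minimal = [cs for cs in candidates
               if not any(other < cs for other in candidates)]
    return sorted(minimal, key=lambda cs: (len(cs), sorted(cs)))


mcs = minimal_cut_sets("TOP", tree)
for cs in mcs:
    print(sorted(cs))
```

In this example, power loss alone is a single-event cut set, which flags it as a single point of failure, while the two pumps only cause the top event if they fail together.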
Using CMMS to collect relevant data for FTA
Conducting a conclusive FTA and improving system reliability requires sufficient data. The adoption of computerized maintenance management systems (CMMS) continues to transform maintenance operations. When combined with industrial automation and condition monitoring techniques, a CMMS helps make equipment and processes more reliable, efficient and autonomous. CMMS programs enable technicians to collect, store and analyze large amounts of data faster and more accurately than manual methods.
For facilities with a blend of production equipment from different manufacturers, the operating procedures, maintenance schedules, spare parts and tool requirements vary. Relying on manual data collection and record-keeping in such complex facilities increases the likelihood of confusion among technicians, who may end up deferring part of the maintenance work or using the wrong replacement parts. Over time, these errors build into interdependent events that ultimately lead to failure.
Through a CMMS program, an organization can group and categorize its production assets and processes based on their criticality, type or manufacturer. The program will retain a library of manuals, appropriate spare parts and maintenance histories. It captures the preventive or corrective maintenance measures performed on each asset over its operating life.
To conduct FTA, reliability engineers and technicians refer to the maintenance logs to evaluate the effectiveness of existing maintenance strategies. From the maintenance history, technicians establish the type and frequency of system failures and then work backward to identify the chain of events that is causing them.
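As a minimal sketch of that first step, the snippet below counts failure modes and computes the mean number of days between corrective work orders for one asset. The record layout (asset, date, type, failure_mode) and the sample entries are assumptions made for illustration; a real CMMS export will have its own field names.

```python
from collections import Counter
from datetime import datetime

# Hypothetical maintenance log entries, as they might be exported from a CMMS.
log = [
    {"asset": "PUMP-01", "date": "2023-01-10", "type": "corrective",
     "failure_mode": "seal leak"},
    {"asset": "PUMP-01", "date": "2023-02-01", "type": "preventive",
     "failure_mode": None},
    {"asset": "PUMP-01", "date": "2023-03-11", "type": "corrective",
     "failure_mode": "bearing wear"},
    {"asset": "PUMP-01", "date": "2023-05-10", "type": "corrective",
     "failure_mode": "seal leak"},
]


def failure_summary(records, asset):
    """Count failure modes and average the gaps between corrective orders."""
    failures = sorted(
        (r for r in records
         if r["asset"] == asset and r["type"] == "corrective"),
        key=lambda r: r["date"],  # ISO dates sort correctly as strings
    )
    modes = Counter(r["failure_mode"] for r in failures)
    dates = [datetime.strptime(r["date"], "%Y-%m-%d") for r in failures]
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    mean_gap_days = sum(gaps) / len(gaps) if gaps else None
    return modes, mean_gap_days


modes, mean_gap_days = failure_summary(log, "PUMP-01")
print(modes.most_common())
print(mean_gap_days)
```

The most frequent failure mode is a natural candidate for a top event, and the observed gap between failures gives a first, rough estimate for the rate of the corresponding basic events. The gap here is measured in calendar days and ignores repair downtime.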
Clustering and prioritizing production assets simplifies performance analysis across equipment from different manufacturers. Technicians gain insight into the failure rates of specific parts and the longevity of the recommended replacement parts. If a particular part or brand fails repeatedly, a manufacturer defect becomes a likely cause. This insight simplifies fault analysis and gives the responsible teams ample time to explore mitigation measures.
The capability to host CMMS programs across portable devices in different locations enhances collaboration and communication between maintenance teams. Through this shared platform, they gain remote access to machine-specific maintenance records. A technician can pull up the records of a failed asset and consult colleagues at other locations to identify the cause and mode of a failure. This reduces the cost of assembling a team in person for FTA.
Tying CMMS programs to condition monitoring sensors simplifies the data collection process. The sensors capture machine operating conditions in real time and display them as a series of signals on the program's dashboard. Variations in these signals can indicate the onset of a failure. With this information, technicians can deduce the actual conditions that initiate a failure and how it propagates until operations stop.
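One simple way to flag such a variation is to compare each new reading with a rolling baseline of recent readings. The sketch below does this for a made-up series of vibration values; the window size, the three-standard-deviation threshold and the data are assumptions for illustration, not settings from any particular CMMS or sensor vendor.

```python
from statistics import mean, stdev


def first_deviation(readings, window=5, threshold=3.0):
    """Return the index of the first reading that departs from its baseline.

    The baseline is the previous `window` readings; a reading is flagged when
    it differs from the baseline mean by more than `threshold` standard
    deviations. Returns None if no reading is flagged.
    """
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            return i
    return None


# Hypothetical vibration velocity readings (mm/s), one per sampling interval.
vibration_mm_s = [2.0, 2.1, 1.9, 2.0, 2.1, 2.0, 1.9, 2.1, 3.5, 4.2]

onset = first_deviation(vibration_mm_s)
print(onset)
```

The flagged index marks where the signal first leaves its normal band, which gives the FTA team a timestamp to line up against other events in the maintenance record. Real signals are noisier, so thresholds usually need tuning per asset.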
Summing up
CMMS programs don't just simplify routine maintenance operations. They can also be a starting point for conducting fault tree analysis on different systems. The computing power of the devices hosting a CMMS speeds up decision-making by giving prompt and accurate access to relevant data. In turn, a proper FTA enhances reliability and informs the development of safe systems regardless of their complexity.
Larry George says
Thanks for your article connecting CMMS and FTA. FTA also computes the rates of top events given rates for basic events, which have to come from somewhere. FTA also can deal with dependence among basic events of the type P[Basic event] = P[Stress > strength], including situations such as dependent stresses (e.g. earthquakes) and strengths (common cause, same manufacture, load sharing, etc.).