- What is Maintenance?
- An Historical Perspective of Maintenance
- Equipment Reliability
- Reliability and Maintenance Strategy
- Ongoing Asset Maintenance Evolution
- Effective Maintenance Programmes – DuPont Study Results
- Making the Transition from Reactive to Strategic
- Discussion
1. What is Maintenance?
A definition: Maintenance describes the management, control, execution and quality of those activities which will reasonably ensure that design levels of availability and performance of assets are achieved in order to meet business objectives.
If maintenance expenditure is viewed as the necessary premium to be paid for reliability insurance, then it follows that all maintenance activity should be directed towards maximum returns on that investment, i.e. improved reliability. Rarely is that found to be the focus. Usually the emphasis is on returning the machine to service as quickly as possible without any serious consideration of reliability improvement while the opportunity is presented.
- Maintenance is a Risk Control activity
- Risk = Consequence x Probability = Consequence x (Opportunity x Chance)
The expenditure of maintenance dollars on risk management (e.g. condition monitoring, process control, etc.) should be directly related to the probability and consequences of failure. Often reasonable judgements based on experience can be made without the rigour and expense of exhaustive failure modes analysis. Sometimes, however, a formal risk assessment must be carried out and decisions based on its outcomes.
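As a rough illustration of the risk relationship above, the following Python sketch ranks a few failure scenarios by annual risk. The asset names, costs, exposure counts and chances are hypothetical figures invented for the example. Treating Opportunity as the number of exposures per year is also an assumption, not something the relationship above prescribes.

```python
from dataclasses import dataclass


@dataclass
class FailureRisk:
    """One failure scenario for an asset (all figures are hypothetical)."""
    asset: str
    consequence: float  # cost of one failure event, in dollars
    opportunity: float  # exposures to the failure hazard per year (assumed unit)
    chance: float       # chance of failure on each exposure, between 0 and 1

    @property
    def probability(self) -> float:
        # Probability = Opportunity x Chance (expected failures per year here)
        return self.opportunity * self.chance

    @property
    def risk(self) -> float:
        # Risk = Consequence x Probability (expected loss per year here)
        return self.consequence * self.probability


def rank_by_risk(scenarios):
    """Return the scenarios sorted from highest to lowest risk."""
    return sorted(scenarios, key=lambda s: s.risk, reverse=True)


scenarios = [
    FailureRisk("cooling water pump", consequence=50_000, opportunity=12, chance=0.01),
    FailureRisk("main compressor", consequence=400_000, opportunity=4, chance=0.02),
    FailureRisk("conveyor gearbox", consequence=20_000, opportunity=52, chance=0.005),
]

for s in rank_by_risk(scenarios):
    print(f"{s.asset}: risk = ${s.risk:,.0f} per year")
```

A ranking of this kind is only a starting point for deciding where monitoring money goes; it does not replace a formal risk assessment where one is warranted.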
- Core maintenance activities are defined by design and process.
- Additional maintenance activity results from premature equipment failure.
- Unexpected failures may incur other costs or losses – such as lost production, diversion of planned maintenance resources, loss of reputation, penalties for late delivery, etc. These are usually very much greater than the actual repair costs of the failure.
The focus of Maintenance should be upon maintaining the ‘wellbeing’ of the plant – when the task is to ‘fix it’ then Maintenance has failed in its basic mission, quite possibly through no fault of its own.
2. An Historical Perspective of Maintenance
Much has happened in engineering since the industrial revolution a couple of hundred years ago, but perhaps the most dramatic changes have occurred in the last fifty years. These changes have of course affected how industry’s plant has been maintained.
Prior to the Second World War machinery was generally quite rugged and relatively slow running; instrumentation and control systems were very basic. The demands of production were not overly severe so that downtime was not usually a critical issue and it was adequate to maintain on a breakdown basis. This machinery was inherently reliable. Even today we can see examples of machines made in that period which have worked very hard and are still essentially as good as the day they were made.
From the 1950s, with the rebuilding of industry after the war, particularly in Japan and Germany, a much more competitive marketplace developed and there was increasing intolerance of downtime. The cost of labour became increasingly significant, leading to more and more mechanisation and automation. Machinery was of lighter construction and ran at higher speeds. It wore out more rapidly and was seen as less reliable, though perhaps it was also utilised more fully. Production demanded better maintenance, which led to the development of Planned Preventative Maintenance.
It was recognised that at a level of failure of, say, 10 machines in 100, the probability of failure had become unacceptably high and the full group of machines should be overhauled. This could mean a significant loss of potential life in the remaining machines, but in view of the risk it was considered justified. The planning involved plant overhauls based upon a time interval or usage at which the failure rate of a group of similar machines became unacceptable. This led to the basic assumption that the older equipment gets, the more likely it is to fail. This was the age of the “Bathtub Curve”.
There are three identifiable phases within the Bathtub Curve:
- Running In, also known as the Infant Mortality, phase. This recognises the premature failure of components and is often seen in plant in the first few days or weeks after overhaul.
- Normal Operating Life phase. This shows a relatively constant probability of failure. Failures within this phase are usually referred to as Random.
- Wear Out phase. There is an increasing probability of component failure between equal and successive time intervals. Somewhere within this phase the failure rate would become unacceptable and widespread maintenance would be carried out, usually of an intrusive nature, on equipment still in its “normal operating life”. This is akin to carrying out open heart surgery on healthy machines.
This intrusive scheduled maintenance would lead back to the beginning of a new bathtub curve – the phase of Infant Mortality with its increased probability of failure.
Effectively, the process of Planned Preventative Maintenance gave rise to a series of mini-bathtub curves, each with its own initial period of increased risk.
Figure 3 – Every Maintenance Intrusion Provides Probability for Early Life Failure
Within Planned Preventative Maintenance the name of the game became one of choosing the best point in the Wear Out phase at which to perform maintenance, all other factors considered.
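The shape described above can be sketched numerically. The example below is a minimal illustration in Python. It assumes the bathtub curve can be approximated as the sum of a falling Weibull hazard (infant mortality), a constant rate (random failures) and a rising Weibull hazard (wear out). All parameter values, and the 8,000 hour overhaul interval, are hypothetical and chosen only to make the shape visible.

```python
def weibull_hazard(age_hours, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (age_hours / scale) ** (shape - 1)


def bathtub_hazard(age_hours, random_rate=0.001):
    """Failure rate per hour at a given age since new or since the last overhaul."""
    infant = weibull_hazard(age_hours, shape=0.5, scale=2000.0)     # falls with age
    wear_out = weibull_hazard(age_hours, shape=5.0, scale=10000.0)  # rises with age
    return infant + random_rate + wear_out


def hazard_with_overhauls(running_hours, overhaul_interval):
    """Each intrusive overhaul resets the age, starting a new mini-bathtub."""
    age = running_hours % overhaul_interval
    # Clamp to one hour: the infant mortality term is undefined at age zero.
    return bathtub_hazard(max(age, 1.0))


for hours in (10, 5000, 7990, 8010, 12000):
    print(f"{hours:>6} h  no overhaul: {bathtub_hazard(hours):.5f}"
          f"  overhaul every 8000 h: {hazard_with_overhauls(hours, 8000):.5f}")
```

With these assumed parameters the failure rate just after the overhaul at 8,000 hours is higher than it was just before, which is the effect the mini-bathtub curves describe.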
3. Equipment Reliability
In the 1960s, with the introduction of the Boeing 747, the aviation industry, in its search for improved reliability, questioned the then-current maintenance strategies and the long-established basic assumption that the older equipment gets, the more likely it is to fail.
At that time aviation accident rates were in the order of 60 per million takeoffs. Every 20,000 hours of flying time required some 2,000,000 man-hours of maintenance, performed on a time, or hours-run, basis. The age assumption was questioned and the failure process researched. Six patterns of failure were identified. Three of these showed an increased probability of failure with age, but together they accounted for only 11% of failures.
The remaining 89% showed no age relationship, but an open-ended period of constant probability of failure. In other words, failure is a random event. Such failures do, however, have the potential to give warning as they develop, through changing levels of a suitable measurement parameter that indicate a change in condition of the component, machine or system. To look for and find these warning signs is the only way to identify a need for maintenance under conditions of a constant probability of failure.
The aviation industry made major changes to its maintenance practices as a result of this study and the results were dramatic: maintenance man-hours for 20,000 hours of flying time went from 2,000,000 down to 66,000, roughly a 30:1 reduction. If the flying public were aware of such reductions in maintenance there would probably be a similar drop in public confidence. Yet there was an equally dramatic improvement in safety, which is effectively reliability. Of course much of this improvement is due to design improvements and technology, but condition based maintenance techniques provide considerable information to assist in that development.
Industry is not an airline, but over subsequent years industry has followed this study and found a high degree of validity in it. The study is one of the principal foundations of Reliability Centred Maintenance.
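Why scheduled overhaul cannot help under a constant probability of failure can be shown with a short calculation. The sketch below assumes an exponential life distribution, the usual model for a constant failure rate, and a hypothetical rate of one failure per 10,000 running hours. Neither the model nor the rate is taken from the aviation study.

```python
import math


def chance_of_failing_next(age_hours, interval_hours, failure_rate):
    """P(fails within the next interval | survived to age), constant failure rate."""
    survive_to_age = math.exp(-failure_rate * age_hours)
    survive_to_end = math.exp(-failure_rate * (age_hours + interval_hours))
    return (survive_to_age - survive_to_end) / survive_to_age


RATE = 1 / 10_000  # hypothetical: one failure per 10,000 running hours

for age in (0, 1_000, 10_000, 30_000):
    p = chance_of_failing_next(age, interval_hours=500, failure_rate=RATE)
    print(f"age {age:>6} h: chance of failing in the next 500 h = {p:.4f}")
```

Under this assumption every age gives the same answer, a little under 5%. An old component is no more likely to fail in the next 500 hours than a new one, so replacing it on age alone buys nothing, and only a change in measured condition can signal a need for maintenance.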
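The arithmetic behind the quoted reduction is easy to check. The snippet below uses only the figures given above; the variable names are illustrative.

```python
FLYING_HOURS = 20_000
MAN_HOURS_BEFORE = 2_000_000
MAN_HOURS_AFTER = 66_000

# Maintenance effort per flying hour, before and after the change in practice
before_per_flying_hour = MAN_HOURS_BEFORE / FLYING_HOURS
after_per_flying_hour = MAN_HOURS_AFTER / FLYING_HOURS

# Overall reduction ratio in maintenance man-hours
reduction_ratio = MAN_HOURS_BEFORE / MAN_HOURS_AFTER

print(f"Before: {before_per_flying_hour:.1f} man-hours per flying hour")
print(f"After:  {after_per_flying_hour:.1f} man-hours per flying hour")
print(f"Reduction: {reduction_ratio:.1f} to 1")
```

That is about 100 man-hours of maintenance per flying hour falling to about 3.3, a ratio of a little over 30 to 1.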
4. Reliability and Maintenance Strategy
In choosing the parameters to be measured to identify changes in condition, it is necessary to consider how the machine or system might fail. Take as an example a simple pump set.
Can the Failure Process be Detected and Measured?
Having identified the possible ways in which a machine may fail, consider if it is possible to detect and measure the failure process. Reflect on previous failure history and use ‘hindsight’ knowledge. If the answer is NO and the failure process is not detectable then use either Planned Preventative or Breakdown Maintenance, depending upon the Criticality or Risk should the failure event happen. If the answer is YES, the failure process can be observed, and the Criticality justifies it, then Condition Based Maintenance will be applied. If the answer is YES but Criticality does not justify it, then Planned Preventative or Breakdown Maintenance will be applied.
The approach thus far requires that every item of plant (system, machine, component) be reviewed, criticality considered, and a decision made on the maintenance it will get: repair by replacement on breakdown, scheduled, or condition based.
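The decision logic described above can be written down as a small function. This is a sketch under stated assumptions, not a definitive procedure. The text says only that the choice between Planned Preventative and Breakdown Maintenance depends on criticality or risk, so the `high_consequence` flag used here to split the two is an assumption, as are the example equipment items.

```python
from enum import Enum


class Strategy(Enum):
    CONDITION_BASED = "Condition Based Maintenance"
    PLANNED_PREVENTATIVE = "Planned Preventative Maintenance"
    BREAKDOWN = "Breakdown Maintenance"


def select_strategy(failure_detectable, criticality_justifies_monitoring, high_consequence):
    """Pick a maintenance strategy for one item of plant.

    failure_detectable: can the failure process be detected and measured?
    criticality_justifies_monitoring: does the risk warrant the monitoring effort?
    high_consequence: ASSUMED tie-breaker between planned and breakdown maintenance.
    """
    if failure_detectable and criticality_justifies_monitoring:
        return Strategy.CONDITION_BASED
    # Failure not detectable, or monitoring not justified: fall back on
    # planned preventative or breakdown maintenance, depending on the risk.
    if high_consequence:
        return Strategy.PLANNED_PREVENTATIVE
    return Strategy.BREAKDOWN


# Hypothetical plant items: (name, detectable, monitoring justified, high consequence)
plant_items = [
    ("boiler feed pump bearing", True, True, True),
    ("pressure relief valve", False, False, True),
    ("workshop extractor fan", True, False, False),
]

for name, detectable, justified, high in plant_items:
    print(f"{name}: {select_strategy(detectable, justified, high).value}")
```

In practice the three inputs would come from the review of each item's failure modes and criticality rather than from simple yes or no flags.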
5. Ongoing Asset Maintenance Evolution
From the 1980s plant and systems became increasingly complex, the demands of the competitive marketplace and intolerance of downtime increased, and maintenance costs continued to rise. Along with the demands for greater reliability at a lower cost came new awareness of failure processes, improved management techniques and new technologies to allow an understanding of machine and component health. The study of Risk has become very important. Environmental and safety issues have become paramount. New concepts have emerged: condition monitoring, just-in-time manufacturing, quality standards, expert systems and reliability centred maintenance, to name but a few.
Engineering, and maintenance with it, are subject to the whims of fashion – “value engineering, hazard and operability studies, project task force teams, World Class, CMMS, CAD, TPM, TQM, etc.” We have seen the development of “Centres of Excellence” at such major players as Shell, ICI, DuPont and UKAEA, where reliability specialists were employed to advise, analyse and troubleshoot, and to argue the economic justification for increased expenditure on reliability and availability against the pressure on capital expenditure.
There is a thrust toward acceptance of life cycle costs, which recognises that the cost of designing and building a plant must be considered together with the ongoing maintenance cost and the eventual cost of decommissioning and disposal. Manufacturing and production enterprises are under intense pressure to achieve maximum efficiency. The winners, so we are told, will be those that maximise their investment in people and equipment assets to achieve the highest profitability.
In the United Kingdom the mid-1990s saw the creation of The Institute of Asset Management. Asset Management is currently receiving the full attention of most organisations, with the creation of new departments dedicated to its implementation. No doubt there will be a period of exploration and evolution as it develops and becomes understood. It will provide a means of integrating the many seemingly unrelated parts into a cohesive strategic whole.
6. Effective Maintenance Programmes – DuPont Study Results
In the mid-1980s the DuPont Corporation carried out a study of the effectiveness of the maintenance operations in its large number of plants. The study identified the characteristics of these operations and found the pattern shown below. Within industry and maintenance consulting there has been an unquestioned acceptance of the universal truth of this model. But the experience of companies applying the model has led to the realisation that it is only a model, and in reality there are many factors not shown that greatly affect successful progression up the ‘stairway to heaven’.
It appears reliability is an evolutionary process, as indeed the history of maintenance since the 1950s has been. It may well continue to be necessary for companies to learn how to become reliable instead of transplanting improvement methods and tools into the company as suggested by the DuPont model. Companies have to discover the “essential truth” of reliability and carry that knowledge and know-how forward in their engineering, maintenance and operations practices.
Companies with truly effective maintenance and high reliability operations are few and far between.
Take a moment to consider where, in terms of this model, your organisation might fit. Is this uniform throughout your site? Are some areas more advanced than others? Why?
Many organisations today are in, or coming into, the ‘Planned’ phase, with some of the components of ‘Reliability’ either in use or being put into place. DuPont additionally found that in the move from ‘Reactive’ to ‘Planned’ the value gained when all three strategies were integrated or coordinated was greater than the sum of the parts. In other words, predictive and preventative maintenance are most successful in lifting reliability when they are planned and scheduled. In many organisations the Predictive, or Condition Monitoring, component is still not well integrated.
7. Making the Transition from Reactive to Strategic
The essential elements involved in the process of change and development of this maintenance model are more knowledge, higher skills, wider perspective, seeking excellence, improving system quality and integration of systems.
7.1 More Knowledge
Some refer to the era we are in as “the Knowledge Age”. With the internet, knowledge has become readily available, and within our own organisations company intranets are contributing to this.
AS/ISO 9001 requires, amongst other things, that people are competent, not just trained. By implication this means that people must understand what it is they are doing – “monkey see, monkey do” is no longer acceptable, and to understand requires underlying knowledge.
Some companies are now developing “Fitness to Operate” competencies for their operators and “Fitness to Maintain” competencies for their maintainers. These will require, for example, that maintainers have a knowledge of the processes in which the equipment they care for is working. Similarly, operators will have to have a knowledge of the maintenance requirements and strategies of the equipment they use in their processes.
No longer is the philosophy “trained once, trained for life” acceptable, as it used to be for someone having completed an apprenticeship, for example. In quite recent times there has been experience of managers not wanting their people to be given additional knowledge – it was seen as a dangerous thing, they might start thinking for themselves!
One of the findings of the Esso-Longford disaster was the lack of knowledge and understanding by operators of the process.
7.2 Higher Skills and Seeking Excellence
The application of higher skills and the thrust for excellence, as seen in Precision Maintenance, is acknowledged as critical in the elimination of equipment failures and in the move toward reliability with extended mean time between failures (MTBF), resulting in higher availability.
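The link between MTBF and availability can be made concrete with the commonly used steady-state relation availability = MTBF / (MTBF + MTTR), where MTTR is the mean time to repair. MTTR is not discussed in the text above and the numbers below are hypothetical. The sketch only shows the direction of the effect.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and mean time to repair."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


MTTR = 10.0  # hypothetical mean time to repair, in hours

for mtbf in (500.0, 1500.0):  # hypothetical MTBF before and after improvement
    print(f"MTBF {mtbf:>6.0f} h -> availability {availability(mtbf, MTTR):.2%}")
```

With the repair time held constant, tripling the assumed MTBF lifts availability from roughly 98% to over 99%, which is the sense in which fewer failures translate into higher availability.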
The process of moving Toward Improved Plant Reliability through Precision Skills requires a significant change in attitude and thinking at all levels in the maintenance organisation.
Additionally, there must be an appreciation of the assets within the maintenance organisation that represent significant value:
- good engineering & maintenance experience
- staff’s accumulated plant knowledge
- proven technologies, systems & services
7.3 A Wider Perspective
The wider perspective looks at the greater picture of an operation, the interaction of the component parts to give a greater joint benefit. This is well represented by the concept of Asset Management, given in the Introduction.
7.4 Improved Quality Systems
This simply implies the use of the ISO 9001 framework and the need for ongoing review of which maintenance and operational practices and processes affect reliability, followed by appropriate action on non-conformances or the introduction of improvements.
7.5 Integration of Systems
When business systems are integrated there is a freedom of movement of information between applications which provides for a unity in approach and prevents equipment items ‘falling through the cracks’. With this wider perspective and strategic vision high reliability becomes a possibility. In particular the application of quality management permits one uniform system throughout the operation and consequently the company becomes much more effective in achieving its aims.
8. Discussion
- Where do you see your company now in its reliability journey?
- What elements do you see must next be introduced to progress up the model?
- What is missing from the model that you need to have to reach the next level of performance?
- How does the next phase affect you and your people?
Our best regards to you,
Peter Brown and Mike Sondalini
Lifetime Reliability Solutions HQ