The Concept of Equipment Redundancy
Adding equipment redundancy to a system can improve uptime and reliability, leading to increased output. Before adding new equipment, it is cheaper to evaluate the benefits, or lack thereof, on paper than to implement the change and find out afterwards. This evaluation is typically done as part of the design phase. It can also happen after commissioning, but changes at that stage are more expensive. In other words, it is better to get it right before “shovels go in the ground”.
A system is a collection of items that operate together to produce an output, often a production value. Let us consider a system composed of two pumps (System 1). We want to add a third identical pump (System 2) and evaluate the benefits, as illustrated in Diagram 1 below. The system requires two pumps to run simultaneously. If any running pump in System 2 fails, the spare or “hot standby” pump activates immediately, thus avoiding a system shutdown.
The redundancy introduced can contribute to several improvements, listed as follows:
- Improved reliability: a lower probability that the entire system shuts down if one pump fails. System 1, by contrast, fails if any one pump fails.
- Improved availability: increased uptime based on the same concept as above.
- Improved maintainability: maintenance tasks typically require equipment lockouts, mainly for safety reasons.
- With System 2, the maintenance of a pump does not require a system shutdown. For System 1, a shutdown is inevitable. In case of an unplanned System 1 pump failure, maintenance crews have to work under stressful conditions, rushing to get repairs done because production is likely impacted.
- With a spare third pump in System 2, more time can be allocated to preventive maintenance or repair tasks. This leads to less potential for errors or “maintenance-induced failures”.
An operator would be tempted to run System 1 to the limit, that is, for longer than it should run, because of production constraints. This often leads to unplanned failures that can erase all the expected benefits, akin to “getting caught with one’s pumps down”.
RAM Model Example
Adding redundancy can come at great cost: it requires additional equipment and installation expenses. If the change is made post commissioning, a fair amount of “re-jigging” work will be required, which increases costs even more. Therefore, the investment has to be justified. A RAM model is the tool of choice to evaluate the economics of redundancy.
The fundamental purpose of Reliability, Availability, and Maintainability (RAM) modeling is to quantify system performance, typically over a future time interval. Building a RAM model requires inputs such as equipment performance records, operating philosophy, operating costs, and the desired production throughput. The model runs over a future time interval, for example the next five years. Its output helps the operator with decision making and can include the total cost of operating the system, production losses, spare parts usage, and the impact of weak links or bottlenecks in the system.
In what follows, we build a simplified RAM model based on Systems 1 and 2 described above. The following pump and system characteristics are used.
- Pump failure distribution: Weibull 2 Parameter with Shape Parameter Beta = 1.25 and Scale Parameter Eta = 20,000 hours
- Pump repair distribution: Triangular with Min = 2 hours, Mode = 8 hours and Max = 16 hours
- Input flow: 1,000 units per hour
- Maintenance cost: $1,000 per repair
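As a rough cross-check of these inputs, the sketch below computes the mean time to failure implied by the Weibull parameters and the mean repair time implied by the triangular distribution. It uses the standard closed-form expressions only and is not the output of a RAM tool; the function and variable names are hypothetical.

```python
import math

BETA = 1.25      # Weibull shape parameter (from the list above)
ETA = 20_000.0   # Weibull scale parameter, in hours (from the list above)


def weibull_reliability(t, beta=BETA, eta=ETA):
    """Probability that a new pump runs t hours without failing."""
    return math.exp(-((t / eta) ** beta))


def weibull_mttf(beta=BETA, eta=ETA):
    """Mean time to failure of a 2-parameter Weibull: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)


def triangular_mean(low, mode, high):
    """Mean of a triangular distribution: (low + mode + high) / 3."""
    return (low + mode + high) / 3.0


mttf_hours = weibull_mttf()
mean_repair_hours = triangular_mean(2, 8, 16)

print(f"Pump MTTF: {mttf_hours:,.0f} hours")
print(f"Mean repair time: {mean_repair_hours:.2f} hours")
print(f"Pump survival probability over 5 years: {weibull_reliability(5 * 8760):.3f}")
```

A shape parameter greater than 1 indicates a wear-out pattern: the failure rate of each pump increases with run time.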
The System 1 and System 2 RAM simulations are run over 5 years. The results are provided in Diagram 2 below. Adding a third pump eliminates system failures in the simulation: the number drops from 4.26 for System 1 to zero for System 2. Consequently, Reliability and Availability both rise to 100%.
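To illustrate the kind of calculation behind such results, here is a minimal Monte Carlo sketch in Python. It is not the RAM software used for this article, and it rests on simplifying assumptions that are mine: every pump ages on calendar time (including the standby pump and during system downtime), repairs restore a pump to as-good-as-new, switchover to the standby is instantaneous and always succeeds, and lost production equals system downtime multiplied by the input flow. The results should land in the same general range as the figures quoted above but will not reproduce them exactly.

```python
import random

HORIZON = 5 * 8760          # simulated period: 5 years, in hours
BETA, ETA = 1.25, 20_000    # Weibull shape and scale (hours)
FLOW = 1_000                # units lost per hour of system downtime


def repair_windows(rng):
    """Simulate one pump and return its (start, end) repair windows."""
    t, windows = 0.0, []
    while True:
        t += rng.weibullvariate(ETA, BETA)   # arguments: scale, shape
        if t >= HORIZON:
            return windows
        end = min(t + rng.triangular(2, 16, 8), HORIZON)  # low, high, mode
        windows.append((t, end))
        t = end                              # assumption: as good as new after repair


def system_downtime(pumps, spares):
    """Hours during which more than `spares` pumps are under repair."""
    events = sorted((time, step)
                    for windows in pumps
                    for start, end in windows
                    for time, step in ((start, 1), (end, -1)))
    down = 0
    last = total = 0.0
    for time, step in events:
        if down > spares:
            total += time - last
        down, last = down + step, time
    return total


def simulate(runs=2000, seed=1):
    """Return average System 1 failures and average lost units for both systems."""
    rng = random.Random(seed)
    failures = lost1 = lost2 = 0.0
    for _ in range(runs):
        pumps = [repair_windows(rng) for _ in range(3)]
        failures += len(pumps[0]) + len(pumps[1])              # any pump failure stops System 1
        lost1 += FLOW * system_downtime(pumps[:2], spares=0)   # System 1: 2 pumps, no spare
        lost2 += FLOW * system_downtime(pumps, spares=1)       # System 2: 3 pumps, 1 spare
    return failures / runs, lost1 / runs, lost2 / runs


avg_failures, lost_sys1, lost_sys2 = simulate()
print(f"System 1 failures over 5 years: {avg_failures:.2f}")
print(f"System 1 lost production: {lost_sys1:,.0f} units")
print(f"System 2 lost production: {lost_sys2:,.0f} units")
```

In this sketch, System 2 only loses production when two pumps happen to be under repair at the same time, which is rare given repair times of 16 hours or less.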
Is the pump investment financially justified? It depends…
In this example, justifying the redundant pump investment depends on the net incremental revenue. In other words, does the incremental revenue pay for the new pump? The net incremental revenue is the revenue from the incremental output minus the incremental operating cost. Let’s assume the following revenue and investment costs.
- 1 unit of production generates a revenue of $1
- A new pump purchased and installed costs $100,000
In Diagram 2 above, adding a third pump generates an incremental 37,583 units over 5 years. At $1 per unit, the gross incremental revenue is $37,583. After subtracting an incremental maintenance cost of $150, the net incremental revenue is $37,433.
Therefore, the investment in a new $100,000 pump is not justified based on 5 years of operation. Running the RAM model over multiple years provides the break-even point, that is, the number of years after which the pump investment is justified, as illustrated in Diagram 3 below.
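The arithmetic above can be written out as a short Python sketch. The figures come from the article; the function and variable names are hypothetical.

```python
UNIT_REVENUE = 1.0             # $ per unit of production
INCREMENTAL_UNITS = 37_583     # extra units over 5 years with the third pump
INCREMENTAL_MAINTENANCE = 150  # $ of extra maintenance over 5 years
PUMP_COST = 100_000            # $ purchased and installed


def net_incremental_revenue(units, unit_revenue, extra_cost):
    """Revenue from the incremental output minus the incremental operating cost."""
    return units * unit_revenue - extra_cost


net_5_years = net_incremental_revenue(
    INCREMENTAL_UNITS, UNIT_REVENUE, INCREMENTAL_MAINTENANCE
)
recovered_share = net_5_years / PUMP_COST

print(f"Net incremental revenue over 5 years: ${net_5_years:,.0f}")   # $37,433
print(f"Share of pump cost recovered: {recovered_share:.1%}")         # 37.4%
```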
Therefore, operating System 2 beyond 13 years justifies the extra pump investment. Note that more information put into the model refines the calculation further. Additionally, the time value of money (or Life Cycle Cost) is not considered in this analysis. Nevertheless, the RAM model is highlighted as the correct tool to make the final investment decision.
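The break-even figure can be approximated with a simple payback sketch. This is not the multi-year RAM run itself: it assumes, as a simplification of mine, that the net incremental revenue keeps accruing at the five-year average rate, and it ignores the time value of money, as the analysis above does. The names are hypothetical.

```python
PUMP_COST = 100_000      # $ purchased and installed
NET_5_YEARS = 37_433     # $ net incremental revenue over the first 5 years

# Assumption: constant annual net revenue equal to the 5-year average.
annual_net = NET_5_YEARS / 5


def break_even_years(investment, yearly_net):
    """Simple (undiscounted) payback period in years."""
    return investment / yearly_net


def first_full_year_paid_back(investment, yearly_net):
    """First whole year at which cumulative net revenue covers the investment."""
    year, cumulative = 0, 0.0
    while cumulative < investment:
        year += 1
        cumulative += yearly_net
    return year


payback = break_even_years(PUMP_COST, annual_net)
print(f"Average net incremental revenue: ${annual_net:,.0f} per year")
print(f"Simple payback period: {payback:.1f} years")
print(f"Investment recovered during year {first_full_year_paid_back(PUMP_COST, annual_net)}")
```

Under this linear assumption the payback falls between 13 and 14 years, which is consistent with the conclusion above that the pump is justified beyond 13 years of operation.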
Abdulrahman Alkhowaiter says
Great article subject, well written and logical discussion on this critical subject of Redundancy in machinery. It is definitely an important issue due to its impact on capital costs of new industrial facilities, and its impact on lifetime maintenance costs, plus the impact on plant availability which you have focussed on.
Would like to add that the article focussed on plant Availability; however, it did not look at the impact of Redundant machinery on the Reliability of the individual sister machines. In the majority of cases, there is a net Reduction in Equipment Reliability from having standby redundant machinery.
This article touches on the subject: https://www.linkedin.com/posts/abdulrahman-alkhowaiter-a7865318_rotating-equipment-switch-over-frequency-activity-7194981570515025920-gYC5?utm_source=share&utm_medium=member_desktop
André-Michel Ferrari says
Thanks Abdulrahman for your kind feedback on my paper. The second point you raise is an interesting one. From what I understand you question the reliability of standby pumps. You also reflect on the methodology of how to make sure those standby pumps are actually “available” when we need them.
I strongly believe that standby pumps improve the reliability of a system. We have to look at it from a systemic point of view. If the sister pump fails, the standby pump takes over, thus avoiding a system shutdown. HOWEVER they need to be MAINTAINED correctly. And by correctly, I mean that we need to reflect carefully on, research and implement the BEST Maintenance strategy that avoids unplanned failures or “on demand” failures.
NOW. Typically, we base the reliability of pumps on run time, and sometimes on other variables like cycle time. The more they run, the more they will be prone to deterioration. So failure modes are directly linked to run time. The point you raise is key. What if the failure modes were related to something else? Like being down itself and degrading from idleness, for lack of a better word. The latter should then be taken into account for those pumps as yes, this could affect reliability and availability. In other words, we need to study the effects of idleness.
The other point I want to add is the following. If we do a robust life analysis study for our pumps, then we should know with relative certainty when the running pumps are going to fail. If we know they are going to fail, then we have a preventive repair plan in the works. In this case, before the running pump is shut down, we can have a maintenance plan that activates or tests the standby pump in order to ensure that it does not fail when needed. In addition to the above, we would apply the learning of “static deterioration” to the same maintenance strategy.
To cut a long story short, your point is extremely valid. However, you might want to consider other approaches to solving it. Having said this, if there is ever a study done on the above, I’d be more than happy to collaborate.
Thanks Abdulrahman again for your valuable question and by all means keep them coming!
Abdulrahman Alkhowaiter says
Dear Andre,
Sorry, just found your response. Please note that yes, your understanding of the point raised is clear…in fact early on in my work history it never came to mind that the reliability of a single machine is higher when operating at N status than when operating at N+1, for example.
All machinery degrades while on standby mode, or during switchover modes. This was explained well in my article on switchover, and that means that we as humans think we are smart, but we introduced cold standby systems that led to higher availability for the redundant pump system, but lower reliability for each identical pump.
Now that you have an in-depth understanding of these variables, I would suggest that you rewrite your excellent article, but then expand it to include the reliability degradation that occurs from having standby machinery.
The Nuclear Industry is stuck with redundant machinery that not only incurs high maintenance but also suffers lowered reliability, and they would love to see and learn from your final published report in their reliability exchange conferences.
You will do their industry a major favor by teaching them about their weakness and developing a reliable solution to reduce time-related reliability deterioration in redundant machinery systems. Another point is that they have highly qualified engineers who will instantly recognise and respect your higher understanding of their critical issues.
Again, great article, excellent logical writing, let us see an expanded version soon,
Abdulrahman Alkhowaiter
André-Michel Ferrari says
Thanks for your valuable comments Abdulrahman. And also your suggestion. I have made a note of your suggestions to update the paper. I have to reflect on it; another idea would be to write a new article based on all the great ideas you came up with (e.g. nuclear industry). Thanks again for your interest in my work.
Telkom University says
What are the key parameters considered in RAM models when evaluating equipment redundancy?
André-Michel Ferrari says
Hello Hendrick. Sorry for the late response. Please find below my answer to your question.
• Equipment life characteristics (life analysis, repair distributions)
• Operating philosophy (e.g. how many blocks are required to run versus how many are on standby, or a k-out-of-n configuration)
• Cost of equipment purchase
• Operational costs (maintenance, repairs, spare parts and utilities if need be)
• Throughput (e.g. required production flow through the equipment)
• Revenue generated from every unit of production
• Other costs such as downtime penalties