Guest Post by Carl Carlson (first posted on CERM ® RISK INSIGHTS – reposted here with permission)
“Destiny is no matter of chance. It is a matter of choice. It is not a thing to be waited for, it is a thing to be achieved.” – William Jennings Bryan
The Oxford English Dictionary defines “reliability” as “the quality of being able to be trusted to do what somebody wants or needs.”
The textbook definition for “reliability” is “the probability that an item will perform its intended function for a designated period of time without failure under specified operating and environmental conditions.”
In this article, I will share a brief outline of the current and future state of reliability engineering, what works and doesn’t work, and why it matters to all of us.
The Field of Reliability Engineering
There are two primary bodies of knowledge in the field of reliability:
- Reliability engineering is concerned with knowledge and application of specific tasks and methods that achieve high reliability
- Reliability management is concerned with how to establish and successfully execute reliability programs that create safe and reliable products
Reliability engineers provide essential support to all engineering fields, including automotive, electronic devices, energy, food services, infrastructure, utilities, and others. Their objective is to help create products that are highly reliable (trouble-free) throughout their entire useful life.
One of my first jobs in the field of reliability was managing a test lab in the 1980s. We tested vehicle door systems and components against reliability requirements. When I started, I was still learning what “reliability” meant. I noticed that each year, development times were shortened to be more competitive, the budget was tightened to be more profitable, and the target for reliability was higher to meet customer expectations.
This presented a challenge: How do you 1) meet fast time-to-market, 2) stay within tight budgets, and 3) exceed customer expectations for reliability? My boss told me that we could only get two of the three. That’s what he said. He told me that reliability costs money and is a trade-off with product development cost and timing. My instincts told me it was possible, and even necessary, to meet all three challenges. At the time, I just didn’t know how.
I came to the conclusion that to overcome these challenges, and manage increasing product complexity, we needed to change how we develop products. Below are a few of the strategies adopted by many of the best companies that helped to address these challenges.
- Shifting from silo-budgeting to life-cycle budgeting. This allowed up-front reliability efforts to be costed over the life of the product.
- Getting off the test-and-fix treadmill. Test-and-fix was a strategy that was necessary if the inherent product design was unreliable.
- Moving to a more science-based approach to reliability, focusing on the mechanisms of failure and understanding the physics behind why things fail.
The Early Reliability Vision
In the early 1990s, I became a manager of product reliability for a large automotive company. My responsibilities included helping young engineers develop their careers. One of them asked me about future opportunities in reliability engineering. I thought for a minute, and proceeded to share my thoughts on the value of providing reliability support to product teams. At the end I added, “in essence, you are trying to work yourself out of a job.” The engineer looked puzzled. So, I drew an illustration, shared below.
In those early years, reliability was an after-thought. Design engineers made their designs and reliability engineers predicted the outcome. Yes, there were PhDs and lots of calculations. But little was done to incorporate reliability into the actual design. I remember asking myself: should I spend my time calculating the reliability of a poor design, or would my time be better spent helping to improve the inherent design reliability?
What follows is a computer rendition of the handwritten illustration I provided the young engineer, demonstrating what I was trying to convey. Remember, this was 30 years ago . . . physics of failure modeling was just beginning, Highly Accelerated Life Testing (HALT) had not yet been broadly accepted, and Artificial Intelligence was but a distant dream.
What did I mean when I said “you are trying to work yourself out of a job”? I meant that the reliability engineer needed to work less on doing reliability predictions, and more on getting reliability methods integrated into the design process. The future vision included shifting the focus of achieving reliability from an after-thought calculation to an inherent design capability. This became known as Design for Reliability. This did not mean there would never be a need for reliability engineers; it meant they had to work with design teams to achieve good designs, good manufacturing processes, and good maintenance.
Thus began my long journey to help transform reliability engineering from reactive to proactive. And, today, that journey is continuing and even accelerating.
How Do You Achieve High Reliability?
Until recently, reliability engineers in many companies primarily provided analyses and calculations. They answered questions like: What is the prediction for reliability? Are we good enough to pass the next gate? How much warranty budget will we need? However, answering these questions is not sufficient to bring about high reliability with systems that are getting more complex by the day, and that have shorter development times and tighter budgets. To do this, a change in thinking has to occur.
Below is a chart I used when I started teaching reliability engineering. I call it the “Reliability Equation.”
Where should the emphasis be? On the left side or the right side of the equation?
I believe the best use of time and energy is on the right side of the equation. This means reliability engineers must use more than analytical skills. They also need to use influencing skills. And, they need to understand the underlying mechanisms of failure, so they can work with engineering teams to achieve safe, reliable and economical products and processes.
Fred Schenkelberg and I are writing a book about achieving high reliability. It’s a guide for engineers and managers, sharing our years of experience. Below is an illustration from the book, which includes many examples. Each of these steps is a team effort, involving reliability engineers as part of engineering teams.
Three important statements summarize the best practice reliability philosophy of successful companies. The concepts are not trivial and comprise a high-level outline for the future of reliability engineering as a viable discipline.
- Reliability must be designed into products and processes.
For complex systems, this will involve modeling of systems, and using specialized tools to achieve robust and failure-free system and component designs. In the future, with systems relying more and more on software, the field of software reliability needs to expand from the purview of a small number of experts to mainstream software development.
- Knowing how to calculate reliability is important, but knowing how to achieve reliability is more important.
Achieving high reliability in the future world will require expanded partnerships with design teams to influence the design process. This may involve co-location of reliability engineers with design teams (virtually or in-person), so they can help project teams make the best possible decisions. Critical to success is that reliability engineers need to focus their efforts on the “vital few” methods and tasks that best achieve reliability objectives. See “The Vital Few” below.
One of the changes that needs to take place involves the metric for measuring and characterizing reliability. It is past time to retire the use of Mean Time Between Failures (MTBF) as a primary metric for reliability specifications. It often rests on wrong assumptions and does not support good decisions. Much has been written on this subject on AccendoReliability.com.
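To make the MTBF concern concrete, here is a minimal sketch in Python. It is an illustration of my own framing, not a method taken from the sources above: it assumes times to failure follow a Weibull distribution, and the 10,000-hour MTBF, the 1,000-hour mission time, and the three shape values are hypothetical numbers chosen only for the example. It shows that products with the same MTBF can have very different probabilities of surviving a given mission, and that under the common constant-failure-rate assumption only about 37% of units are expected to survive to the MTBF.

```python
import math

# Hypothetical values, chosen only for illustration.
MTBF_HOURS = 10_000.0     # the single number a specification might quote
MISSION_HOURS = 1_000.0   # the period of use we actually care about


def weibull_reliability(t, beta, eta):
    """Probability of surviving to time t under a Weibull model.

    beta is the shape parameter, eta is the scale parameter.
    """
    return math.exp(-((t / eta) ** beta))


def weibull_scale_for_mean(mean_life, beta):
    """Scale parameter that gives the requested mean life for a given shape.

    Uses the Weibull mean: mean = eta * Gamma(1 + 1/beta).
    """
    return mean_life / math.gamma(1.0 + 1.0 / beta)


def compare_shapes(betas=(0.5, 1.0, 3.0)):
    """Mission reliability for several shapes that all share the same MTBF."""
    results = {}
    for beta in betas:
        eta = weibull_scale_for_mean(MTBF_HOURS, beta)
        results[beta] = weibull_reliability(MISSION_HOURS, beta, eta)
    return results


if __name__ == "__main__":
    for beta, reliability in compare_shapes().items():
        print(f"shape={beta}: R({MISSION_HOURS:.0f} h) = {reliability:.4f}")

    # Constant failure rate (shape = 1): survival to the MTBF itself is exp(-1).
    at_mtbf = weibull_reliability(MTBF_HOURS, 1.0, MTBF_HOURS)
    print(f"R(MTBF) with constant failure rate = {at_mtbf:.3f}")
```

In this sketch, a shape below 1 represents early-life failures, a shape of 1 a constant failure rate, and a shape above 1 wear-out. All three cases report the same MTBF, yet the chance of completing the mission differs widely, which is why a reliability value at a stated time and under stated conditions tends to support better decisions than MTBF alone.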
- Reliability practices must be integrated into overall product development and design processes.
Reliability is not an office down the hall. It is a mind-set, using a set of tools that integrate with the engineering of systems. For example, machine learning and artificial intelligence are part of the future of engineering. These strategies will need to include reliability capability, which requires reliability engineers having a seat at the AI table. Engineering management will need a working knowledge of how to integrate reliability methods into the job descriptions of their engineers. This includes integrating reliability methods with risk management techniques. After all, the language of management is risk, not reliability.
All of this means the future for reliability engineers involves a balance between technical skills and communication skills. It will take work, because communication skills are not well taught in university engineering curricula. Fortunately, they can be learned.
A word on improving communication. The future of work will require practitioners who are adept at communicating both in-person and virtually. A large part of communication involves the ability to read and respond to body language. As we advance the technology for our virtual connections, it is essential to enable visual clarity of the people who are communicating with each other, to avoid missing vital cues, and to get people deeply involved in collaborative work processes.
The Vital Few
“Things which matter most must never be at the mercy of things which matter least.”
― Johann Wolfgang von Goethe
The field of reliability has an abundance of methods, some of which support the future world better than others. For example, methods such as Failure Mode and Effects Analysis (FMEA) and Highly Accelerated Life Testing (HALT) support the transition to Design for Reliability, and need to be reinforced. Others, such as Reliability Growth Analysis, which relies heavily on test-and-fix, need to be streamlined or replaced.
Shifting the focus to designing in reliability and maintainability means competition for the time and attention of product engineers, process engineers, maintenance engineers, and project teams. It is absolutely essential to prioritize tasks and maintain focus on the “vital few” methods and tasks that achieve objectives. It is not enough to make long lists of things to do. Reliability engineers must identify and execute what is most vital and beneficial to the engineering teams.
The future for reliability engineering, as well as most of engineering, involves smart use of artificial intelligence, modeling, automation, and virtual tools. Systems are getting more and more complex, and these tools are essential to optimize designs. However, there will always be a need for teams in achieving high reliability. People have “blind spots,” and well-defined cross-functional teams minimize the errors inherent in “blind spots” and discover things that individuals and automation can miss. In addition, by following the best practice reliability philosophy of successful companies outlined above, reliability engineers and management can help meet the challenges and ensure future products are safe and reliable.
© 2021 Carl S. Carlson
Carl S. Carlson is a consultant and instructor in the areas of Failure Mode and Effects Analysis and other reliability and quality disciplines, supporting clients from a wide cross-section of industries. He has over 35 years of experience in reliability testing, engineering, and management positions, including senior manager for advanced reliability at General Motors. He has a bachelor’s degree in mechanical engineering from the University of Michigan. He is a Senior Member of the American Society for Quality, and a Certified Reliability Engineer. His book, Effective FMEAs, was published by John Wiley & Sons in 2012, and he regularly writes and podcasts on AccendoReliability.com.
Contact Carl at email@example.com with your comments, ideas or questions.