A recent forum post suggested that many engineers and managers who develop products or maintain equipment are apathetic about reliability knowledge. “They don’t care!” were the poster’s words. Has this happened? Have we lost the ability to care about reliability?
In a course I teach on reliability engineering management, I ask my students to find an advertisement that uses reliability as a central theme or claim. This isn’t very hard to do, and I’ve regularly been surprised at the range of uses advertising finds for the concept of reliability. ‘Reliable Movers’ claims to reliably and safely move your belongings to your new home. An ad for a reliable shotgun ammunition-loading device suggests each shell will fire reliably. Many other advertisements invoke the basic ideas of consistency, repeatability, safety, and trustworthiness through the terms reliable or reliability. The word carries a common and positive association.
The engineering definition of reliability is similar, yet very specific:
The probability of successful operation or function over a defined period, in a specified environment.
There are only four elements: probability, duration, function and environment.
Most agree this is correct and useful. The function is often well understood, and it is clear when the equipment is working or has failed. The environment includes the weather and use profiles, and is also generally well understood. Duration is the expected time period of use. It is the probability term that tends to muddle the understanding.
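To make the probability term a little less abstract, here is a minimal sketch of how the four elements can turn into a number. It assumes a constant failure rate (the exponential model), which is only one of many possible life models and is often not the right one. The function name, the pump example, and the 50,000-hour MTBF are hypothetical values chosen for illustration, not data from any real product.

```python
import math


def reliability_exponential(duration_hours: float, mtbf_hours: float) -> float:
    """Probability of operating without failure for duration_hours.

    Assumption: constant failure rate, so R(t) = exp(-t / MTBF).
    The function (what counts as "working") and the environment are
    assumed to be baked into the MTBF value supplied by the caller.
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if duration_hours < 0:
        raise ValueError("duration_hours must not be negative")
    return math.exp(-duration_hours / mtbf_hours)


# Hypothetical example: a pump (function) running indoors (environment)
# for one year of continuous use (duration), with an assumed MTBF of 50,000 hours.
one_year_hours = 8760
pump_reliability = reliability_exponential(one_year_hours, 50_000)
print(f"Probability of surviving one year: {pump_reliability:.3f}")
```

Under these assumptions, roughly 84 out of 100 such pumps would be expected to run the full year without failure. Stating the goal this way, as a probability over a duration for a given function and environment, is what separates the engineering definition from the advertising one.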
Is statistics to blame for the apathy? I would suggest not entirely. Some of it could be the delay in feedback to design and maintenance teams. It takes time to find out whether a product actually meets its reliability objectives. If a solar panel is designed to still deliver 80% of its rated power after 20 years, it will take 20 years to observe the actual performance.
Another factor contributing to reliability knowledge apathy is the reward system in many organizations. In some organizations, the hero who steps in to fix a major problem is visible, recognized and rewarded, whereas the engineer who identifies and avoids major failures is just doing her job. She may have saved the organization millions of dollars in warranty expenses, yet her actions are often unseen, rarely recognized and hardly rewarded. People do work for status, and being a hero brings status. That tends to encourage apathy, because without failures to fix there are fewer chances to be the hero.
Another factor may be the specialization of engineering work. The mechanical engineer focuses on material strength, fastening options, and efficient transfer of energy or motion. The electrical engineer focuses on power consumption, circuit speed, and timing. A system’s fan is often treated as nothing more than a source of cooling airflow. Yet the fan is a complex electromechanical device that falls outside both domains: electrical engineers provide it power and benefit from the cooling, and mechanical engineers provide support, attachment, and location. Neither spends the time to select the fan with its reliability in mind, and neither has the cross-discipline knowledge the fan demands. As a result, fans are among the design elements that commonly fail.
Of course, time to market, throughput, cost and management priorities all contribute to the apathy. The rewards, directives, encouragements and priorities often do not include any aspect of reliability.
Building Reliability Knowledge
So, what do we do? How can we diminish the apathy around reliability?
I do not know exactly, and I have only a few suggestions. As I continue to work with teams and learn what instills a passion for reliability, I’ll update this blog. In the meantime, here’s what I’ve seen make at least some difference:
- Awareness – let others know the engineering definition of reliability and articulate it clearly for your projects.
- Value – use reliability-engineering tools that enable decisions.
- Goals – set and track progress toward meaningful reliability objectives.
- Success – document and celebrate the successful avoidance of reliability problems (not just the heroic fixes).
- Math – embrace the math around reliability statistics. Talk with data.
- Generalize – work with specialists to find gaps in system reliability understanding.
There are more ways to avoid or reverse reliability knowledge apathy. What have you found that works? What are your success stories?