For Reliability Engineers to communicate with one another and with non-technical colleagues, the language of reliability must be widely understood. The terms below form the backbone of a working vocabulary and should be familiar to any Certified Reliability Engineer.
Availability: Fraction of time that a system is usable. Steady-state Availability = MTBF / (MTBF + MTTR).
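As a quick illustration of the steady-state formula, here is a minimal Python sketch. The function name and the example figures (990 hours MTBF, 10 hours MTTR) are made up for illustration, and both inputs are assumed to be in the same time unit.

```python
def steady_state_availability(mtbf_hours, mttr_hours):
    """Return MTBF / (MTBF + MTTR); both arguments must use the same time unit."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical system: fails on average every 990 h and takes 10 h to restore.
availability = steady_state_availability(990.0, 10.0)

# Expected unusable hours in a year of continuous operation (8760 h).
downtime_hours_per_year = (1.0 - availability) * 8760.0

print(f"Availability: {availability:.4f}")
print(f"Expected downtime: {downtime_hours_per_year:.1f} h/year")
```

Because MTTR sits in the denominator, halving repair time improves availability by about as much as doubling MTBF, which is why serviceability and reliability are usually weighed together.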
Bathtub Curve: Named for the curve’s shape – a plot of the Hazard Rate over time, showing the early-life failures and wear-out failures adding to the constant, useful-life failures.
BIST: Built-In Self Test: embedded software that examines hardware functionality.
DPMO: Defects per Million Opportunities: the Six Sigma way to express defect rate or failure rate.
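The DPMO calculation can be sketched in a few lines of Python. The function name and the inspection numbers are hypothetical; the sketch assumes every unit offers the same number of defect opportunities.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects per Million Opportunities for a batch of inspected units."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("there must be at least one defect opportunity")
    # Multiply before dividing so round numbers stay exact.
    return defects * 1_000_000 / total_opportunities


# Hypothetical inspection: 500 boards, 20 solder joints each, 25 bad joints.
example_dpmo = dpmo(defects=25, units=500, opportunities_per_unit=20)
print(f"DPMO: {example_dpmo:.0f}")
```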
Early-Life Failures: Failures occurring in the first 90 days of product life, caused by hidden defects. As the failures are repaired, the product population’s Hazard Rate decreases.
ESS: Environmental Stress Screening: manufacturing process where 100% of production products are subjected to stresses beyond specifications, to precipitate hidden defects. Detecting and repairing these (now visible) defects before shipment removes them as a source of Early-Life Failures.
Failure: Not performing intended functions or not adhering to applicable specifications.
Failure Mechanism: A degradation process (e.g. corrosion, wear) that causes a failure.
Failure Mode: The result of a failure mechanism: the way a degradation process causes a particular system or component to fail (e.g. open circuit resistor).
Failure Rate: Ratio of device failures per unit time to the number of devices that could have failed. Field failure rates are typically expressed in percent per year.
HALT: Highly Accelerated Life Test: subjecting a few prototype units to stresses well beyond specification, to expose failure modes that will occur when many devices are run for an extended time at customer stress levels.
HASS: Highly Accelerated Stress Screening: a special version of ESS which exposes 100% of production product to stresses beyond operational limits to more quickly precipitate latent defects, and then runs product within operational limits to detect these precipitated (now visible) defects.
Hazard Rate: Conditional failure rate: the probability of a device failing per unit time given that it hasn’t failed yet. When reliability is high, this is approximately the failure rate.
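The difference between the two rates can be seen in a small Python sketch. It assumes, for illustration only, that the failure rate is referenced to the original population while the Hazard Rate is referenced to the units still surviving at the start of each interval. The function name and the counts are made up.

```python
def interval_rates(initial_units, failures_per_interval, interval_hours):
    """Return (survivors_at_start, failure_rate, hazard_rate) for each interval.

    failure_rate: failures per hour relative to the original population.
    hazard_rate:  failures per hour relative to units that have not failed yet.
    """
    rows = []
    survivors = initial_units
    for failed in failures_per_interval:
        if survivors <= 0:
            break  # nothing left that could fail
        failure_rate = failed / (initial_units * interval_hours)
        hazard_rate = failed / (survivors * interval_hours)
        rows.append((survivors, failure_rate, hazard_rate))
        survivors -= failed
    return rows


# Hypothetical field data: 1000 units, 10 failures in each 100-hour interval.
rows = interval_rates(1000, [10, 10, 10], 100.0)
for survivors, failure_rate, hazard_rate in rows:
    print(f"{survivors:5d} survivors  "
          f"failure rate {failure_rate:.3e}/h  hazard rate {hazard_rate:.3e}/h")
```

While nearly all units are still working, the two columns are almost identical; they drift apart only as the surviving population shrinks.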
Load: General term for stress applied to a component. This could be the torque on a bolt, current through a diode, voltage on a capacitor, wattage dissipated in a resistor, etc.
MEOST: Multiple Environment Over Stress Test: synergistic life test where several stresses are applied at the same time, increasing proportionately as the test progresses. Weibull Analysis is used with the percent of maximum customer stress as the x-axis.
MTBF: Mean Time Between Failures: a measure of repairable system reliability, typically expressed in hours or months, equal to the reciprocal of the Hazard Rate when it is constant (when failures are random in time – the exponential distribution).
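When the Hazard Rate is constant, the exponential distribution gives the probability of running for time t without failure as R(t) = exp(-t / MTBF). The sketch below works through that relationship; the function names and the 1000-hour MTBF are illustrative assumptions.

```python
import math


def hazard_rate_from_mtbf(mtbf_hours):
    """Constant hazard rate (failures per hour) as the reciprocal of MTBF."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 / mtbf_hours


def reliability(t_hours, mtbf_hours):
    """Probability of surviving t_hours under a constant hazard rate."""
    return math.exp(-t_hours * hazard_rate_from_mtbf(mtbf_hours))


# Hypothetical system with an MTBF of 1000 hours.
mtbf = 1000.0
for t in (100.0, 500.0, 1000.0):
    print(f"R({t:.0f} h) = {reliability(t, mtbf):.3f}")
```

Note that the reliability at t = MTBF is exp(-1), roughly 37%, so an MTBF figure should not be read as a failure-free lifetime.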
MTTF: Mean Time To Fail: a measure of a non-repairable component’s reliability, typically expressed in hours or months. The determination that a component is non-repairable is often an economic choice rather than a physical impossibility.
MTTR: Mean Time To Repair: a measure of repairable system serviceability, typically expressed in minutes or hours. It is measured from when a problem is discovered until the system is usable again, although in some industries it is more narrowly defined.
Reliability: The probability that a device will perform its intended function for a specified time interval, under stated environmental conditions and use conditions.
Screening Regimen: The sequence of temperature cycles, vibration spectra, voltage variation, power cycles, etc., administered to 100% of production product during ESS or HASS.
Useful-Life Failures: Failures occurring randomly in time throughout product life, caused by too little design margin between component strength and application load. These create a constant Hazard Rate, which forms the flat base of the Bathtub Curve.
Wear-Out Failures: Failures occurring more frequently as time progresses. History is being recorded in the affected component or part; some process (e.g. corrosion, fatigue, wear) is consuming a reservoir of material (e.g. volume of lubricant, material thickness). These create an increasing Hazard Rate.
Weibull Analysis: Finding the best fit of a family of curves (Weibull functions) to a set of data by adjusting two or three parameters to optimize the goodness-of-fit. When applied to life data for a single failure mode, the nature of the failure (early-life, useful-life, or wear-out) can be deduced, leading to the appropriate corrective action arena.
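One common way to fit a two-parameter Weibull is median rank regression, sketched below with only the Python standard library. This is one fitting method among several, not necessarily the one any particular tool uses. It assumes complete (uncensored) failure times for a single failure mode, uses Benard's approximation for the median ranks, and regresses y on x. The function names, the sample data, and the 0.1 tolerance band around a shape parameter of 1 are all illustrative assumptions.

```python
import math


def median_ranks(n):
    """Benard's approximation to the median rank of each ordered failure."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def fit_weibull_2p(failure_times):
    """Least-squares fit of ln(-ln(1 - F)) = beta * ln(t) - beta * ln(eta).

    Returns (beta, eta): the shape and scale parameters.
    """
    times = sorted(failure_times)
    n = len(times)
    if n < 2 or times[0] <= 0:
        raise ValueError("need at least two positive failure times")
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - f)) for f in median_ranks(n)]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("failure times must not all be identical")
    beta = sxy / sxx
    intercept = y_mean - beta * x_mean
    eta = math.exp(-intercept / beta)
    return beta, eta


def classify_failure_nature(beta, tolerance=0.1):
    """Map the shape parameter to a Bathtub Curve region.

    The tolerance band around beta = 1 is an arbitrary illustrative choice.
    """
    if beta < 1.0 - tolerance:
        return "early-life"   # decreasing hazard rate
    if beta > 1.0 + tolerance:
        return "wear-out"     # increasing hazard rate
    return "useful-life"      # roughly constant hazard rate


# Made-up failure times (hours) for a single failure mode.
sample_hours = [412.0, 655.0, 790.0, 905.0, 1020.0, 1180.0, 1350.0]
beta, eta = fit_weibull_2p(sample_hours)
print(f"beta = {beta:.2f}, eta = {eta:.0f} h -> {classify_failure_nature(beta)}")
```

With real data, a fit this simple is only a starting point: small samples, suspended (unfailed) units, and mixed failure modes all call for more careful treatment than this sketch provides.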
This is not a complete list; we can add to it, or publish posts with other lists, as we go along. This short list is just to get you started.