The LinkedIn ASQ RRD group published this question from a reliability manager. Replies included:
- “Beta (shape parameter) should be close to 1 for more useful life. But it should not be less than 1.”
- “For Beta you would like to get as close to one as possible.”
- “A Shape of 1 within warranty is good.”
- “It depends on B2B, yes it should be close to 1 that’s within warranty.”
Reliability is a function of time or age!!!
When Weibull shape parameter beta =1, there is a constant failure rate. Wouldn’t you prefer to minimize the failure rate and extend product life beyond warranty? A better Weibull shape parameter value depends on the time or age at which you would like to maximize reliability.
Table 1. Spreadsheet snippet of the best shape parameter, beta, for specified life and scale parameters for Weibull reliability function R(t).
Parameter | Value |
Time or life, days | 365 |
Eta | 1000 |
Beta | 300 |
Reliability R(t) | Exp[-(365/eta)^beta] = 1.0 |
∂R(t)/∂beta | 1.4E-179 |
I.e., make beta as large as possible to postpone failure by maximizing R(t) = Exp[‑(t/eta)^beta] for fixed age or time t [Morice]. That maximum is where the derivative of reliability with respect to beta is zero,
∂R(t)/∂beta = -(t/eta)^beta Log(t/eta)/Exp[(t/eta)^beta].
Making beta equal 300 achieves reliability of 1.0. My spreadsheet blew up if beta is much larger than 300.
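The reliability and derivative formulas can be checked outside a spreadsheet. The following is a minimal sketch, assuming Python in place of the spreadsheet and using the Table 1 values t = 365 and eta = 1000; the function names are hypothetical. For t less than eta, Log(t/eta) is negative, so the derivative stays positive for every finite beta and only approaches zero as beta grows.

```python
import math

def weibull_reliability(t, eta, beta):
    """R(t) = Exp[-(t/eta)^beta]."""
    return math.exp(-((t / eta) ** beta))

def d_reliability_d_beta(t, eta, beta):
    """dR(t)/dbeta = -(t/eta)^beta * Log(t/eta) * Exp[-(t/eta)^beta]."""
    x = t / eta
    return -(x ** beta) * math.log(x) * math.exp(-(x ** beta))

t, eta = 365.0, 1000.0  # life in days and scale parameter from Table 1
for beta in (0.5, 1.0, 2.0, 5.0, 20.0):
    r = weibull_reliability(t, eta, beta)
    slope = d_reliability_d_beta(t, eta, beta)
    print(f"beta={beta:5.1f}  R(t)={r:.6f}  dR/dbeta={slope:.3e}")
```

Running the loop with t greater than eta reverses the sign of the derivative, which is one way to see that the "best" beta depends on the age at which reliability is evaluated.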
Beta = 300 is ridiculous. Perhaps the reliability manager should ask, what Weibull parameters minimize the discounted, net present value of future costs? Scrap? Rework? Infant mortality? Premature wearout? Uncertainty? Infant mortality could make beta less than 1.0. Others have used proportional hazards Weibull to fit engine data [Jardine et al.]. Don’t do that if you see Weibull plots with infant mortality as in figure 1.
Does your product have a warranty? Do returns bunch around warranty expiration? If so, does a Weibull failure rate function fit the glitches in the failure rate function around warranty expiration, or around inspection or preventive maintenance times? Retirement and attrition could make the failure rate function decrease after wearout [George 1994]. Weibull does not fit those unreliability phenomena. How should we allocate resources to control non-Weibull reliability parameters?
Why not use Nonparametric Reliability Estimates?
Nonparametric reliability estimates don’t have parameters! But they have actuarial failure rates, a(t) = (R(t-1)-R(t))/R(t-1). Infant mortality shows up in a(0), a(1),… WEAP (Warranty Expiration Anticipation Phenomenon) shows up around a(12) if warranty is 12 months. PMs show up periodically. Wearout shows as increasing actuarial failure rates. Attrition and retirement show as decreasing actuarial failure rates in old age, maybe after wearout. Actuarial failure rates are parameters, so optimize them, including uncertainty.
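The actuarial rate formula a(t) = (R(t-1)-R(t))/R(t-1) is easy to compute from any reliability estimate. Here is a minimal sketch, assuming a list of reliability values at ages 0, 1, 2, … with R(0) first; the function name and the sample numbers are hypothetical, not data from this article.

```python
def actuarial_rates(reliability):
    """Return [a(1), a(2), ...] where a(t) = (R(t-1) - R(t)) / R(t-1).

    reliability[0] is R(0); element k of the result is the rate for age k+1.
    """
    rates = []
    for prev, curr in zip(reliability, reliability[1:]):
        # If nothing survives to age t-1, the rate is undefined.
        rates.append((prev - curr) / prev if prev > 0 else float("nan"))
    return rates

# Hypothetical monthly reliability estimates, for illustration only
R = [1.0, 0.97, 0.95, 0.94, 0.93, 0.90]
for age, rate in enumerate(actuarial_rates(R), start=1):
    print(f"a({age}) = {rate:.4f}")
```

A bump in the printed rates at a particular age is the kind of feature (WEAP, PM intervals) that a two-parameter Weibull cannot reproduce.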
How is Your Portfolio Managed?
Regard alternative reliability improvements as a portfolio of alternative investments. Should you try to fix infant mortality? Reduce constant failure rate? Reduce premature wearout? Reduce uncertainty in spare parts’ demands? Budget is limited. This is a typical stochastic programming question. George Dantzig gets credit for describing stochastic programming in 1955. (He may have mentioned that in our UC Berkeley 1965 Linear Programming class.) See the reference by Infanger for a collection of papers on stochastic programming, portfolio analysis, stochastic optimization, risk vs. uncertainty, etc. The rest of this article is about stochastic optimization of reliability instead of trying to guess a Weibull parameter.
When I was in the UCLA MBA program, someone told me to read Harry Markowitz’ thesis. Markowitz worked in the UCLA Econ department, upstairs from the MBA classrooms. So I read his thesis. It proves that a diversified portfolio could earn the same rate of return with lower variance than a single investment with the same specified return rate, because of correlations among alternative investments. This led to “Modern Portfolio Theory” [Modern portfolio theory – Wikipedia] and mean-variance optimization. I didn’t have any money, otherwise I might have been rich today. In the 1980s I wrote a computer program to do mean-variance optimization for Lawrence Livermore Lab retirement plan alternatives and gave the program to friends. Today I use a spreadsheet version [Portfolio Analysis – Invest Excel].
Choosing reliability improvements differs from mean-variance optimization (MVO). In MVO, you specify the portfolio mean return rate that you want, and MVO finds alternative investment percentages to minimize the variance of portfolio return rate. In reliability improvement decisions, you might want to allocate resources to optimize the mean returns rate (complaints, failures, repairs, replacements, recalls, etc.) AND returns’ variance, subject to a budget constraint. E.g., optimize expected demands for product services AND the demand variance (uncertainty).
Proportional Hazards (PH) Model Quantifies Controllable Variables?
A PH failure rate function model is a(t;Z)= ao(t)Exp[βZ] where ao(t) is an underlying, common failure rate function and Z is a vector of “concomitant” variables representing different generations or alternative designs, processes, environments, or failure modes. Imagine several generations of products designed the same way by the same designers, produced by the same processes, shipped and installed the same way, and sold to the same customers in the same environments. They will probably have proportional failure rate functions.
“Credible Reliability Prediction” [George, 2020] uses a simple PH model to predict a new product’s failure rate function, λ(t;new) ≈ λ(t;old)*MTBF(old)/MTBF(new). This is because, during design of a new product, all that is known about the new product is MTBF(new) and the failure rate function(s) and MTBF(s) of older generations of similar products or parts.
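The PH model and the MTBF-ratio prediction are both simple rescalings of an underlying failure rate function. The sketch below assumes rates stored as a list indexed by age; the function names and example values are hypothetical and are not taken from “Credible Reliability Prediction”.

```python
import math

def ph_rate(base_rate, beta, z):
    """a(t; Z) = ao(t) * Exp[beta.Z] for a single age t."""
    return base_rate * math.exp(sum(b * zj for b, zj in zip(beta, z)))

def predict_new_rates(old_rates, mtbf_old, mtbf_new):
    """lambda(t; new) ~= lambda(t; old) * MTBF(old) / MTBF(new), age by age."""
    factor = mtbf_old / mtbf_new
    return [rate * factor for rate in old_rates]

# Hypothetical old-generation rates and MTBFs, for illustration only
old = [0.020, 0.010, 0.008, 0.012]
new = predict_new_rates(old, mtbf_old=1000.0, mtbf_new=2000.0)
print(new)
```

Doubling the MTBF halves every age-specific rate in this approximation, which keeps the shape of the old failure rate function and changes only its level.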
“Concomitant” (“naturally accompanying or associated”) variables may be controllable variables more directly related to reality, alternatives, or actions than Weibull parameters. Imagine that recurrent processes (renewals [George 2008], repairs, replacements, etc.) have underlying TBF (time-between-failures) failure rate functions with proportionality factor equal to ao(t)Exp[∑β*i] where i is the i-th renewal and ao(t) is the failure rate function of TTFF (time-to-first-failure).
What is “Better Reliability”, Your Reliability Objective?
Optimizing reliability at one age is short-sighted. Optimizing reliability over a product’s useful life seems more reasonable. How to translate that into an objective? Perhaps you should ask, what parameters (actuarial failure rates ao(t) or concomitant variable values Z) minimize the discounted, net present value of future costs over some accounting or budget time interval?
Consider the objective to be the expected number of returns or failures in some future time or accounting interval. Use an actuarial forecast of the number of failures in zero to age t, ∑a(t-s)n(s), s=0,1,2,…,t, where n(s) is the installed base of age s. The actuarial forecast is an estimate of the mean number of failures.
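The actuarial forecast ∑a(t-s)n(s) is a short sum. A minimal sketch, assuming rates and installed base are lists indexed from 0; the names and numbers are hypothetical.

```python
def actuarial_forecast(rates, installed_base, t):
    """Sum over s = 0, 1, ..., t of a(t-s) * n(s)."""
    return sum(rates[t - s] * installed_base[s] for s in range(t + 1))

# Hypothetical actuarial rates a(0..3) and installed base n(0..3)
a = [0.05, 0.02, 0.01, 0.01]
n = [100, 120, 150, 130]
print(actuarial_forecast(a, n, t=3))
```

For these numbers the sum is a(3)n(0) + a(2)n(1) + a(1)n(2) + a(0)n(3) = 1.0 + 1.2 + 3.0 + 6.5.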
Optimize allocation of money to actuarial rate estimates subject to budget constraint $B by equating marginal rates of return, (∂∑a(t-s)n(s)/∂a(t-s))*(∂a(t-s)/∂money) aka “bang-per-buck”, so that total expenditure is $B [Marginal rate of substitution – Wikipedia]. The forecast derivative ∂∑a(t-s)n(s)/∂a(t-s) = n(s); that means start spending on the actuarial rate a(t-s) corresponding to the largest n(s)*(∂a(t‑s)/∂money), the combination of cohort size n(s) and cost to change its actuarial rate (∂a(t-s)/∂money). Remember the law of diminishing marginal returns! I did this for redundancy allocation in series-parallel systems [George, 2004 article and workbook available from author]. It was based on work by Alice Smith [Coit and Smith].
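The bang-per-buck rule can be sketched as a greedy allocation: spend each small increment of budget on the actuarial rate whose forecast contribution falls fastest per unit of money. The sketch below is illustrative only. The exponential response model, rate × exp(-response × spend), is an assumption chosen because it has diminishing marginal returns; it is not from this article or from the 2004 workbook, and the function name is hypothetical.

```python
import math

def allocate_budget(rates, cohort_sizes, response, budget, step=1.0):
    """Greedy bang-per-buck allocation of a budget across actuarial rates.

    Assumed (hypothetical) cost model: spending x on rate k changes it to
    rates[k] * exp(-response[k] * x). cohort_sizes[k] is the installed base
    that multiplies rates[k] in the forecast.
    """
    spend = [0.0] * len(rates)

    def marginal(k):
        # Reduction in the forecast per unit of money at the current spend
        return cohort_sizes[k] * rates[k] * response[k] * math.exp(-response[k] * spend[k])

    remaining = budget
    while remaining > 1e-12:
        increment = min(step, remaining)
        best = max(range(len(rates)), key=marginal)
        spend[best] += increment
        remaining -= increment
    return spend

# Hypothetical inputs: a small cohort with a high rate, a large cohort with a low rate
print(allocate_budget([0.05, 0.02], [100, 1000], [0.01, 0.01], budget=100.0))
```

With a small enough step, the spending levels approach the point where the marginal bangs-per-buck are equal for every rate that receives money, which is the condition described above.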
How to Improve a Failure Forecast and Reduce its Variance?
Have you ever wondered how to optimize some function of reliability AND minimize uncertainty about that function of reliability? Someone claimed to find factors that reduced the mean the most AND reduced the variance the most. Why be concerned about variance? FUD (Fear, Uncertainty and Doubt) and VUCA (Volatility, Uncertainty, Complexity, and Ambiguity)!
Failures may require spares, and variance of the forecast induces overstocking spares to avoid backorders. Stock level = failure forecast + safety stock, at cost equal to the cost of the stock level plus holding costs, including costs of unsold or unused safety stocks. So why not optimize average failures and the variance of the number of failures by changing the failure rate function?
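To see how forecast variance drives safety stock, here is a minimal sketch. It assumes a normal approximation to demand and a target service level, neither of which is specified in this article; the function name is hypothetical.

```python
from statistics import NormalDist

def stock_level(forecast_mean, forecast_variance, service_level=0.95):
    """Stock level = failure forecast + safety stock.

    Assumption: safety stock = z * standard deviation, where z is the
    standard normal quantile for the chosen service level.
    """
    z = NormalDist().inv_cdf(service_level)
    return forecast_mean + z * forecast_variance ** 0.5

# Same mean demand, two hypothetical forecast variances
print(stock_level(100.0, 25.0))
print(stock_level(100.0, 9.0))
```

Under this assumption, reducing the forecast variance lowers the stock level even when the mean forecast does not change.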
The variance of an actuarial forecast is
∑VAR[a(t-s)]n(s)^2 + 2∑COVAR[a(t-s), a(s)]n(s)n(t-s).
Derivatives of the actuarial forecast variance, ∂VAR[∑a(t-s)n(s)]/∂a(t-s), depend on the variances of the a(t-s) estimates and their covariances. I use the bootstrap or the Cramer-Rao bound on the variance-covariance matrix of maximum likelihood estimates.
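Given a variance-covariance matrix of the actuarial rate estimates, the forecast variance is a quadratic form in the installed base. The sketch below assumes cov[i][j] = COVAR[a(i), a(j)] is already available (for example from a bootstrap) and sums over all pairs of ages, which is the general form of the variance of a weighted sum; the function name and numbers are hypothetical.

```python
def forecast_variance(cov, installed_base, t):
    """VAR[sum over s of a(t-s) n(s)], given cov[i][j] = COVAR[a(i), a(j)].

    The weight on a(k) is n(t-k). Diagonal terms give VAR[a(k)] * n(t-k)^2;
    each off-diagonal pair appears twice, which supplies the factor of 2.
    """
    weights = [installed_base[t - k] for k in range(t + 1)]
    return sum(
        weights[i] * cov[i][j] * weights[j]
        for i in range(t + 1)
        for j in range(t + 1)
    )

# Hypothetical 2x2 variance-covariance matrix and installed base
cov = [[0.0001, 0.0001],
       [0.0001, 0.0004]]
n = [100, 200]
print(forecast_variance(cov, n, t=1))
```

Setting the off-diagonal entries to zero shows how much of the forecast variance comes from covariances between rate estimates.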
Future failures create demands for replacements, repairs, spares, or other services. The PH Model includes possibly controllable factors, “concomitant variables”, Z(j), in the actuarial rate function, a(t; Z)=ao(t)Exp[βZ], where βZ=∑β(j)Z(j). A PH actuarial demand forecast is an estimate of the mean number of failures, ∑a(t-s; Z)n(s).
Use ∑d(t-s; Z)n(s) for recurrent processes with actuarial demand rates d(t-s; Z).
Allocate some of the budget $B to the common failure function ao(t) according to the allocation in the previous section. How would you allocate money to reducing the forecast and its variance as a function of the concomitant variables Z? Find the values of concomitant variables that minimize the forecast mean AND its variance given budget $B.
The variance of demand for a PH model is
∑VAR[ao(t-s)Exp[βZ]]n(s)^2 + 2∑COVAR[ao(t-s)Exp[βZ], ao(s)Exp[βZ]]n(s)n(t-s).
If actuarial rate estimates of ao(t) and of β are independent, factor Exp[βZ] out of the variance and covariance terms. That is, the variance is (Exp[βZ])^2 times
{∑VAR[ao(t-s)]n(s)^2 + 2∑COVAR[ao(t-s), ao(s)]n(s)n(t-s)}.
This is because Var[cX] = c^2 Var[X] for a constant c. Since the derivative of (Exp[βZ])^2 with respect to Z(j) is 2β(j)(Exp[βZ])^2, the derivative of demand variance with respect to Z(j) is 2β(j)(Exp[βZ])^2 times the variance-covariance expression in braces.
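The scaling of the variance by the PH factor, and its derivative with respect to Z(j), can be checked numerically. This sketch treats β as known constants, so Exp[βZ] acts as the constant c in Var[cX]; differentiating (Exp[βZ])^2 = Exp[2βZ] with respect to Z(j) gives 2β(j)Exp[2βZ]. All names and numbers are hypothetical.

```python
import math

def ph_variance(base_variance, beta, z):
    """(Exp[beta.Z])^2 times the baseline variance-covariance sum."""
    return math.exp(2.0 * sum(b * zj for b, zj in zip(beta, z))) * base_variance

def d_ph_variance_dz(base_variance, beta, z, j):
    """Analytic derivative with respect to Z(j): 2 * beta(j) * (Exp[beta.Z])^2 * baseline."""
    return 2.0 * beta[j] * ph_variance(base_variance, beta, z)

# Hypothetical coefficients, concomitant variables, and baseline variance
beta, z, base = [0.3, -0.2], [1.0, 2.0], 12.0
h = 1e-6
numeric = (ph_variance(base, beta, [z[0] + h, z[1]])
           - ph_variance(base, beta, [z[0] - h, z[1]])) / (2 * h)
print(numeric, d_ph_variance_dz(base, beta, z, 0))
```

The central finite difference and the analytic derivative should agree closely; a mismatch would point to a mistake in the factor pulled out of the variance.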
You have to estimate the cost rates for changing the presumably controllable variables Z(j), expressed as ∂Z(j)/∂money. (I once asked a teacher whom to ask for those cost rates. He replied, “Ask your accounting department.” FDL RITA (Falling Down Laughing, Rolling In the Aisles). Accounting departments are not accustomed to those kinds of questions.)
Use your cost estimates and the rates of changes of the demand forecast and variance to optimize allocation of money to variables Z(j) subject to budget constraint $B. Adjust Z(j) to equate (∂∑a(t-s; Z)n(s)/∂Z(j))*(∂Z(j)/∂money), the marginal rates of return, AND
(∂VAR[∑a(t-s; Z)n(s)]/∂Z(j))*(∂Z(j)/∂money), the marginal bangs-per-bucks of the forecast and its variance, so that total expenditure is $B [Marginal rate of substitution – Wikipedia].
The actuarial forecast ∑a(t‑s)n(s) is a regression model where the actuarial rates a(t-s) are regression coefficients. So the variance of the actuarial forecast is also the standard error of the regression model induced by actuarial rate estimate variance. That variance could be approximately extrapolated from ships and returns (or failures) counts as the standard error between observed returns and hindcasts.
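The comparison of observed returns with hindcasts can be sketched as follows. This is a rough illustration, assuming periodic ships and returns counts of equal length and a root-mean-square error with no degrees-of-freedom correction; the function names and the data are hypothetical.

```python
import math

def hindcasts(rates, ships):
    """Hindcast for each period t: sum over s = 0..t of a(t-s) * n(s)."""
    return [
        sum(rates[t - s] * ships[s] for s in range(t + 1))
        for t in range(len(ships))
    ]

def hindcast_standard_error(rates, ships, returns):
    """Root-mean-square difference between observed returns and hindcasts."""
    fitted = hindcasts(rates, ships)
    squared = [(obs - fit) ** 2 for obs, fit in zip(returns, fitted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical actuarial rates, ships, and observed returns
a = [0.10, 0.05]
ships = [100, 200]
observed = [12, 25]
print(hindcasts(a, ships), hindcast_standard_error(a, ships, observed))
```

The list of rates must be at least as long as the list of ships, because the hindcast for period t uses a(0) through a(t).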
Derivatives with respect to Z(j) are:
Mean: ∂∑a(t-s; Z)n(s)/∂Z(j) = [∑ao(t-s)Exp[βZ]n(s)]β(j), as long as the concomitant variables Z(j) don’t depend on time, age, or ao(t) estimates. I.e., the derivatives of the mean (forecast) are the forecast times the concomitant variables’ coefficients. The relative rates of change of the mean with respect to the concomitant variables Z(j) are their coefficients β(j). So the marginal rate of return for changing Z(j) is proportional to β(j) times the bang-per-buck for changing Z(j), ∂Z(j)/∂money.
Variance: ∂(∑VAR[a(t-s; Z)]n(s)^2 + 2∑COVAR[a(t-s; Z), a(s; Z)]n(s)n(t-s))/∂Z(j)
depends on the VAR[] and COVAR[] terms, because in Exp[βZ], the β-vector is an estimate. The variance of the actuarial rate estimate a(t-s; Z) = ao(t-s)Exp[βZ] includes variability in ao(t-s), in β, AND their COVAR[ao(t-s), β(j)]! I use the Cramer-Rao bound on the variance-covariance if I have grouped life data and the actuarial rates from the Kaplan-Meier reliability estimator. Warning: do not use the asymptotic Greenwood variance estimator of the Kaplan-Meier reliability estimator for finite samples! [George 2023] I use the Cramer-Rao variance-covariance bound or bootstrap if I have ships and returns counts.
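The derivative of the mean can be verified numerically. The sketch below covers only the mean and treats β as known constants; it does not attempt the variance derivative, which needs the variance-covariance of the ao and β estimates. Names and numbers are hypothetical.

```python
import math

def ph_forecast(base_rates, installed_base, beta, z, t):
    """Sum over s = 0..t of ao(t-s) * Exp[beta.Z] * n(s)."""
    scale = math.exp(sum(b * zj for b, zj in zip(beta, z)))
    return scale * sum(base_rates[t - s] * installed_base[s] for s in range(t + 1))

def d_ph_forecast_dz(base_rates, installed_base, beta, z, t, j):
    """Analytic derivative with respect to Z(j): forecast times beta(j)."""
    return beta[j] * ph_forecast(base_rates, installed_base, beta, z, t)

# Hypothetical baseline rates, installed base, coefficients, and Z values
ao = [0.05, 0.02, 0.01]
n = [100, 120, 150]
beta, z = [0.4, -0.1], [0.5, 1.0]
h = 1e-6
numeric = (ph_forecast(ao, n, beta, [z[0], z[1] + h], 2)
           - ph_forecast(ao, n, beta, [z[0], z[1] - h], 2)) / (2 * h)
print(numeric, d_ph_forecast_dz(ao, n, beta, z, 2, 1))
```

A negative β(j) means increasing Z(j) reduces the forecast, so money spent raising that variable buys fewer expected failures.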
Recommendations?
If you still want estimates of Weibull reliability parameters from periodic ships and returns counts, send data to pstlarry@yahoo.com, and I will send back the maximum likelihood or least squares parameter estimates. Please specify whether returns are dead-forever (max. likelihood) or recurrent event count (least squares).
This article proposes to allocate money to minimize expected future demands or returns AND minimize the variance of demands or returns. You’ll have to estimate, in money, the marginal effects of changing controllable variables on future returns and on their uncertainty. I did redundancy allocation for parallel systems with budget constraint [George 2004], but not with estimates of reliability or failure rate functions.
I have not done all the work proposed in this article, because I don’t have data or a need, but I am willing to help you allocate resources optimally to minimize an objective and its variance.
References
David W. Coit and Alice E. Smith, “Reliability Optimization of Series-Parallel Systems Using a Genetic Algorithm”, IEEE Trans. Reliab., Vol. 45, No. 2, 1996
Gerd Infanger, editor, “Stochastic Programming: The State of the Art, In Honor of George B. Dantzig”, https://web.stanford.edu/~bvr/pubs/InfangerDantzigStochasticprogramming.pdf/, Springer Science+Business Media, 2009
Jardine, A. K. S., Anderson, P. M. and Mann, D. S., “Application of the Weibull proportional hazards model to aircraft and marine engine failure data,” Qual. Reliab. Engng. Int. Vol. 3, pp. 77–82, doi:10.1002/qre.4680030204, 1987
E. Morice, “Quelques Problemes d’Estimation Relatifs a la Loi de Weibull”, Revue Statistique Appliquée, Vol. 16, no. 4, pp. 43-63, Quelques problèmes d’estimation relatifs à la loi de Weibull (numdam.org), 1968
References by George
“The Bathtub Curve Doesn’t Always Hold Water,” ASQC Reliability Review, Vol. 15, No. 3, pp. 5-7, Sept. 1994
“Redundancy Allocation for Reliability with a Cost Constraint,” unpublished, 2004, available from author along with an Excel workbook
“Credible Reliability Prediction” 2nd edition, Field Reliability – Credible Reliability Prediction (google.com), March 2020
“User Manual for Credible Reliability Prediction,” Field Reliability – User Manual for Credible Reliability Prediction (google.com), Dec. 2019
“Estimate Renewal Process Reliability without Renewal Counts,” ASQ Tech Briefs, Vol. 2, Nov. 2008
“Variance of the Kaplan-Meier Estimator?” Weekly Update, Variance of the Kaplan-Meier Estimator? – Accendo Reliability, 2023
Jamie Buck says
Very interesting article, lots to digest here, thanks Larry.
From your experience I’d like to ask how often you see such methods applied within industry? More often than not the enabler to implementing such a method is through the use of commercial reliability software packages (due to the complexity of both understanding and coding the underlying theory) and whilst some packages include optimisation methods for spares holdings and the like, I have personally not come across anything as comprehensive as this (however I may not have come across these tools yet).
Thanks, Jamie
Larry George says
Thanks. I fixed the article so formulas read better. The symbols [three dollar signs] rendered equations badly so I changed [three dollar signs] to “money”.
Furthermore, I think I will try to program the simultaneous allocation of money to MTBF and variance of some measure such as forecast of returns or other consequence of reliability.
Best wishes for reliable products,
Larry George