Why Use Nonparametric Reliability Statistics?

Fred asked me to explain why anyone should use nonparametric statistics. The answer is reality. Reality trumps opinion, mathematical convenience, and tradition. Reality is more interesting, but quantifying reality takes work, especially if you track lifetimes. Using real field reliability data provides credibility and could reduce uncertainty due to tradition and unwarranted, unverified assumptions.

Data is inherently nonparametric. Cardinal numbers are used for period counts: cohorts, cases, failures, etc. Accounting data is numerical; it is derived from data or from dollars required by GAAP (Generally Accepted Accounting Principles); e.g., revenue = price*(products sold), service cost = (Cost per service)*(Number of services), or numbers of spare parts sold. Why not do nonparametric reliability estimation, with or without lifetime data?

What is Nonparametric Estimation?

Nonparametric estimation makes no distribution assumptions. Nonparametric distribution estimates include: empirical (all data) and histograms (binned data). Smoothing methods are also nonparametric; e.g., smoothed kernel and kernel mixtures.

Nonparametric estimation using lifetime data has a long history:

Life tables for retiring Roman Legionnaires (Ulpianus, murdered AD 228)
Life insurance and actuarial statistics (Edmund Halley, 1693). Railroads used actuarial rates in the 1920s. The Air Force Air Materiel Command uses “actuarial methods” developed by RAND Corporation 50 years ago for engines and expensive subassemblies.
Reliability software uses installed base, grouped ages at failure, and right-censored survivors’ ages [Kaplan-Meier]
Parts’ reliability estimates from computer ships and service parts’ returns counts [Apple, 1990]

Walter Shewhart’s rule #1: “Original data should be presented in a way that will preserve the evidence in the original data for all the predictions assumed to be useful.” Field reliability data is discrete, perhaps periodic. Nonparametric estimation preserves all information in data (unless binned or smoothed) [Kalaiselvan and Rao, Hollander and Peña, Gámiz et al.], whether lifetime data or some other kind of data containing reliability information.
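As a minimal sketch of what nonparametric estimation from right-censored ages looks like, the function below computes age-specific (actuarial) failure rates and the product-limit reliability in the Kaplan-Meier form. The function name and the six-unit data set are hypothetical, chosen only for illustration; ages are assumed to be whole periods.

```python
def actuarial_rates(ages, failed):
    """Age-specific failure rates and reliability from right-censored ages.

    ages[i]   -- whole-period age at failure or at end of observation
    failed[i] -- True if unit i failed at that age, False if still surviving
    Returns one row per age: (age, at_risk, failures, rate, reliability).
    """
    rows = []
    reliability = 1.0
    for age in range(1, max(ages) + 1):
        # Units censored at this age still count as at risk during it.
        at_risk = sum(1 for a in ages if a >= age)
        failures = sum(1 for a, f in zip(ages, failed) if a == age and f)
        rate = failures / at_risk
        reliability *= 1.0 - rate  # product-limit (Kaplan-Meier) form
        rows.append((age, at_risk, failures, rate, reliability))
    return rows


# Hypothetical data: six units, ages in months.
ages = [1, 2, 2, 3, 3, 3]
failed = [True, False, True, True, False, False]
rows = actuarial_rates(ages, failed)
for age, at_risk, failures, rate, reliability in rows:
    print(age, at_risk, failures, round(rate, 4), round(reliability, 4))
```

No distribution is assumed anywhere: each age gets its own rate, so a spike or dip at one age stays visible instead of being averaged into a fitted curve.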

We Live and Work in Calendar Time

There are many reasons for age-variable failure rate. Accounting records are in calendar time; inventory events occur in calendar time; inspection and maintenance actions are often in calendar time. Constant failure rate per hour usually translates into non-constant failure rate in calendar time.
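A small worked example may help show why a constant failure rate per operating hour need not be constant in calendar time. The sketch below assumes a population split into usage groups, each with the same constant failure rate per operating hour but different hours of use per week; the group fractions, usage hours, and rate are hypothetical numbers.

```python
import math


def calendar_hazard(week, rate_per_hour, usage_groups):
    """Population failure rate per calendar week at a given age in weeks.

    usage_groups -- list of (fraction_of_population, operating_hours_per_week)
    A unit used u hours/week has constant weekly rate u * rate_per_hour,
    so the population lifetime is a mixture of exponentials.
    """
    density = 0.0
    surviving = 0.0
    for fraction, hours_per_week in usage_groups:
        weekly_rate = hours_per_week * rate_per_hour
        alive = fraction * math.exp(-weekly_rate * week)
        surviving += alive
        density += weekly_rate * alive
    return density / surviving


# Hypothetical: half the units run 5 hours/week, half run 80 hours/week.
groups = [(0.5, 5.0), (0.5, 80.0)]
rate = 0.0005  # failures per operating hour, identical for every unit
hazards = [calendar_hazard(w, rate, groups) for w in (0, 26, 52, 104)]
print([round(h, 5) for h in hazards])
```

The calendar-time rate falls with age because heavily used units fail first and leave the lightly used ones behind, even though no individual unit has a decreasing failure rate.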

Does your product or part have…?

Warranty? Warranty Expiration Anticipation Phenomenon? Are warranty replacements new, hysterical, or service warranty? [George WEAP]
Constant failure rate in operating hours? (usually translates into non-constant failure rate in calendar time)
DoA? Zero-inflation (initial failures)? [Bhattacharya et al., Labrecque-Synnott and Angers]
Sell-through time? Mixed with infant mortality?
Return time? Delayed Reporting? (A customer told me, “We batch (sterile glove) failures and return them when we feel like it.”)
Lost (left-censored) data? (“Belgian AF bought used C130Hs…”)
Preventive maintenance? Detects prior failures? Causes more failures?
Seasonality? Attrition? Retirement? [George and Langfeldt]
End-of-life support plan? How to get old customers to buy new products?
Cohort variability? (Simpson’s paradox)

If so, the failure rate function has glitches visible only if you estimate nonparametric failure rate functions. Real failure rate functions aren’t constant, continuous, and smooth; those bullet items induce variations in age-specific failure rate functions and make reliability interesting. Wouldn’t you like to quantify these variations, evaluate alternatives, and verify fixes? With population data?

Despite what people say (and standards such as ISO 14224), lifetime data is NOT required to estimate nonparametric reliability and survival functions! GAAP requires statistically sufficient data to make nonparametric reliability estimates, without lifetime data. Revenue implies product sales (“ships”), service costs imply returns, part failures imply spares sales, and gozinto theory implies parts’ installed base by age. Ships and returns counts are statistically sufficient, and they’re population, not sample, statistics. Brandolini’s law says, “The amount of energy required to refute BS is an order of magnitude bigger than to produce it” [Collins].

Maximum likelihood and least squares nonparametric estimators are used for dead-forever failures, renewals (good-as-new), relevation (good-as-old), hysterical (hysteresis modifies successive times-between-failures), and non-id renewals, even with missing or “masked” data. Nonparametric estimators are usually asymptotically unbiased and minimum variance, with information (inverse entropy) comparable with Kaplan-Meier estimators with equivalent censoring.
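To make the ships-and-returns idea concrete, here is a deliberately simplified sketch. It is not the maximum likelihood or least squares estimator referred to above; it is a method-of-moments shortcut that assumes dead-forever failures (no renewals), positive ships in the first period, and returns that equal their expected values. Function names and numbers are hypothetical.

```python
def expected_returns(ships, failure_pmf):
    """Expected returns per period when failed units are not replaced.

    ships[s]       -- units shipped in period s
    failure_pmf[j] -- probability a unit fails at age j periods
    """
    return [
        sum(ships[s] * failure_pmf[t - s]
            for s in range(t + 1) if t - s < len(failure_pmf))
        for t in range(len(ships))
    ]


def pmf_from_ships_and_returns(ships, returns):
    """Solve returns[t] = sum_s ships[s] * p[t - s] for p, one age at a time."""
    if ships[0] <= 0:
        raise ValueError("first-period ships must be positive")
    pmf = []
    for t in range(len(returns)):
        # Returns already explained by later cohorts failing at younger ages.
        explained = sum(ships[s] * pmf[t - s] for s in range(1, t + 1))
        estimate = (returns[t] - explained) / ships[0]
        pmf.append(min(max(estimate, 0.0), 1.0))  # keep a valid probability
    return pmf


# Hypothetical monthly ships and a hypothetical failure distribution.
ships = [100, 120, 150, 130]
true_pmf = [0.01, 0.02, 0.03, 0.05]
returns = expected_returns(ships, true_pmf)
estimated_pmf = pmf_from_ships_and_returns(ships, returns)
print([round(r, 2) for r in returns])
print([round(p, 4) for p in estimated_pmf])
```

With real counts the returns are noisy, so this one-age-at-a-time solution can wander and needs the clipping shown; that is one reason to prefer likelihood or least squares estimators over this shortcut.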

Here are some examples of reality: coping with phenomena that would invalidate parametric reliability functions.

Read this and WEAP?

Warranty Expiration Anticipation Phenomenon (WEAP) causes a spike in returns counts around the age at which warranty expires. A subsidiary of Amazon had to fudge warranty returns forecasts because Weibull didn’t account for early returns.

Figure 1. Weekly actuarial return rate estimates. Warranty is 52 weeks.

DoAs?

Does your Weibull software account for a mixture of failures on the first cycle and subsequent Weibull failures? Figure 2 shows Weibull probability paper that should fit a mixture distribution F(t) = α*F(0)+(1-α)*(1-Exp[-(t/η)^β]), where α is the fraction that fail on the first cycle.

Figure 2. Weibull probability paper shows failure on first cycle.
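The DoA mixture can be written as a short function. This sketch assumes the first-cycle failures are a point mass at age zero, so the α*F(0) term contributes α for every t ≥ 0; the function name and the parameter values are hypothetical.

```python
import math


def doa_weibull_cdf(t, alpha, eta, beta):
    """P[failure by age t]: fraction alpha fails at age 0, the rest are Weibull.

    alpha -- fraction dead on arrival (point mass at t = 0, an assumption)
    eta   -- Weibull scale parameter
    beta  -- Weibull shape parameter
    """
    if t < 0:
        return 0.0
    weibull = 1.0 - math.exp(-((t / eta) ** beta))
    return alpha + (1.0 - alpha) * weibull


# Hypothetical parameters: 5% DoA, eta = 1000 cycles, beta = 1.5.
for cycles in (0, 100, 1000):
    print(cycles, round(doa_weibull_cdf(cycles, 0.05, 1000.0, 1.5), 4))
```

Because F(0) = α rather than 0, a plain two-parameter Weibull fit cannot pass through the first-cycle failures, which is what the curvature on the probability paper shows.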

Belgian AF Bought Used C130Hs

Mechanic: “The flow-control shutoff valve on the cockpit air conditioner just failed.”

Reliability Engineer: “How many hours were on it?”

Mechanic: “Beats me. This isn’t a safety critical part. We started tracking these air-conditioners in 1998, and this is the first time this valve’s failed since then.”

Reliability Engineer: “How many hours are on the air-conditioner since then?”

Mechanic: “623.6 hours”

Reliability Engineer: “How many flying hours were on the plane when you started tracking their air-conditioners?”

Mechanic: “8082.5 hours”

Reliability Engineer: “How many valves failed before then?”

Mechanic: “Beats me. Could have been some. This valve is the most frequently failing part in the cockpit air conditioner, and its failure causes mission abort. How often should we replace them?”

Prior failure records were unknown due to change of ownership (leased computers, Belgian C130H aircraft). There were seven aircraft in the data. The data for each aircraft consisted of: aircraft flying hours (start time), first valve failure hours (seven records, after start time), uncensored failure hours of subsequent failures, and right-censored survivor hours (seven records). Clearly there were valve failures and replacements before the aircraft were purchased, in addition to 7 left-censored and 20 more uncensored times-between-failures.

Figure 3 shows that ignoring renewals underestimates reliability. The Kaplan-Meier estimator used only the 20 uncensored failure times and the 7 right-censored survival times. Assuming a renewal process allows using the 7 left-censored failure times too.

Figure 3. Reliability function estimates of C130H A/C valve.

Semi-Nonparametric Reliability?

Cox’ proportional hazards (PH) model is called “semi-parametric”. Survival analysts use Cox’ PH models to test effects of factors, the Z-vector. The nonparametric hazard rate function is λ(t)dt=P[t<Life<t+dt]/P[Life>t], where λ(t) is any nonnegative function, aka the actuarial rate function. The Cox proportional hazards (PH) rate function is λ(t,Z) = λo(t) exp(Z*β), where λo(t) is the “base” hazard rate function and exp(Z*β) is the proportionality factor. GMDH generalizes Cox’ exponential-linear failure rate function from λo(t)*Exp[Σβ*Z] to λo(t)*I(Z) or λo(t)*Exp(I(Z)), where I(Z) is the Ivakhnenko polynomial [www.GMDH.net].
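As an illustration of the proportional hazards form λ(t,Z) = λo(t) exp(Z*β), the sketch below scales a base hazard rate by the proportionality factor and converts the result to reliability. It assumes the base hazard is constant within each period, so reliability is exp(-cumulative hazard); the base rates, covariate, and coefficient are hypothetical.

```python
import math


def ph_reliability(base_rates, covariates, coefficients):
    """Reliability at the end of each period under proportional hazards.

    base_rates[t] -- base hazard rate in period t (piecewise constant)
    covariates    -- Z vector for the unit or cohort of interest
    coefficients  -- beta vector, same length as Z
    """
    factor = math.exp(sum(z * b for z, b in zip(covariates, coefficients)))
    cumulative_hazard = 0.0
    curve = []
    for rate in base_rates:
        cumulative_hazard += rate * factor
        curve.append(math.exp(-cumulative_hazard))
    return curve


# Hypothetical nonparametric base rates and one factor with beta = 0.7.
base_rates = [0.01, 0.02, 0.04]
baseline = ph_reliability(base_rates, [0.0], [0.7])
stressed = ph_reliability(base_rates, [1.0], [0.7])
print([round(r, 4) for r in baseline])
print([round(r, 4) for r in stressed])
```

The base rates can be any nonnegative numbers, which is the nonparametric part; only the effect of Z is parametric. Under this model the stressed curve is the baseline curve raised to the power exp(Z*β).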

Semi-nonparametric models include mixtures of nonparametric and other distributions to represent initial failures, sell-through time, or return time. Return time delays and spreads out warranty returns. Sell-through time delays and spreads out infant mortality and biases reliability estimates. Figure 4 shows iPod reliability estimates before (from X+Y) and after deconvolution of sell-through time X from time-to-failure Y; i.e., estimate the distribution of Y from data on X+Y, where X is sell-through time and Y is time-to-failure.

Figure 4. iPod reliability estimates with (before) and without (after) sell-through time
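The bias from sell-through time can be seen in the forward direction, before any deconvolution. This sketch convolves a hypothetical sell-through distribution X with a hypothetical time-to-failure distribution Y that has heavy infant mortality, and compares survival measured from ship date (X+Y) with survival measured from sale date (Y). All numbers are made up for illustration.

```python
def convolve(pmf_x, pmf_y):
    """Distribution of X + Y for independent discrete X and Y."""
    out = [0.0] * (len(pmf_x) + len(pmf_y) - 1)
    for i, px in enumerate(pmf_x):
        for j, py in enumerate(pmf_y):
            out[i + j] += px * py
    return out


def survival(pmf):
    """P[life > t] for t = 0, 1, ... given failure probabilities by age."""
    remaining = 1.0
    curve = []
    for p in pmf:
        remaining -= p
        curve.append(remaining)
    return curve


# Hypothetical: months on the shelf, and months from sale to failure.
sell_through = [0.2, 0.5, 0.3]                # P[X = 0, 1, 2]
time_to_failure = [0.10, 0.03, 0.02, 0.02]    # P[Y = 0..3]; the rest survive
observed = convolve(sell_through, time_to_failure)
print([round(p, 3) for p in observed])
print(round(survival(time_to_failure)[0], 3), round(survival(observed)[0], 3))
```

The infant-mortality spike at age zero is smeared across several months of age-since-ship, so reliability estimated from ship dates looks better at early ages than it really is. Deconvolution runs this calculation in reverse to recover the distribution of Y.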

Why Not Do Nonparametric Reliability?

Management: What management wants differs from what really happens.

Tradition: People won’t change if change admits that what they’ve been doing has been wrong.

Software: ReliaSoft says, “.. confidence bounds associated with non-parametric analysis are usually much wider than those calculated via parametric analysis.” Reliability confidence bands based on Greenwood variance (from Kaplan-Meier nonparametric reliability) are invalid for more than one age, and Greenwood’s variance errs for finite samples.

Profit: Reliability data vendors in oil and gas industry wrote ISO 14224 to require lifetime data collection for estimating MTBF and Weibull parameters. ISO 14224 says you need lifetime data to do reliability analyses. Auditing requires conformance to standards. CMMS software is easier to program so data vendors and consultants profit.

Work: Nonparametric reliability estimators can be awkward; they stop at age of oldest failure. Suppose you want MTBF? Reliability at ages older than oldest failure? Extrapolate failure rate function with regression and standard error bands. If you want reliability from 1.0 to 0.0, you have to extrapolate and fit some distribution function.

Test: Converting nonparametric accelerated life test (ALT) results requires an acceleration model [Kalaiselvan and Rao]. Cox’ proportional hazards model requires accelerated failure rate function(s) to be proportional to the unaccelerated failure rate function. Would you trust ALT if not proportional hazards? HALT distorts the failure rate functions.
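The extrapolation mentioned under Work can be sketched as follows. This is one possible approach under stated assumptions: a straight line is fitted by ordinary least squares to the last few age-specific failure rates and extended to older ages, with rates clipped to stay between 0 and 1. It omits the standard error bands, and the function names, the choice of a straight line, and the rates are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def extrapolate_reliability(rates, extra_ages, fit_last=4):
    """Extend failure rates past the oldest observed age and rebuild reliability.

    rates[t - 1] -- estimated failure rate at age t = 1, 2, ...
    fit_last     -- number of oldest observed ages used for the fit (at least 2)
    """
    start = len(rates) - fit_last
    ages = list(range(start + 1, len(rates) + 1))
    intercept, slope = fit_line(ages, rates[start:])
    extended = list(rates)
    for age in range(len(rates) + 1, len(rates) + extra_ages + 1):
        extended.append(min(max(intercept + slope * age, 0.0), 1.0))
    reliability, curve = 1.0, []
    for rate in extended:
        reliability *= 1.0 - rate
        curve.append(reliability)
    return extended, curve


# Hypothetical rates for ages 1..6, extended to ages 7 and 8.
rates = [0.05, 0.02, 0.02, 0.03, 0.04, 0.05]
extended, curve = extrapolate_reliability(rates, extra_ages=2)
print([round(r, 3) for r in extended])
print([round(r, 4) for r in curve])
```

The observed ages keep their nonparametric rates; only the ages beyond the oldest failure depend on the fitted line, so the assumption is confined to where there is no data.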

Recommendations?

Use parametric reliability models when physics justifies them: crack-growth, fatigue, links of a chain,…. Check parametric model vs. nonparametric. The Kullback-Leibler divergence between parametric and nonparametric reliability estimates measures relative information (bits) in nonparametric estimator. Is the cost of lifetime data worth the information gained vs. ships and returns counts? GAAP data is free.
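A check of a parametric model against a nonparametric estimate with the Kullback-Leibler divergence might look like the sketch below. It assumes both estimates are expressed as probabilities over the same age bins, including a final bin for units that survive past the last age, and it takes the nonparametric estimate as the reference distribution p; that direction, the function name, and the numbers are assumptions for illustration.

```python
import math


def kl_divergence_bits(p, q):
    """D(p || q) = sum of p_i * log2(p_i / q_i), in bits."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                return math.inf  # q rules out something p says can happen
            total += pi * math.log2(pi / qi)
    return total


# Hypothetical failure probabilities by age bin; last bin is "survived".
nonparametric = [0.05, 0.01, 0.02, 0.06, 0.86]
parametric = [0.02, 0.03, 0.04, 0.05, 0.86]
print(round(kl_divergence_bits(nonparametric, parametric), 4))
```

A divergence near zero says the parametric fit loses little; a large value says the fitted distribution is smoothing away features, such as an initial-failure spike, that the data actually show.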

Use field data, lifetimes, censored or not if you have them, or use ships and returns counts required by GAAP if you don’t have lifetime data. Send field reliability data and describe it if you want help with nonparametric reliability estimation.

Is field reliability changing? Use Statistical Reliability Control (K-L divergence), not RGA (“Reliability Growth Analysis” à la AMSAA and Larry Crow is really MTBF growth) [George, 2023].

Have you considered multivariate, nonparametric reliability estimation? Customers who replaced part A also needed part B. There’s an R package for bivariate reliability; I used survivalBIV on Tesla Model S battery, charger, and drive units.

References

Archan Bhattacharya, Bertrand S. Clarke, and Gauri S. Datta, “A Bayesian Test for Excess Zeros in a Zero-Inflated Power Series Distribution,” Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen, Vol. 1, pp. 89–104, 2008

Loren Collins, Bullspotting, Finding Facts in the Age of Misinformation, Prometheus Books, Amherst, New York, 2012

M. Luz Gámiz, K. B. Kulasekera, Nikolaos Limnios, and Bo Henry Lindqvist, Applied Nonparametric Statistics in Reliability, Springer-Verlag, London, 2011

L. L. George, “Estimate Reliability Functions without Life Data,” ASQ Reliability Review, Vol. 13, No. 1, pp. 21–26, 1993

L. L. George and Eva Langfeldt, “The Bathtub Curve Doesn’t Always Hold Water,” ASQ Reliability Review, Vol. 14, No. 3, Sept. 1994

L. L. George, “Défaillances Avant Enregistrements, Données Censurées à Gauche, et Fiabilité: que se Passe-t-Il si les Défaillances se Produisent Avant D’être Enregistrées?” (Failures Before Returns, Left-Censored Data, and Reliability: What If Failures Occur Before Reported?) Phoebus Journal of Reliability, Sept. 2005

L. L. George, “Statistical Reliability Control?” www.accendoreliability.com, 2023

Harris, Carl M. and Edward Rattner, “Estimating and Projecting Regional HIV/AIDS Cases and Costs, 1990-2000: A Case Study,” Interfaces, Vol. 27, No. 5, pp. 38-53, https://doi.org/10.1287/inte.27.5.38, 1997

Myles Hollander and Edsel Peña, “Nonparametric Methods in Reliability,” Stat. Sci., Vol. 19, No. 4, pp. 644–651, doi:10.1214/088342304000000521, Nov. 2004

ISO 14224, ISO 14224(2006)E, “Petroleum, Petrochemical and Natural Gas Industries — Collection and Exchange of Reliability and Maintenance Data for Equipment,” the International Organization for Standardization, 2006, 2016

William S. Jewell, “A Survey of Credibility Theory,” UC Berkeley OR Center Research Report 78-31, October 1976

C. Kalaiselvan and L. Bhaskara Rao, “Comparison of Reliability Techniques of Parametric and Non-Parametric Method,” Engineering Science and Technology, an International Journal, Volume 19, Issue 2, pp. 691-699, 2016

Kaplan, E. L. and Meier, P., “Nonparametric Estimation from Incomplete Observations,” J. Amer. Statist. Assoc., Vol. 53, No. 282, pp. 457–481, doi:10.2307/2281868, 1958

Félix Labrecque-Synnott and Jean-François Angers, “Bayesian Estimation and Testing for Continuous Zero-Modified Models,” March 2010

Oscarsson, Patric and Örjan Hallberg, “EriView 2000 – A Tool for the Analysis of Field Statistics,” Ericsson Telecom AB, http://sgll.nu/Relpub/Ref13ERI2000c.pdf, 2000

ReliaSoft, “Non-Parametric Life Data Analysis,” https://help.reliasoft.com/reference/life_data_analysis/lda/non-parametric_life_data_analysis.html, 2024

Turnbull, Bruce W., “Nonparametric Estimation of a Survivorship Function with Doubly Censored Data,” J. Amer. Statist. Assoc., Vol. 69, No. 345, pp. 169-174, March 1974