Proportional Hazards Reliability of Hysterical Recurrent Processes?

Generations of products have similar field reliability functions because they are designed, processed, shipped, sold, and used in similar environments by similar customers. Replacement parts have similar reliability functions depending on replacement number: 1st, 2nd,….

Biostatisticians use David Cox’ proportional hazards (PH) survival function models to quantify effects of treatment or risk factors. Proportional hazards models could describe products’ failure modes, parts’ reliabilities in successive replacements, or products’ reliabilities in successive generations.

Cox’ PH failure rate function for a recurrent event process is λ(t;n(t),β)=λo(t)Exp[∑β*j], j=1,2,…,n(t), where n(t) is an event counter, λo(t) is the “base” failure rate function, and β is a regression coefficient representing the changes in failure rate functions of successive events j: renewals, repairs, or replacements. [Wikipedia has a lucid explanation of why a PH model is “Proportional”.] Some people call β the “restoration” factor. In reliability, the regression Exp[∑β*j] could be expanded to a multivariate regression Exp[∑∑β(i)*Z(i,j)] with “concomitant” variables Z(i,j), values that describe successive event counter j for failure mode i. Proportional hazards is a testable hypothesis.
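Here is a minimal Python sketch of that failure rate function, under stated assumptions: ages are discrete months, the multiplier for the interval after j prior events is read as Exp[β*j], and the constant base rate is made up for illustration. The function names are hypothetical, not from any spreadsheet or library.

```python
import math

def ph_multiplier(beta, j):
    """Hazard multiplier after j prior events (assumed reading: exp(beta * j))."""
    return math.exp(beta * j)

def ph_reliability(base_rates, beta, j):
    """R(t; j) = exp(-multiplier * cumulative base hazard) for ages 1..len(base_rates)."""
    mult = ph_multiplier(beta, j)
    cumulative = 0.0
    reliability = []
    for rate in base_rates:
        cumulative += rate  # discrete stand-in for the integral of the base rate
        reliability.append(math.exp(-mult * cumulative))
    return reliability

# Hypothetical constant base failure rate, for illustration only.
base = [0.05] * 12
r_ttff = ph_reliability(base, beta=0.1, j=0)  # time to first failure
r_tbf3 = ph_reliability(base, beta=0.1, j=3)  # gap after three prior events
```

With β > 0 every later gap has a lower reliability curve than TTFF; with β = 0 the curves coincide and the process is an ordinary renewal process.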

Recurrent processes in which the time to first failure (TTFF) and the successive times between failures (TBF1, TBF2,…) have different distributions are called “G-Renewal” Processes [Generalized renewal process – Wikipedia]. Authors have incorporated imperfect repair; TBF after repair could be anything between good-as-new and good-as-old [Ferrari, Kaminskiy and Krivtsov]. That has been extended to model repairs that make the product worse than good-as-old. The objective is to quantify the “restoration” factor and the differences between TTFF and subsequent TBFs. I like to call that a “hysterical” process after “hysteresis”, the dependence of recurrent processes on their history.

Several papers have incorporated repair in Non-Homogeneous Poisson process models of failure or repair times [Arasan and Ehsani]. Others have assumed different distributions for TTFF and TBFs: exponential (NHPP), gamma [Krakowski], and Weibull [Yannaros]. Why didn’t they adopt a proportional hazards model to quantify the deterioration in generalized renewal processes? Fred suggested that [Schenkelberg].

Reliability models for renewable or repairable systems are called “Generalized” Renewal Processes if the Time-To-First-Failure (TTFF) differs in distribution from subsequent Times-Between-Failures (TBF). The term “renewal” means that the age of the system is reset to zero after repair or replacement [aka “relevation” by Krakowski].

With Grouped Lifetime Data?

Given lifetime data by product, including censored survivor times, it’s easy to estimate renewal processes’ distributions of TBFs [Strawderman and Peña]. But what if there is deterioration after repairs or other phenomena that rule out a renewal process? I assumed the Nigerian refinery pumps in table 1 were started on January 1, 2008 to compute TTFF and successive TBFs.

Table 1. Successive failure dates of Nigerian refinery pumps. Residual lifetimes were not reported.

pump 1	pump 2	pump 3	pump 4
8/1/2008	10/1/2008	6/1/2009	6/1/2008
1/1/2009	11/1/2008	6/1/2010	6/1/2013
5/1/2009	12/1/2008	12/1/2010	3/1/2008
7/1/2009	5/1/2009	7/1/2014	5/1/2008
6/1/2010	6/1/2009	8/1/2014
12/1/2011	4/1/2010	10/1/2014
5/1/2012	9/1/2010
6/1/2012	10/1/2010
5/1/2013	5/1/2011
8/1/2013
10/1/2014
3/1/2015
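As a sketch of how TTFF and TBFs follow from Table 1, the Python below counts whole months between successive failure dates for pump 1, using the January 1, 2008 start date assumed in the text. The helper names are hypothetical, and the dates are taken in table order without sorting.

```python
from datetime import date

START = date(2008, 1, 1)  # assumed in-service date, as stated in the text

# Pump 1 failure dates from Table 1.
PUMP_1 = [
    date(2008, 8, 1), date(2009, 1, 1), date(2009, 5, 1), date(2009, 7, 1),
    date(2010, 6, 1), date(2011, 12, 1), date(2012, 5, 1), date(2012, 6, 1),
    date(2013, 5, 1), date(2013, 8, 1), date(2014, 10, 1), date(2015, 3, 1),
]

def months_between(earlier, later):
    """Whole calendar months from earlier to later."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def gaps_in_months(start, failure_dates):
    """First gap is TTFF; the rest are successive TBFs."""
    gaps = []
    previous = start
    for failed in failure_dates:
        gaps.append(months_between(previous, failed))
        previous = failed
    return gaps

pump_1_gaps = gaps_in_months(START, PUMP_1)
ttff, tbfs = pump_1_gaps[0], pump_1_gaps[1:]
```

For pump 1 this gives a TTFF of 7 months followed by 11 TBFs.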

Autocorrelations of TBFs were not negligible, but not informative, so I treated successive TBFs as statistically independent. I computed the nonparametric empirical and Weibull reliability function estimates for TTFF and TBFs. Weibull was not a bad fit (figure 1), if you must have a model. But what if there is deterioration in TBFs?
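One way to check the autocorrelation of successive gaps is sketched below in Python. It uses the usual sample autocorrelation formula, which may differ from the one in the original spreadsheet, and the pump 1 gaps in months implied by Table 1 with the January 1, 2008 start date.

```python
def lag_autocorrelation(values, lag=1):
    """Sample autocorrelation at the given lag (sum of lagged products over total variation)."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    numer = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return numer / denom

# Pump 1 gaps in months (TTFF followed by TBFs), derived from Table 1.
pump_1_gaps = [7, 5, 4, 2, 11, 18, 5, 1, 11, 3, 14, 5]
r1 = lag_autocorrelation(pump_1_gaps, lag=1)
```

With only a dozen gaps per pump, any such estimate is noisy, which is consistent with treating the gaps as independent.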

Figure 1. P-P plot of the pump data with a linear fit. It compares nonparametric empirical (for each pump) vs. (linear) Kaplan-Meier reliability estimates from all four pumps. Linearity indicates Weibull could be a reliability model. Both axes are cumulative distribution function values.

The PH failure rate function λ(t; n(t), β)=λo(t)Exp[∑β*j] is a natural candidate, where j is the renewal counter. Use least squares to minimize the sum of the squared differences between observed and expected failure counts as a function of λo(t) and β. The expected failure counts, d(t) = E[N(t)]-E[N(t-1)] = E[Failures in (t-1,t)], are the actuarial hindcasts (forecasts of already observed events) derived from the differences in the “renewal” function E[N(t)]. E[N(t)] is computed by convolutions of the PH-model distribution functions of TTFF, TBF1, TBF2,…, TBF(11). Figure 2 shows successive reliability function estimates for TTFF, TBF1, TBF2,…
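The convolution step can be sketched in Python as follows. This is an illustration under assumptions, not the spreadsheet's method: time is discrete (months), the hazard of the gap after j prior events is the base hazard times Exp[β*j] capped at 1, and all function names are hypothetical.

```python
import math

def gap_pmf(base_hazard, beta, j):
    """Discrete pmf of the gap after j prior events; index is age, pmf[0] = 0."""
    mult = math.exp(beta * j)
    pmf, survive = [0.0], 1.0
    for h0 in base_hazard:  # base_hazard[t - 1] is the base hazard at age t
        h = min(1.0, h0 * mult)
        pmf.append(survive * h)
        survive *= 1.0 - h
    return pmf

def convolve(a, b):
    """Discrete convolution truncated to the length of a."""
    out = [0.0] * len(a)
    for i, ai in enumerate(a):
        if ai == 0.0:
            continue
        for k in range(len(a) - i):
            out[i + k] += ai * b[k]
    return out

def expected_failures(base_hazard, beta, max_events):
    """d[t] = E[N(t)] - E[N(t-1)] for one unit, for t = 0..len(base_hazard)."""
    horizon = len(base_hazard)
    d = [0.0] * (horizon + 1)
    event_time = gap_pmf(base_hazard, beta, 0)  # distribution of TTFF
    for j in range(max_events):
        for t in range(horizon + 1):
            d[t] += event_time[t]  # probability that event j + 1 occurs at age t
        event_time = convolve(event_time, gap_pmf(base_hazard, beta, j + 1))
    return d

# Hypothetical base hazard; beta is the Table 2 estimate.
d = expected_failures([0.05] * 24, beta=0.0568, max_events=12)
```

A least-squares fit would compare these d(t) values, scaled by the number of units, with the observed monthly failure counts and adjust the base hazard and β.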

Table 2. Part of the spreadsheet used in the least-squares fit of the PH model for 4 refinery pumps. (Obs-exp) = (Observed failures – E[failures]) in each month. The minimization of SSE = ∑(Obs-exp)²/exp was over λo(t) and β for ages 1-24 months. The last column shows that successive TBFs were decreasing, because the pumps’ PH failure rate function increased with the number of failures.

Age t, months	Failures	E[failures]	(Obs-exp)²/exp	λo(t)	β	Exp[βt]
0	0	0.32	0.32	0.0787	0.0568	1.00
1	0	0.34	0.34	0.0000		1.06
2	0	0.37	0.37	0.0001		1.12
3	0	0.40	0.40	0.0001		1.19
4	0	0.43	0.43	0.0001		1.26
5	1	0.46	0.63	0.0000		1.33
6	0	0.50	0.50	0.0000		1.00
Etc.
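The fit criterion in Table 2 can be checked with a few lines of Python. This sketch only recomputes the (Obs-exp)²/exp column from the observed and expected counts shown above; it does not perform the minimization, and the function name is hypothetical.

```python
def weighted_sse(observed, expected):
    """Sum of (obs - exp)^2 / exp over the months supplied."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# First seven rows of Table 2 (ages 0 to 6 months).
observed = [0, 0, 0, 0, 0, 1, 0]
expected = [0.32, 0.34, 0.37, 0.40, 0.43, 0.46, 0.50]

terms = [round((o - e) ** 2 / e, 2) for o, e in zip(observed, expected)]
partial_sse = weighted_sse(observed, expected)
```

The rounded terms reproduce the fourth column of Table 2, including 0.63 for the month with one failure.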

Time series forecasts of failures or spare parts’ demands include seasonal models in a “tournament” of alternative forecasts. Actuarial forecasts, ∑λ(t-s)*n(s); s=1,2,…,t, don’t include seasonality, because actuarial rates aren’t seasonal, unless you include season as a concomitant variable in the PH model λo(t)*Exp[Zβ]. That’s OK for products that stay dead when failed. However, that won’t work for a generalized renewal process after the first renewal. So fit a proportional hazards model to the combined demand rate function d(t; season) = do(t)Exp[β*season]. Nigeria has two seasons, dry and wet. The dry season is from October to April (7 months) and the wet season is from May to September (5 months). The dry season had 43.75% of failures and the wet season had 56.25%. The wet season is 5/12 = 41.67% of the year, and with so few failures that difference is not large, so seasonality may not be a significant factor in failures. May and June had the largest numbers of failures.
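The actuarial forecast sum ∑λ(t-s)*n(s) can be sketched in Python as below. It takes the formula literally, with zero-based periods, rates indexed by age, and n(s) as the units from period s still at risk; the numbers are invented for illustration and the function name is hypothetical.

```python
def actuarial_forecast(rates, cohort_sizes):
    """Expected events per period: sum over s <= t of rates[t - s] * cohort_sizes[s]."""
    forecasts = []
    for t in range(len(cohort_sizes)):
        total = 0.0
        for s in range(t + 1):
            age = t - s
            if age < len(rates):  # no rate estimate beyond the oldest age
                total += rates[age] * cohort_sizes[s]
        forecasts.append(total)
    return forecasts

# Hypothetical age-specific rates and cohort sizes.
rates = [0.1, 0.2]
cohorts = [100, 50]
forecast = actuarial_forecast(rates, cohorts)
```

Nothing in this sum depends on the calendar month, which is why seasonality has to enter through a concomitant variable.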

Table 3. Seasonality of refinery pump failures?

Month	% failures	Season
Jan.	9.38%	Dry
Feb	0.00%	Dry
March	3.13%	Dry
Apr	3.13%	Dry
May	18.75%	Wet
Jun	18.75%	Wet
July	6.25%	Wet
Aug	9.38%	Wet
Sept	3.13%	Wet
Oct	12.50%	Dry
Nov	6.25%	Dry
Dec	9.38%	Dry
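The seasonal shares can be recomputed from the monthly percentages in Table 3. This Python sketch follows the text's October-to-April dry season, so April is counted as dry, which is what reproduces a wet-season share of about 56.25%; the variable names are hypothetical.

```python
# Monthly shares of failures (%) from Table 3.
MONTH_SHARE = {
    "Jan": 9.38, "Feb": 0.00, "Mar": 3.13, "Apr": 3.13,
    "May": 18.75, "Jun": 18.75, "Jul": 6.25, "Aug": 9.38,
    "Sep": 3.13, "Oct": 12.50, "Nov": 6.25, "Dec": 9.38,
}

# Dry season taken as October to April, per the text.
DRY_MONTHS = {"Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr"}

dry_share = sum(v for m, v in MONTH_SHARE.items() if m in DRY_MONTHS)
wet_share = sum(v for m, v in MONTH_SHARE.items() if m not in DRY_MONTHS)

# Share of the calendar year in each season, for comparison (%).
dry_year_share = 100.0 * len(DRY_MONTHS) / 12
wet_year_share = 100.0 - dry_year_share
```

The shares differ from 43.75% and 56.25% only by rounding in the table; the wet season covers about 41.67% of the year.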

Figure 2. Proportional hazards reliability function estimates out to 24 months, with a marked decrease about month 11. The initial TTFF reliability R(t;0) and the successive TBF reliability function estimates decrease, to R(t;11).

Proportional Hazards Recurrent Process Model? CAUTION NEUROHAZARD!

The “renewal function” is m(t) = E[N(t)], where N(t) is the number of events (renewals) that have occurred by time t [https://en.wikipedia.org/wiki/Renewal_theory]. The “reliability function” is the complement of the distribution function of times between renewals (TBFs). The reliability objective is to estimate that reliability function, assuming TBFs are statistically independent and identically distributed with distribution function F(t). The renewal function and the reliability R(t)=1-F(t) are related by m(t) = F(t)+∫m(t-s)dF(s), where the integral is from 0 to age t. To estimate reliability 1-F(t) as a PH model, try mPH(t) = Fo(t;1)+∑∫mPH(t-s;j)dFPH(s;j), where FPH(s;j) = F^*j(s) = the j-th convolution F(s;j) = 1-Exp[-∫λ(u;j)du] = 1-Exp[-∫λo(u)Exp[∑β*j]du], with u integrated from 0 to s, and mPH(t-s;j) is E[N(t-s;j)] for the j-th convolution.
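For the ordinary renewal equation m(t) = F(t)+∫m(t-s)dF(s), a discrete-time Python sketch looks like this. It assumes ages in whole months and a gap distribution given as a probability mass function with no mass at age 0; the function name is hypothetical.

```python
def renewal_function(pmf):
    """Solve m[t] = F[t] + sum_{s=1..t} m[t - s] * pmf[s]; requires pmf[0] == 0."""
    horizon = len(pmf) - 1
    m = [0.0] * (horizon + 1)
    cdf = 0.0
    for t in range(1, horizon + 1):
        cdf += pmf[t]  # F[t], the distribution function at age t
        m[t] = cdf + sum(m[t - s] * pmf[s] for s in range(1, t + 1))
    return m

# Geometric gaps with a constant monthly failure probability p (illustrative).
p = 0.1
geometric = [0.0] + [p * (1.0 - p) ** (t - 1) for t in range(1, 13)]
m = renewal_function(geometric)
```

With a constant failure probability p the renewal function is linear, m(t) = p*t, which is a handy check before moving on to the PH version where each successive gap has its own distribution.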

If that explanation calls for an explanation, ask for the spreadsheet used to estimate the distributions of TTFF and subsequent TBFs for the 1988 Ford V8 460 cid engine, the last Ford engines to have carburetors. The engines seemed to have infant mortality, and Ford dealers didn’t seem to be able to fix it. Dwight Jennings sold his as soon as the warranty expired [George, Sept. 2021]. I got the vehicle sales from industry publications. I sent the reliability function estimates to Ron Salzman at Ford, and he sent back the actual vehicle sales and repair counts in table 4.

It’s Easy Even if You Don’t Have Lifetime Data!

Why not make nonparametric estimates of the distributions of TTFF and TBFs without lifetime data? 91% of people in LinkedIn polls don’t believe it can be done! Lifetime data is sufficient but not necessary! [George, July 2023] Periodic cohort sizes and events (complaints, failures, repairs, replacements, spares sales, etc.) are statistically sufficient! Failure events or renewal counts don’t register the original cohort, and the estimates don’t require tracking each product’s repair or replacement events by product and part names and serial numbers. Ships and returns counts are population data, available from revenue and service costs required by GAAP, with a little work for service parts and BoMs.

Table 4. 1988 Ford V8 460 cid engine data and least squares PH model. Notice that service calls started early and petered out? They continued for five years at lower failure rates.

Age, months	Ships	Failures	λo(t)	β
0	213	18	0.1417	0.1826
1	6439	797	0.5311
2	6951	1291	0.5673
3	5715	1511	0.5525
4	5390	1791	0.4498
5	6336	2282	0.3081
6	6319	2628	0.2109
7	12590	4758	0.1154
8	10314	4604	0.0511
9	10479	5173	0.0103
10	10129	5591	0.0000
11	3150	3414	0.0000
12	2772	4618	0.0000
13	0	Etc.

I adapted Cox’ PH model and estimation to the reliability and warranty service of the 1988 Ford V8 460 cid engines [George, Sept. 2021]. See how effective the first repair was; i.e., compare failure rate functions of TTFF and TBF1 in figure 3. Successive TBFs, although fewer, mostly occurred within 6 months of previous service, and they were more reliable!

Figure 3. Reliability functions of TTFF and TBFs increase from lowest, TTFF, as numbers of repairs increase.

GMDH PH GRP?

I’ve used GMDH regression models [www.gmdh.net, George Jan. 2022] in other applications but not for PH models of reliability. I would if there were many concomitant variables or failure modes. If you send me data on ships and returns counts, their failure modes, and concomitant variables, I will try it: pstlarry@yahoo.com.

Optimal Opportunistic Replacement?

Relevation articles propose replacing a failed part with one of the same age as the system that requires the replacement [Baxter, Belzunce et al.]. That seems kind of silly, because a repair shop’s objective may not be to restore the product to good-as-old condition. The Kelly AFB engine shop manager said, “Build that sucker so it doesn’t come back here for 600 hours.” The “sucker” he was referring to was a P&W F100-PW-100 engine for the F-15. He wanted to pick used parts off the shelf that matched the residual lives of life-limited parts already on the engine [Favara, US DOT FAA]. That inspired optimal opportunistic maintenance: “What else should we do while we have that sucker torn apart?” [Gertsbakh, George and Lo]

References

Laurence A. Baxter, “Reliability Applications of the Relevation Transform,” Naval Research Logistics Quarterly, John Wiley & Sons, Vol. 29(2), pp. 323-330, June 1982

Félix Belzunce, Carolina Martínez-Riquelme, José A. Mercader, José M. Ruiz, “Comparisons of Policies Based on Relevation And Replacement by a New One Unit in Reliability,” TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer, Sociedad de Estadística e Investigación Operativa, Vol. 30(1), pp. 211-227, March 2021

David R. Cox, “Regression Models and Life-Tables,” Journal of the Royal Statistical Society, Series B, 34(2): pp. 187–220, JSTOR 2985181, MR 0341758, 1972

Francis A. Favara, “Guidance Material for Aircraft Engine Life-Limited Parts Requirements,” US DOT FAA, Advisory Circular, July 2009

André-Michel Ferrari, “Life Models for Repairable Versus Non-repairable Assets,” Weekly Update, The Reliability Mindset Archives – Accendo Reliability, July 2023

L. L. George and Yat H Lo, “An Opportunistic Look Ahead Replacement Policy,” Annals of SOLE, Vol. 14, No. 4, Winter 1980

L. L. George, “Renewal Process Estimation Without Life Data,” Weekly Update, Renewal Process Estimation, Without Life Data – Accendo Reliability, Sept. 2021

L. L. George, “Why Didn’t You Ask Before Running All Those Tests?” Weekly Update, Why didn’t you ask before running all those tests? – Accendo Reliability, Jan. 2022

L. L. George, “Poll: Is Life Data Required?” Weekly Update, Poll: “Is life data required…?” – Accendo Reliability, July 2023

Ilya B. Gertsbakh, Models of Preventive Maintenance, Vol 23., Studies in Mathematical and Managerial Economics, North-Holland, 1977

Kaminskiy, M. P. and Krivtsov, V. V. “A Monte Carlo Approach to Repairable System Reliability Analysis,” Probabilistic Safety Assessment and Management, London: Springer–Verlag, pp. 1063–1068, 1998

Krakowski, M., “The Relevation Transform and a Generalization of the Gamma Distribution Function,” Revue Francaise d’Automatique, Informatique et Recherche Opérationnelle, vol. 7, Séries V-2, pp. 107–120, 1973

Fred Schenkelberg, “The Next Step in Your Data Analysis,” Weekly Update, The Next Step in Your Data Analysis – Accendo Reliability

Peña, E.A., Strawderman, R. and Hollander, M. Nonparametric Estimation with Recurrent Event Data. J Amer. Statist. Assn., Vol. 96, pp. 1299-1315, 2001

Nikos Yannaros, “Weibull Renewal Processes,” Ann. Inst. Statist. Math, Vol. 46, No. 4, pp. 641-648, 1994