Renewal Process Estimation, Without Life Data

At my job interview, the new product development director, an econometrician, explained that he tried to forecast auto parts’ sales using regression. His model was

sales forecast = SUM[b(s)*n(t-s)] + noise; s=1,2,…,t,

where b(s) are regression coefficients to be estimated, n(t-s) are counts of vehicles of age t-s in the neighborhood of auto parts stores. The director admitted to regression analysis problems, because of autocorrelation among the n(t-s) vehicle counts, no pun intended.

I learned actuarial methods working for Air Force Logistics Command. His sales-forecast is an actuarial forecast, and the b(s) values are actuarial rates, values of a discrete failure rate function. I derived methods to estimate actuarial rates from vehicle registration data, past auto parts’ sales, and bills of materials that tell which parts and how many were used by which vehicle, using gozinto theory.

He asked, “What if parts could be replaced more than once? Your estimation methods are for dead-forever parts with at most one failure.” Chagrined: over the weekend I wrote a program to derive least-squares estimates of time-between-failures distribution for renewal processes.

I got the job, as a regular employee, so the company owned the rights to trade secret estimation method. The company even patented the actuarial forecast as part of U.S. patent number 5,765,143! The company made millions of dollars per year from forecasts that were evidently less biased and more precise than previous alternatives and from stock level recommendations that provided specified fill rates. The company was bought by a competitor in a leveraged buyout (Texas Firms Buy Livermore’s Triad for $220 Million (sfgate.com)). The new product development director was told to do as much as possible at as little cost as possible, so he laid me off. He was later laid off too. I wrote the current new products manager of the company, now www.epicor.com, that I had corrected an error in software that would improve the forecasts and would share the correction with the company. No reply. Epicor later paid www.smartcorp.com to resell naïve forecasts and SmartCorp’s time-series bootstrap to represent uncertainty [Willemain et al.].

Review of times-between-failures distribution estimates

This article describes reliability function estimation from grouped ships and returns counts from the superposition of renewal processes, without knowing how many renewals have occurred. Traditional reliability estimators require age-at-failure data. Few industries track parts by serial number to obtain ages at failures, even a sample, because of cost and privacy restrictions.

Table 1 shows nonparametric reliability estimators. The Kaplan-Meier (K-M) nonparametric maximum likelihood estimator of reliability is for censored age-at-failures, in which either failures stay dead or ages between renewals and survivors’ ages are known.

Failures	Data	Max. Likelihood	Least Squares
Dead forever, known ages-at-failures	Censored ages-at-failures	[Kaplan and Meier]	[Laplace?]
Renewal process, known ages-at-failures	Censored ages-at-failures	[Kaplan and Meier]	[?]
Dead forever, unknown ages-at-failures	Calendar interval ships and returns counts	[George 1993 and 1999]	[Harris and Rattner], [Oscarsson and Hallberg], and [George 1993]
Renewal process, unknown ages-at-failures	Calendar interval ships and returns counts	[Tortorella] and [George, 2002]	[George 1995]

Table 1. Alternative nonparametric reliability estimation methods

If you have Inter-renewal times or ages-at-failures data, estimation is easy, even without assuming a distribution. Pena, Strawderman, and Hollander tell all you need to know. If you have grouped data such as renewal counts for each unit, Denby and Vardi provide a shortcut to the nonparametric Kaplan-Meier estimator. Frees and Schneider and O’Cinneide estimate the renewal function as SUM[F^(k)(t)] where F^(k)(t) is the convolution of k distribution functions F(t). Or you could dump all the counts data into an AI program and hope to find a model that fits the data [Rosienkiewicz].

What if numbers of previous renewals are unknown?

Suppose you didn’t have the times-between-failures or even the numbers of renewal counts for units under observation? What if the data is the superposition of renewals from staggered-start cohorts as in the automotive aftermarket? Life data is sufficient but not necessary. What cost life data? Wouldn’t you prefer reality? Recurrent processes? Repairable systems? Lifetime data are sufficient for nonparametric estimators, but are not necessary.

Dattero proposed an estimator assuming age-interval counts were limited to 0 or 1, based on forward recurrence density. Guerdon and Cocozza-Thivent proposed using the EM (Estimation-Maximization) algorithm. Mike Tortorella published a nonparametric estimator from grouped data for Lucent renewable part reliability, but his estimator was wrong. At least Mike published his data. Mike later asked me for engine reliability of his Mercedes V-8 from production and piston-ring set sales data.

Figure 1. Comparison of Lucent renewable part reliability estimates. (Legend “spmle” stands for semi-parametric maximum likelihood, because the maximum likelihood estimator is for input to and output from a sequence of M(t)/G/∞self-service queuing systems, because output from an M(t)/G/∞ system has Poisson distribution (M(t)). It’s as if outputs (failures) were recycled.)

Nonparametric Least Squares minimizes SUM(Observed returns – E[returns])², where E[returns] = SUM[N(k–j)*d(j)] (actuarial forecast with age-specific demand rates d(t)), N(k–j) is ships j periods ago (cohort), d(j) = M(j)–M(j–1), the discrete renewal density, and the “renewal function” M(j) = SUM[F^*k(t)], k = 1,2,…¥, the sum of convolutions of the distribution function F(t) of times between failures.

The M88-A1 is essentially a tank tow-truck. (Medium Recovery Vehicle M88 – Tanks Encyclopedia (tanks-encyclopedia.com)/) It originally had a four-stroke gas engine which limited its range 201 miles (less than 0.5 MPG). It was re-engined with a huge V12, air-cooled, twin-turbocharged diesel engine. Jay Leno has one of those engines in a car. M88-A1 data is installed base by age and rebuilds of engine and power train components in 1990s. Engine Rebuilds cost $116,000!

Figure 2. M88-A1 AVDS 1760 failure rate function estimate of engine times-between-rebuilds

Figure 2 shows the M88-A1 AVDS 1790-6A engine failure rate function estimate. Note the probability of failure in the first year is 0.13, perhaps more than once. Barnes and Reinecke Engineering (inventor of the pop-up toaster), the Army TACOM, and RAND Corp. ignored this information. There was a shortage of these engines during the Gulf wars. No surprise.

Data for figure 3, were Ford’s 1988 V8 460cid engine monthly warranty repairs/1000 and Wards’ Automotive vehicle sales data A History Of The Ford 460, The Blue Oval’s Longest-Lasting Truck Big Block V8 | DrivingLine. When I sent the results to Ron Salzman, Ford, he provided actual engine production counts in return for better failure rate function estimates. An interpretation of figure 3 is that 17% return almost immediately (upon arrival at dealer), 10% probably upon sale, and repairs fail 15% of the time. This was the last carbureted Ford engine: Ford used fuel injection thereafter. Dwight Jennings, retired Apple statistician, had a 1988 Ford V-8 460cid truck. It required so many repairs that he sold it when warranty expired.

Figure 3 shows distributions of a modified renewal function model. Time to first repair had a different distribution from times between subsequent repairs. This shows lemon probability. When I showed figure 3 at RAMS 2000, Vasily Krivtsov, Ford, became excited and claimed something was wrong with my model and analysis. Krivtsov was doing his thesis on “A Markov Chain approach to modeling and estimation of the generalized renewal process and repairable systems reliability analysis,” at University of Maryland. His generalization models hysterecal repair (repair plus rejuvenation). I modeled time to first repair and times between repairs; times between 1^st and 2^nd, 2^nd and 3^rd had about the same distribution estimates.

Figure 3. Ford 1988 V8 460cid engine time-between-repairs rate function estimates. “p(t;0)” is for time to first engine repair and “p(t)” is for times between subsequent repairs.

What is the distribution of demand?

Demand is the actuarial forecast, Sd(s)*n(t-s), where d(s) are age-specific demand rate estimates and therefore random. The installed base n(t-s) should be known; get it from production or sales data required by GAAP. There are several ways to quantify the demand variability: bootstrap, [e.g., Willemain et al.], and martingale central limit theorem (You searched for martingale – Accendo Reliability) . According to Aalen and Husebye, the demand rate estimates have asymptotically multivariate normal distribution with computable variance-covariance matrix. However, the nonparametric demand rate estimates from ships and returns counts are correlated. Therefore, the actuarial forecast has normal distribution with VAR[Sd(s)*n(t-s)] = SVAR[d(s)]*n(t-s)²+2SSCOVAR[d(t-s),d(s)]*n(s)*n(t-s) summed over s£t.

R-Script program

I wrote R-script to estimate inter-renewal time distribution for renewal processes without renewal counts by least squares. The R-script doesn’t deal with inadmissible data, yet. R-script results compare with spreadsheet-VBA-Solver results. R-script for renewal processes results compare with alternative spreadsheet-Solver- constrained nonparametric least squares estimators.

Instructions are similar for other R-scripts. Load R-script RenM88A2.R into R, R-Studio, or… Inputs: Set working directory to folder containing input file. Prepare input file to look like M88A1.csv including column headings, or else you’ll have to change R-script names. Change read.csv statement to point to your input file or modify for another format. Change starting pdf in optimx(rep(pdf),…) if you want. Choose method or all.methods. Run R-Script

Output is the inter-renewal-time probability density function estimate, objective function value, evaluations, iterations, convergence code, local optimality indicators, and execution time. Copy optimal pdf values and objective value to a graph or for use elsewhere.

If you send data and describe it, we will plug it into an Excel workbook with VBA or modify R-scripts if necessary to accommodate your data. Translate product ships, parts’ return counts, and BoMs into parts’ ships and returns using Gozinto theory. We will send back your data input, R-script, and results.

References

Aalen, O. O. and Husebye E. 1991, “Statistical Analysis of Repeated Events Forming Renewal Processes”, Statistics in Medicine, 10:1227–1240. DOI:10.1002/SIM.4780100806

Dattero, Ronald, 1989, “Non-parametric estimation for renewal processes from event count data,” Applied Stochastic Models and Data Analysis, Vol. 5, pp. 1-12

Frees, Edward, 1986, “Nonparametric renewal function estimation,” The Annals of Statistics, Vol. 14, No. 4, pp. 1366-1378

George, L. L. “Renewal Distribution Estimation Without Renewal Counts,” INFORMS, San Jose, Nov. 2002

George, L. L. 2004, “Actuarial Forecasts for the Automotive Aftermarket,” Transactions Journal of Materials and Manufacturing, SAE, Vol. 5, pp. 697-701

Harris, Carl M. and Edward Rattner 1997, “Estimating and projecting regional HIV/AIDS cases and costs, 1990-2000: A case study,” Interfaces, Vol. 27, No. 5, pp. 38-53

O’Cinneide, Colm 1991, “Identifiability of superpositions of renewal processes,” Stochastic Models, Vol. 7, pp. 603-614

Oscarsson, Patric and Örjan Hallberg 2000, “ERIVIEW 2000 – A Tool for the Analysis of Field Statistics,” Ericsson Telecom AB, http://home.swipnet.se/~w‑78067/ERI2000.PDF

Pena, E. A., Strawderman. R. L., and Hollander, M. 2001, “Nonparametric estimation with recurrent event data,” Journal of the American Statistical Association, Vol. 96, pp. 1299-1315

Guédon, Yann, and Christiane Cocozza-Thivent 2003, “Nonparametric Estimation of Renewal Processes from Count Data.” The Canadian Journal of Statistics / La Revue Canadienne De Statistique, vol. 31, no. 2, pp. 191–223

Rosienkiewicz, Maria 2013, “Artificial Intelligence Methods in Spare Parts Demand Forecasting.” Logistics and Transport,No. 2(18)

Tortorella, Michael 1996, “Life estimation from pooled discrete renewal counts,” Lifetime Data: Models in Reliability and Survival Analysis, N. P. Jewell et al. (eds.), pp. 331-338, Kluwer, The Netherlands

Vardi, Y. 1982, “Nonparametric estimation in renewal processes,” The Annals of Statistics, Vol. 10, No. 3, pp. 772-785

Wang, Mei-Cheng and Shu-Hui Chang 1999, “Nonparametric Estimation of a Recurrent Survival Function”, J Am Stat Assoc. 94(445), pp. 146–153Willemain, Thomas R., Charles N. Smart, Henry F. Schwarz 2004, “A new approach to forecasting intermittent demand for service parts inventories.” International Journal of Forecasting, 20, pp. 375–387

Comments

Larry George says
September 26, 2021 at 2:31 PM
Sorry, Greek summation symbol didn’t show in the article:
S b(s)*n(t-s) should be SUM[b(s)*n(t-s); s=1,2,…,t], (That’s an actuarial forecast.)
Renewal function M(t)=SF(k)(t) should be SUM[F(k)(t), k=,1.2,…] where k is the convolution index counter, and “(k)” should have been superscript.
S[Observed-Expected]^2 should be SUM[Observed-Expected]^2 over all cohorts, and
Variance of actuarial forecast VAR[Sd(s)*n(t-s)] should equal
SUM[VAR[d(s)]*n(t-s)^2]+2*SUM[SUM[COVAR[d(s),d(t-s)]*n(s)*n(t-s)]; for s<t; s and t =1,2,…,t] where d(s) are estimates of actuarial demand rates, conditional on being alive at age s regardless of previous renewals if any.