Covariance of Renewal Process Reliability Function Estimates Without Life Data?

Email from www.smartcorp.com advertised how to forecast inventory requirements using time-series analyses: single and double exponential smoothing, linear and simple moving average, and Winters models. SmartCorp compares alternative times-series forecasts in a “tournament” that picks the best forecast. Charles Smart says forecasting, “…particularly for low-demand items like service and spare parts — is especially difficult to predict with any accuracy.”

Time series forecasts also quantify variance. Excel’s time-series FORECAST() functions do exponential smoothing, account for seasonality and trend, and “pointwise” confidence intervals. Pointwise means only one confidence interval is valid at a time; not a confidence band on several forecasts!

Similar time-series forecasts were used at Apple Computer 33 years ago along with ARMA methods [Box and Jenkins]. The same forecast method won every tournament for every Apple service part. The forecasts did not agree with actuarial forecasts estimated from ships and returns counts (renewals or replacements) [https://accendoreliability.com/forecast-parts-demands-without-life-data-for-a-nonstationary-process/]. The Apple time-series forecast was a point value, not the distributions of parts’ demands. I needed the demand distributions to recommend inventory levels to meet service level, backorder, and fill rate requirements.

I also made actuarial forecasts and spare parts stock recommendations for Triad Systems Corporation for the automotive aftermarket [Sheldon et al., US Patent 5765143]. Auto parts’ demands are renewal processes. Triad made so much money off those forecasts and stock recommendations that a competitor bought the company in a leveraged buyout. That company is now www.epicor.com. Guess who does the inventory forecasting for Epicor? SmartCorp!

Time series forecasting is like driving while looking backward. It’s fine for commodities and products that have no apparent factors driving sales except past sales. Some products’ and service parts’ demands are driven by reliability. That’s why the actuarial (age-specific) failure rate function is called the “Force of Mortality”. Actuarial forecasting is like driving while looking forward. It accounts for the force of mortality with age-dependent failure or demand rates. It accounts for sales forecasts, lead times, and attrition. What are forecasts of the distribution functions of service parts’ demands during lead time?

I used to assume Poisson demand distributions with rates equal to the actuarial forecasts, because that’s what the US Air Force Logistics Command assumes for engines. They learned that from RAND Corporation in the 1960s. I made actuarial forecasts and recommended stock levels for all Apple service parts, based parts’ actuarial failure rate estimates and on dealers’ installed base (sales by age). Have I learned anything since then?

Actuarial Forecasts

An actuarial forecast is ∑[d(t-s)*n(s)], s=1,2,…,t where d(t-s) is the actuarial demand rate for age t-s and n(s) is the installed base of age s. The actuarial demand rate d(t-s) is the same as actuarial failure rate, for products or parts that stay dead when they fail, but not for products or parts that are repaired, renewed, or replaced. The variance of the actuarial demand forecast is ∑Var(d(t‑s))*n(s)²+∑∑Covar(d(t‑s),d(t))*n(s)*n(t-s).

An earlier article showed how to compute least-squares, nonparametric reliability function estimates for the 1988 Ford V-8 460 cid engines that required repeated service; they were the last Ford engines with carburetors [https://accendoreliability.com/renewal-process-estimation-without-life-data/].

The article on martingales [https://accendoreliability.com/actuarial-forecasts-least-squares-reliability-martingales/] showed parts’ demand forecasts were asymptotically normally distributed with mean equal to the actuarial forecast. But with what variance[d(t-s)]? With what covariance[d(t1), d(t2)]?

It is the variance-covariance matrix of least-squares, nonparametric distribution function estimates of time between-renewals-distributions. It’s easy to compute if you have counts data that tell how many renewals have occurred for each unit in each cohort in each age-interval [Guédon and Cocozza-Thivent]. What if all you have is periodic ships (cohort sizes) and renewal counts, data required by GAAP?

The Greenwood asymptotic bound on the variance of the Kaplan-Meier reliability estimator can be badly biased, depending on the data. So too are the asymptotic Cramer-Rao covariance bounds. For ships and renewal counts data, I use the variance-covariance matrix of the actuarial return rates from:

Bootstrap samples of the nonparametric least-squares reliability estimator from ships and period renewal counts (without knowing numbers of prior renewals), and
Cohort reliability estimates from each ships or sales cohort and its renewal counts.

The bootstrap and cohort estimators worked well for the Kaplan-Meier estimator from grouped returns that were dead-forever, but they don’t work for renewal process counts, because the Kaplan-Meier estimator is for products that stay dead if they fail!

This article compares, for renewal processes, the two data-driven alternatives to the Greenwood variance and Cramer-Rao covariance bounds: bootstrap and cohort variance-covariance matrix estimates.

Bootstrap Simulation-Estimation of Reliability Variance-Covariance Matrix

SmartCorp bootstraps time-series demand forecasts [Willemain and Smart, US Patent-6205431-B1]. I bootstrap variance-covariance matrixes from simulated samples from the estimated cumulative distribution function of times-between renewals. It is convenient to simulate a Nevada table of, period renewal counts from cohorts shipped each period (table 1). Don’t use the Kaplan-Meier estimator on this simulated renewal process data! The Kaplan-Meier estimator is for grouped failure counts that are never renewed or replaced, with at most only one failure per cohort member. A VBA user function simulates renewal counts in each cell of the Nevada table from the least-squares reliability estimator computed from the original data. Use the nonparametric least-squares distribution function estimator from cohort sizes and sums of each period’s simulated returns or renewals (bottom row of table 1).

The sums of the columns of the Nevada table are the “superpositions” of renewals from each cohort for each period. The cohort sizes and the bottom row of table 1 contains statistically sufficient data to estimate the distribution function of times between renewals. I.e., by recalculating the spreadsheet, the Nevada table is resampled from the estimated distribution of times between renewals, and then I re-estimate the distribution of times between renewals and compute their variance-covariance matrix of the bootstrap distributions as many times as I can stand.

Period renewal counts may include the first, second, or ??? renewal. There’s no way to tell how many renewals occurred unless you track individual products or parts by name or serial number between renewals. At least three Oracle-based service databases do not distinguish whether a failure event was the first failure of a second part with the same name or the second failure of the first part with the same name. There are methods to deal with “masked” data [Rodrigues et al., Wu].

People have observed that the sums in the last row are superpositions of renewal processes but are not a renewal process. That is irrelevant for this article, because the bootstrap simulates the superpositions by convolutions of the distribution of times between renewals. The superposition is a Visual Basic user function: basically
If Sum(t)+X>=t And Sum(t)+X < t+1 then Count(t) = Count(t)+1,
where X is a random time between renewals generated from the original, least-squares distribution to be estimated.

Table 1. Simulated renewal counts based on installed base (“ships and returns”) and estimated reliability function from the original grouped returns data for 18 periods.

Period	Ships	Period 1	Period 2	Period 3	Period 4	Period 5	Etc.
1	66	0	0	0	1	0	0
2	85		0	0	0	4	0
3	25			0	0	0	0
4	79				0	0	0
5	24					0	0
6	17						0
Etc.
Sums	911	0	0	0	1	4	0

Table 2. Original cumulative distribution function estimate (CDF), some of eight bootstrap distribution estimates, means, standard deviations, and their coefficients of variation (CV) for 18 periods

Period	Orig. CDF	Boot 1	Boot 2	Etc.	Mean	Stdev	CV
1	0	0	0		2E-04	0.0004	2.65
2	0	0	0		0.002	0.0028	1.54
3	0.025	0.0012	0		0.004	0.0043	1.20
4	0.025	0.0066	0.0084		0.018	0.0094	0.53
5	0.025	0.0074	0.0266		0.02	0.0092	0.45
6	0.025	0.0196	0.0266		0.022	0.0078	0.35
Etc.

Cohort Estimation of Variance-Covariance Matrix

Each cohort provides an independent sample of grouped period renewal counts of different lengths. For each cohort, use the least squares estimator to estimate the distribution of times between renewals. Compute their variance-covariance matrix. Look for outlier cohorts and remove them. Figure 1 broom chart shows cohort distribution estimates. They did not agree. I removed several cohorts, and the mean and standard deviation estimates of reliability at each age agreed tolerably with the bootstrap means and standard deviations (figure 2).

Figure 2 shows the original cumulative distribution function estimate and the bootstrap and cohort average estimates. They agree pretty well. Data is arranged similarly to table 3.

I tricked Excel into plotting a step function broom chart. Excel connects points in a plot with sloped lines. Nonparametric cumulative distribution function (CDF) estimates are constant between jumps. Several step-function graphic tricks on the Internet didn’t work for me, so I listed two successive data values for each age: current and next value, tediously, carefully (table 3).

Table 3. Arrangement of data for figure 1 broom chart; double entries for each age.

Age, Periods	Original CDF	Cohort 1	Cohort 2	Cohort 3
1	0.0000	0.0021	0.0022	0
2	0.0000	0.0021	0.0022	0
2	0.0000	0.0026	0.0047	0
3	0.0000	0.0026	0.0047	0
3	0.0251	0.0066	0.009	0
4	0.0251	0.0066	0.009	0
4	0.0251	0.0073	0.0339	0.0243
Etc.

Figure 1. Cohort broom chart. Longer lines are from older cohorts. Bold line is original least-squares cumulative distribution function (CDF) estimated from all cohorts.

Table 4. Original cumulative distribution function estimate (CDF), some of 8 cohort distribution estimates, and their means, standard deviations, and coefficients of variation (CV)

Age, Period	Orig. CDF	Cohort 1	Cohort 2	Etc.	Mean	Stdev	CV
1	0	0	0.002		2E-04	0.0004	2.65
2	0	0	0.003		0.002	0.0028	1.54
3	0.025	0	0.007		0.004	0.0043	1.20
4	0.025	0.029	0.007		0.018	0.0094	0.53
5	0.025	0.029	0.007		0.02	0.0092	0.45
6	0.025	0.029	0.007		0.022	0.0078	0.35
Etc.

Figure 2. Compare nonparametric least-squares vs. bootstrap and cohort cumulative distribution function (CDF) estimate means. Bold line is the original least-squares CDF estimated from all cohorts.

Estimation of Variance-Covariance Matrix of Actuarial Demand Rates

I am still trying to derive the Cramer-Rao lower bound on the variance-covariance matrix of the maximum likelihood reliability function estimator. The likelihood function is ∏λG(t)^r1(t)*G^*2(t)^r2(t) etc., where rj(t) is the number of j-th returns in period t and G^*j(t) is the convolution of the time-between-renewals distribution. Some publications assume nonstationary Poisson processes [“Cox Process”, Yannaros] for the superposition of renewal counts [Shaomin, Tian]. For good reason; the output of a nonstationary Poisson input M(t)/G/infinity self-service system has Poisson[λG(t)] output distribution, where G(t) is a convolution of the distribution of times-between renewals.

The variance-covariance of the cumulative distribution function of times between renewals is needed for confidence bands. But that’s not what I need for the distribution of an actuarial demand forecast ∑d(t‑s)*n(s); s=1,2,…,t. I need the variance-covariance matrix of the actuarial demand rates d(t-s). I get them by plugging the cumulative distribution function estimates from bootstraps or from cohorts into the convolution formulas for the renewal function M(t) and d(t)= M(t+1)-M(t) where M(t) (aka “renewal function”) is the expected number of renewals by age t and d(t) is the actuarial demand rate per unit installed per age unit.

Compare tables 5 (bootstrap) and table 6 (cohort). Cherry-picking eliminated some outlying cohorts apparent in broom chart figure 1. (I used cohorts of sizes 18, 17,…,9, and 6). That reduced the variance-covariance magnitudes, especially for the older ages with fewer cohorts. Nevertheless, I recommend using the bootstrap variance-covariance matrix estimate, because there was so much variability in the cohorts’ time-between-renewals distribution function estimates.

Table 5. Bootstrap variance-covariance matrix estimates from times-between-renewals cumulative distribution function estimates. Variances are on the diagonal.

Age	1	2	3	4	5	6	Etc.
1	1.91E-07						0
2	-2.70E-07	8.03E-06					0
3	3.14E-08	5.81E-07	9.52E-06				0
4	1.66E-06	6.11E-06	-1.55E-05	8.51E-05			0
5	-1.98E-07	-4.03E-06	-4.26E-06	-1.04E-05	3.38E-05		0
6	-2.62E-07	-2.11E-06	-9.05E-07	-1.28E-05	-3.16E-06	1.61E-05	0
Etc.

Table 6. Cohort variance-covariance matrix estimates from all cohort times-between-renewals cumulative distribution function estimates

Age	1	2	3	4	5	6	Etc.
1	6.80E-07						0
2	1.44E-09	3.12E-12					0
3	4.51E-07	9.54E-10	5.01E-07				0
4	1.95E-09	4.23E-12	2.15E-09	9.42E-12			0
5	1.33E-06	2.75E-09	9.17E-07	3.86E-09	2.69E-06		0
6	6.63E-09	1.43E-11	5.12E-09	2.25E-11	1.29E-08	6.83E-11	0
Etc.

What I learned?

This article shows estimates of the variance-covariance matrix of least-squares time-between-renewals(approximately normal) cumulative distribution function. I recommend the bootstrap, or else estimate the distribution for each cohort and compare their variance-covariance matrix estimates. The variance-covariance matrix is needed for computing the (approximately normal) distribution of demand forecasts for setting inventory levels and dealing with service levels, back orders, and fill rate requirements. Would you like to do that? The alternative is time-series forecasting, coping with intermittent demands, ignoring the force of mortality, and dealing with unexpected backorders.

References

Box, George and Gwilym Jenkins, Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day, 1970

George, L. L., “Actuarial Forecasts, Least Squares Reliability, and Martingales,” Weekly Update, https://accendoreliability.com/actuarial-forecasts-least-squares-reliability-martingales/, April 2021

…“Renewal Process Estimation Without Life Data,” Weekly Update, https://accendoreliability.com/renewal-process-estimation-without-life-data/, Sept. 2021

…“Forecast Parts’ Demands, Without Life Data, for a Nonstationary Process,” Weekly Update, https://accendoreliability.com/forecast-parts-demands-without-life-data-for-a-nonstationary-process/, June 2022

…“Variance of the Kaplan-Meier Estimator?” Weekly Update, https://accendoreliability.com/variance-of-the-kaplan-meier-estimator/, Feb. 2023

Yann Guédon and Christiane Cocozza-Thivent, “Nonparametric Estimation of Renewal Processes from Count Data,” Canadian Journal of Statistics, Wiley, Vol. 31 (2), pp.191-223. ff10.2307/3316067. HAL-00827464, 2003

Agatha Rodrigues, Pascal Kerschke, Carlos Alberto de B. Pereira , Heike Trautmann , Carolin Wagner, Bernd Hellingrath, Adriano Polpo, “Estimation of Component Reliability from Superposed Renewal Processes with Masked Cause of Failure by Means of Latent Variables,” arXiv:1807.012690v2 [stat.AP] May 2019

Sheldon, David, James Leach, and Vladimir Pisarsky, “Method and System for Inventory Management,” US Patent 5765143, June 1998

Smart, Charles, “Probabilistic Forecasting for Intermittent Demand,” https://smartcorp.com/blog/probabilistic-forecast-intermittent-demand/, Jan. 2023

Tian, Ye, “Estimating a Parametric Lifetime Distribution from Superimposed Renewal Process Data,” Thesis, 2013

Willemain, Thomas R., Charles N. Smart, and Henry F. Schwarz, “A New Approach to Forecasting Intermittent Demand for Service Parts Inventories.” International Journal of Forecasting, Vol. 20, pp. 375–387, 2004

Willemain, Thomas R. and Charles N. Smart, “System and Method for Forecasting Intermittent Demand,” US Patent-6205431-B1, March 2001

Wu Shaomin. “Lower and Upper Bounds of the Superposition of Renewal Processes and Extensions,” arXiv:2208.11043v1 [stat.AP], Aug. 2022

Nikos Yannaros, “Weibull Renewal Processes,” Ann. Inst. Statist. Math., Vol. 46, No. 4, pp. 641-648, 1994

Comments

Larry George says
April 15, 2023 at 10:25 AM
“Epicor has contracted with Predii, an enterprise AI software company (https://www.predii.com/about/) to apply AI…to capture part and service insights from vehicle repair events.”
This news came from https://www.epicor.com/en-us/newsroom/news-releases/epicor-expands-aftermarket-data-factory-to-harvest-ai-derived-insights-from-vehicle-repair-events/, 2021 See also…
“Epicor Automotive Aftermarket,” https://www.linkedin.com/posts/epicor-automotive-aftermarket_iotswc22-ai-leader-activity-6918250625759657984-c9-T/, April 2022
“Production Management,” [https://www.epicor.com/en/industry-productivity-solutions/modules/production-management/], April 2023