Actuarial Forecasts, Least Squares Reliability, and Martingales

I learned actuarial methods working for the USAF Logistics Command. We used actuarial rates to forecast demands and recommend stock levels for expensive engines tracked by serial number, hours, and cycles. I had a hunch that actuarial methods could be applied to all service parts, without life data.

People say, “You have to have life data to estimate reliability functions.” I collect such claims. Seen any lately? Please send them to pstlarry@yahoo.com.

I made nonparametric maximum likelihood estimates of reliability functions, without life data. For example, a computer’s life could be represented as service time in an M/G/infinity self-service system with life distribution G(t). Given M/G/infinity input and output counts, computer “ships” (cohorts) and “returns”, maximum likelihood estimation works.

Later Dick Mensing (real statistician, co-author with Bernard Ostle of Statistics in Research: Basic Concepts and Techniques for Research Workers) suggested, “Why don’t you use least squares?” That is, regard ships cohorts X (table 1) and returns Y as Y = Xb+e, where b is a vector of the probability density function (pdf) g(s) or actuarial rates a(s) corresponding to G(s), and e is an error vector. SUM[Xb] = n(3)a(1)+n(2)a(2)+n(1)a(3)… is a “hindcast” of returns, an actuarial “forecast” of observed returns. This is a “general” linear model because the e(i) and e(j) could be correlated.

Table 1. X matrix of ships cohorts n(t) and b-vector of regression coefficients.

Period	1	2	3
1	0	0	n(3)
2	0	n(2)	0
3	n(1)	0	0
pdf or actuarial rates	g(1) or a(1)	g(2) or a(2)	g(3) or a(3)

The least-squares estimator of the lifetime pdf g(s) or actuarial rates a(s) minimizes SUM[(Observed Returns(t) – Expected Returns(t))²], summed over all periods t, where Expected Returns(t) is SUM[g(s)*n(t-s)] or SUM[a(s)*n(t-s)], the actuarial “hindcast” of past returns. Dan Moore (real biostatistician) suggested dividing summands by Expected Returns(t), as in the chi-square statistic. Dick and Dan are real statisticians, friends from my time at Lawrence Livermore Lab.
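To make the hindcast and the two objective functions concrete, here is a minimal Python sketch. It assumes the indexing of table 1 (a unit shipped in period t has age 1 in period t), fewer ages than periods so the regression is overdetermined, and no constraints on the coefficients. The ship counts and pdf values are made up, and all function names are hypothetical. It is only an unconstrained illustration solved through the normal equations, not the method used for the figures.

```python
def design_matrix(ships, n_ages):
    """Row t, column s holds the cohort that has age s+1 in period t+1 (0 if none)."""
    return [[ships[t - s] if t - s >= 0 else 0.0 for s in range(n_ages)]
            for t in range(len(ships))]

def hindcast(ships, rates):
    """Expected returns per period: n(t)a(1) + n(t-1)a(2) + ..."""
    X = design_matrix(ships, len(rates))
    return [sum(x * b for x, b in zip(row, rates)) for row in X]

def sum_of_squares(observed, expected):
    """Ordinary least squares objective."""
    return sum((o - e) ** 2 for o, e in zip(observed, expected))

def chi_square(observed, expected):
    """Same summands divided by expected returns, as in the chi-square statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

def solve(A, y):
    """Solve A b = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    b = [0.0] * n
    for i in range(n - 1, -1, -1):
        b[i] = (M[i][n] - sum(M[i][j] * b[j] for j in range(i + 1, n))) / M[i][i]
    return b

def least_squares_rates(ships, returns, n_ages):
    """Unconstrained least squares via the normal equations X'X b = X'y."""
    X = design_matrix(ships, n_ages)
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(n_ages)]
           for i in range(n_ages)]
    Xty = [sum(row[i] * r for row, r in zip(X, returns)) for i in range(n_ages)]
    return solve(XtX, Xty)

ships = [100.0, 120.0, 90.0, 110.0, 130.0, 95.0]  # made-up cohorts n(1)..n(6)
true_g = [0.10, 0.06, 0.04]                        # made-up pdf g(1)..g(3)
returns = hindcast(ships, true_g)                  # noise-free "observed" returns
g_hat = least_squares_rates(ships, returns, n_ages=3)
print(g_hat)
print(sum_of_squares(returns, hindcast(ships, g_hat)))
```

With noise-free returns the estimates reproduce the made-up pdf. With real counts the coefficients would also need to be kept nonnegative (and a pdf kept to a total of at most 1), which is a job for a constrained optimizer rather than the normal equations.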

What’s the difference between failure probabilities and actuarial rates?

The pdf g(t) corresponding to the lifetime distribution G(t) is the unconditional probability of failure at age t. Actuarial rates a(s) are ->conditional<- failure rates, conditional on survival to age s. For example:

I used published industry Apple computer monthly sales to compute parts’ installed base by age. There was no way to know whether the computers had survived since sale. So I estimated the pdf g(t) to get computer reliabilities R(t) = 1-G(t).
For the automotive aftermarket, we bought ->current<- vehicle registration counts by year, make, model, and engine. Those cars and their parts were still registered, so I estimated actuarial rates a(s) and reliabilities R(t) = exp[-SUM{a(s); s = 1,2,…,t}].
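The relation between the two kinds of estimates can be sketched in a few lines of Python. This assumes discrete ages and the usual discrete definition a(s) = g(s)/R(s-1) with R(0) = 1; the pdf values are made up and the function names are hypothetical. The product form PROD[1-a(s)] is exact for discrete rates, and exp[-SUM a(s)] is a close approximation when the rates are small.

```python
import math

def pdf_to_actuarial(g):
    """a(s) = g(s) / R(s-1): failure probability given survival to age s."""
    rates, surviving = [], 1.0
    for p in g:
        rates.append(p / surviving)
        surviving -= p
    return rates

def reliability_from_pdf(g):
    """R(t) = 1 - G(t) = 1 - SUM g(s), s = 1..t."""
    out, cum = [], 0.0
    for p in g:
        cum += p
        out.append(1.0 - cum)
    return out

def reliability_from_actuarial(a):
    """Discrete product form R(t) = PROD (1 - a(s)), s = 1..t."""
    out, surviving = [], 1.0
    for r in a:
        surviving *= 1.0 - r
        out.append(surviving)
    return out

def reliability_exponential(a):
    """Approximation R(t) = exp(-SUM a(s)), close when the rates are small."""
    out, cum = [], 0.0
    for r in a:
        cum += r
        out.append(math.exp(-cum))
    return out

g = [0.10, 0.06, 0.04]                 # made-up pdf g(1)..g(3)
a = pdf_to_actuarial(g)                # conditional rates, never smaller than g
R_from_g = reliability_from_pdf(g)
R_from_a = reliability_from_actuarial(a)
R_approx = reliability_exponential(a)
print(a, R_from_g, R_from_a, R_approx)
```

Both routes give the same reliability function; they differ only in whether the cohort counts in the regression are original ships or current survivors.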

Don’t take my word for it. Figure 1 is an example of G(t) estimates from simulated Weibull returns with scale parameter 10 and shape parameter 0.75. Maximum likelihood and least squares estimates differ a little. Figure 2 shows the differences between pdf g(s) and actuarial rate a(s) estimates for maximum likelihood and least squares. Actuarial rates are larger because they are conditional on survival: cohorts of survivors are smaller than the original ships cohorts.

Figure 1. Weibull and nonparametric distribution estimates from ships and simulated Weibull returns counts

Figure 2. Pdf g(s) and actuarial rate function a(s) estimates for same simulated data as in figure 1.
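For readers without the workbook, here is a rough Python sketch of how ships and Weibull returns counts of this kind could be simulated. It is not the WeiSimSR.xlsm workbook: the cohort sizes, number of periods, and random seed are assumptions, the function names are hypothetical, and lifetimes are rounded up to whole periods using the table 1 indexing (a unit shipped in period t has age 1 in period t).

```python
import math
import random

def weibull_pdf_by_age(scale, shape, n_ages):
    """Discretized Weibull pdf: g(s) = G(s) - G(s-1) for s = 1..n_ages."""
    cdf = [1.0 - math.exp(-((t / scale) ** shape)) for t in range(n_ages + 1)]
    return [cdf[s] - cdf[s - 1] for s in range(1, n_ages + 1)]

def simulate_returns(ships, scale, shape, rng):
    """Count returns per period from simulated unit lifetimes."""
    periods = len(ships)
    returns = [0] * periods
    for t, n in enumerate(ships):
        for _ in range(n):
            life = rng.weibullvariate(scale, shape)  # alpha = scale, beta = shape
            age = max(1, math.ceil(life))            # age period of failure, 1-based
            period = t + age - 1                     # 0-based return period
            if period < periods:                     # later failures are not yet observed
                returns[period] += 1
    return returns

def expected_returns(ships, g):
    """Hindcast: SUM over ages of g(s) times the cohort of that age."""
    return [sum(g[s] * ships[t - s] for s in range(t + 1))
            for t in range(len(ships))]

rng = random.Random(1)                 # fixed seed so the run is repeatable
ships = [1000] * 12                    # assumed cohort sizes, 12 periods
g = weibull_pdf_by_age(10.0, 0.75, len(ships))
returns = simulate_returns(ships, 10.0, 0.75, rng)
expected = expected_returns(ships, g)
print(returns)
print([round(e, 1) for e in expected])
```

Only the ships and returns counts would be handed to the estimators; the individual lifetimes are discarded, which is the sense in which no life data is used.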

Least Squares Estimation Method

At first, I used the least squares minimization algorithm from www.numerical.recipes. It worked, if the initial values were close to the optimal solution. Word of mouth said Dave Fylstra, also working at Apple, offered a beta-test version of his brother Dan Fylstra’s Excel Solver [www.FrontlineSystems.com]. Solver is a global optimization program that looks around for better solutions. Dan got the “multi-start” idea from Leon Lasdon, University of Texas. I still use Solver.

Least squares comes in handy for nonstationary, non-ergodic processes, and recurrent processes too, such as survival function estimation for COVID-19 surges.

Properties of Least Squares Reliability Estimators

Wikipedia’s Gauss–Markov theorem article says “…the ordinary least squares estimator has the lowest sampling variance within the class of linear unbiased estimators, if the errors in the linear model are uncorrelated, have equal variances and expectation value of zero.”

The least squares reliability estimator is asymptotically unbiased. You could check errors e(t) = observed(t)-expected(t) for correlation and homoskedasticity (equal variance). The condition on correlation was violated in the automotive aftermarket, because of “autocorrelation” of vehicle installed base from year to year (no pun intended). Failure to satisfy conditions for the Gauss-Markov theorem does not mean all is lost.
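One simple way to check the errors is sketched below in Python. The lag-1 autocorrelation and the split-sample variance ratio are common informal diagnostics chosen here for illustration, not formal tests, and the residual values and function names are made up.

```python
import statistics

def lag1_autocorrelation(residuals):
    """Sample correlation between e(t) and e(t+1); near 0 suggests uncorrelated errors."""
    mean = sum(residuals) / len(residuals)
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    return sum(dev[i] * dev[i + 1] for i in range(len(dev) - 1)) / denom

def variance_ratio(residuals):
    """Variance of the second half over the first half; near 1 suggests equal variance."""
    half = len(residuals) // 2
    first = statistics.pvariance(residuals[:half])
    second = statistics.pvariance(residuals[half:])
    return second / first  # assumes the first half is not constant

trending = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]      # made-up errors that drift
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # made-up errors that flip sign

r_trend = lag1_autocorrelation(trending)
r_alt = lag1_autocorrelation(alternating)
ratio = variance_ratio(trending)
print(r_trend, r_alt, ratio)
```

A clearly positive or negative lag-1 value is the kind of pattern that signals the correlation condition is not met.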

Caution: Neuro-Hazard!

Martingales (the word also names tack used to limit a horse’s ability to raise its head) are used in statistics to prove asymptotic properties of least squares estimators [Aalen et al.]. Odd Aalen says, “…counting process minus the integrated intensity process is a martingale.” A reliability martingale counts cumulative returns minus the integral of an intensity or failure rate function. The expectation of a martingale is zero conditional on the past, where past data is used to estimate or model the intensity or failure rate function.

The least squares reliability estimator minimizes (SUM[observed(t)] – SUM[expected(t)])², a martingale squared. The martingale central limit theorem says that the estimators have asymptotically normal distribution around the pdf or actuarial rates. The martingale central limit theorem for reliability estimation could be derived in the same ways as for the Kaplan-Meier and Nelson-Aalen reliability estimators [Aalen et al.].
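The zero-expectation property can be seen in a small simulation. The Python sketch below is an illustration under assumptions, not a proof: it uses made-up cohort sizes and actuarial rates, treats each surviving unit as failing independently with probability a(s) at age s, and computes the expected returns in each period from the survivors known at the start of that period (the "past"). The function name is hypothetical.

```python
import random

def simulate_martingale(ships, rates, rng):
    """Path of M(t) = cumulative returns minus cumulative conditional expectation."""
    survivors = []                       # survivors[c] = units of cohort c still in service
    path, m = [], 0.0
    for t in range(len(ships)):
        survivors.append(ships[t])       # new cohort enters at age 1
        observed = expected = 0.0
        for c in range(t + 1):
            a = rates[t - c]             # actuarial rate at this cohort's current age
            expected += a * survivors[c] # intensity given the past
            failures = sum(1 for _ in range(survivors[c]) if rng.random() < a)
            observed += failures
            survivors[c] -= failures
        m += observed - expected
        path.append(m)
    return path

rng = random.Random(7)                   # fixed seed so the run is repeatable
ships = [20] * 6                         # made-up cohort sizes
rates = [0.10, 0.08, 0.06, 0.05, 0.05, 0.05]  # made-up actuarial rates a(1)..a(6)
reps = 1000
paths = [simulate_martingale(ships, rates, rng) for _ in range(reps)]
mean_path = [sum(p[t] for p in paths) / reps for t in range(len(ships))]
print([round(x, 3) for x in mean_path])
```

Individual paths wander away from zero, but their average over many replications stays close to zero in every period, which is the behavior the martingale central limit theorem builds on.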

Please see “Random-Tandem Queues and Reliability Estimation, Without Life Data” at Field Reliability (google.com) for evaluation of alternative estimators, without life data. The Weibull simulation workbook WeiSimSR.xlsm for figures 1 and 2 is in the List of Files at Field Reliability (google.com).

References

Odd O. Aalen, Per Kragh Andersen, Ørnulf Borgan, Richard D. Gill, and Niels Keiding, “History of Applications of Martingales In Survival Analysis,” Electronic Journal for History of Probability and Statistics, Vol. 5, Nr. 1, June 2009 (www.jehps.net), arXiv:1003.0188 [stat.ME]
