Uncertainty in Population Estimates?

Dick Mensing said, “Larry, you can’t give an estimate without some measure of its uncertainty!” For seismic risk analysis of nuclear power plants, we had plenty of multivariate earthquake stress data but paltry strength-at-failure data on safety-system components. So we surveyed “experts” for their opinions on strengths-at-failures distribution parameters and for the correlations between pairs of components’ strengths at failures.

If you make estimates from population field reliability data, do the estimates have uncertainty? If all the data were population lifetimes or ages-at-failures, estimates would have no sample uncertainty, perhaps measurement error. Estimates from population field reliability data have uncertainty because typically some population members haven’t failed. If field reliability data are from renewal or replacement processes, some replacements haven’t failed and earlier renewal or replacement counts may be unknown. Regardless, estimates from population data are better than estimates from a sample, even if the population data is ships and returns counts!

Walter Shewhart’s Rule #1

“Original data should be presented in a way that will preserve the evidence in the original data for all the predictions assumed to be useful.” Nonparametric field reliability estimates preserve all the evidence in field reliability data: lifetimes, censored or not, or ships (installed base, production, sales, etc.) and returns (complaints, repairs, replacement, failures, even spare parts’ sales) counts.

Censored lifetime data are statistically sufficient to make nonparametric maximum-likelihood reliability or failure rate function estimates (Kaplan-Meier, Nelson-Aalen). Periodic ships and returns counts are also statistically sufficient [George and Agrawal, Oscarsson and Hallberg]. Ships and returns counts are population data, and they in data required by GAAP (Generally Accepted Accounting Principles). You may have to work to convert periodic revenue and price data into ships and returns counts (ships = sales revenue/price) and returns could be spare parts’ sales revenue/price). Use gozinto theory and bills-of-materials to convert product installed base by age into parts’ installed base by age [https://accendoreliability.com/gozinto-theory-parts-installed-base/#more-417514].

Do the best you can with the data you have. Fred Schenkelberg wrote, “I said it before, only gather the data you need,” [https://insights.cermacademy.com/398-environmental-and-use-manual-fred-schenkelberg/].

Unknown Knowns? Did You Check Reliability Model Assumptions?

Did you assume constant failure rate and exponential reliability, Weibull, normal distribution, statistical independence, stationarity? What if you assume a reliability function and it’s not? Are your assumptions justified by physics or other structural evidence? Nonparametric estimates require no distribution assumptions and preserve all evidence in field reliability data.

Known Unknowns? Nonparametric Estimates Require Fewer Assumptions!

Nonparametric estimators from grouped, censored lifetime data require no unwarranted assumptions except independence and identically distributed ages at failures or censoring times (Kaplan-Meier, Nelson-Aalen). Those nonparametric estimators have known variance whether from sample or population data (Greenwood).

Nonparametric estimators from population ships and returns counts require an assumption, that ships counts are a nonstationary Poisson process [George]. The model for ships and returns counts is an M(t)/G/infinity self-service system. M(t) represents ships as nonstationary Poisson inputs, and G stands for the-service time distribution (1-reliability function). Output of an M(t)/G/infinity system is also a Poisson process [Mirasol]. That nonstationary Poisson input assumption of independent< Poisson increments is untestable. It’s impossible to test hypotheses with samples of size 1 (Poisson period ships). Nonparametric reliability or failure rate function estimates from population periodic ships and returns counts also have known variance and covariance.

Which would you prefer: lifetime data (grouped as in Table 1) or population ships and returns counts from data required by GAAP? It costs $$$ and time to track individual units by name or serial number to get lifetime data, even a sample.

Table 1. [http://www.weibull.com/hotwire/issue119/relbasics119.htm] Nevada table of monthly ships (2^ndcolumn) and grouped returns counts for input to Kaplan-Meier reliability function estimator. (It looks like Nevada turned on its side.)

Week	Ships	Mar-10	Apr-10	May-10	Jun-10	Jul-10
Mar-10	1623	0	1	3	5	7
Apr-10	3723		0	2	7	11
May-10	1319			0	0	3
Jun-10	3600				0	2
Jul-10	3298					0
Etc.
Returns		0	1	5	12	23

The ships column and returns in the bottom row of table 1 contain statistically sufficient data to make nonparametric reliability function estimates. Figure 1 shows the maximum likelihood Kaplan-Meier and the maximum likelihood reliability estimates from ships and returns counts. They don’t agree, because there is more information in the grouped returns counts in the body of table 1 than in the bottom row. Is more information worth the cost of identifying every failure, survivor, or return by serial number, month of shipment, and month of failure?

Figure 1. Nonparametric reliability function estimates: Kaplan-Meier from the grouped returns counts in the table body (KMAll) and the maximum likelihood (MLEAll) from ships and returns counts.

Estimator variance (or standard deviation) is a measure of precision. Greenwoood’s variance estimates are for the Kaplan-Meier reliability estimator [Wellner (Greenwood)]. They are equivalent to the Cramer-Rao lower bounds on the estimator variance-covariance matrix, which are minus the inverse of (the expected value of) the second derivative matrix of the log likelihood function (aka Fisher Information matrix) [https://en.wikipedia.org/wiki/Cram%C3%A9r%E2%80%93Rao_bound]. Maximum likelihood estimators achieve the Cramer-Rao lower bound asymptotically, under some conditions.

The Cramer-Rao formula is used to estimate lower bounds on the variance-covariance matrix of nonparametric reliability estimates from ships and returns counts in table 2. Please refer to “Random-Tandem Queues…” chapter 5 for more comparisons [George 2019]. I also need to estimate the variance-covariance matrix of actuarial failure rates, because the variance of an actuarial forecast, SUM[a(s)]n(t-s), s=1,2,…,t], depends on the covariances of actuarial rate estimates, VAR[actuarial forecast]= SUM[VAR[a(s)]n(t-s)^2, s=1,2,…,t]+2*SUM[COVAR[a(s),a(t)]n(t-s)n(t), s<t=1,2,…,t]. The Cramer-Rao covariances of the Kaplan-Meier reliability function estimates are asymptotically zero but not the covariances of actuarial rate estimates from grouped life data (Kaplan-Meier) or from ships and returns counts, because the actuarial rates are a(s)=p(s)/(1-R(s)) s=1,2,…,k where p(s) and R(s) are estimates of the probability density and reliability functions.

Table 2 contains the reliability estimates in figure 1 and their standard deviations. The reliability estimates from grouped life data (Kaplan-Meier) and maximum likelihood estimates for ships and returns counts are pretty close (figure 1). For ships and returns counts, I assumed Poisson ships at constant rate equal to average of ships in first three periods. (That’s also the maximum likelihood estimator.). I will compute more of the Cramer-Rao Fisher matrixes if you want. (Mathematica computes the Fisher information matrix for probability estimates g(1), g(2), and g(3), for reliability estimates R(1)=1-g(1), R(2)=1-g(1)-g(2), and R)3)=1-g(1)-g(2)-g(3). So the reliability estimates’ variances are Var(R(1))=Var(g(1)), Var(R(2))=Var(g(1))+Var(g(2))+2Cov(g(1),g(2)), etc. The covariances were small.

The reliability estimates from grouped life data (Kaplan-Meier) and maximum likelihood estimates from ships and returns counts are pretty close (figure 1). The (Greenwood) standard deviations of the Kaplan-Meier estimators are less than the standard deviations of the maximum likelihood estimators from ships and returns counts, because grouped lifetime data contains more information than ships and returns counts. Is the additional information worth its cost? What if you don’t have grouped lifetime data in Nevada table? GAAP requires information containing population ships and returns counts.

Table 2. Alternative Kaplan-Meier reliability estimates and their standard deviations from ships and grouped lifetimes vs. maximum likelihood reliability estimates from ships and returns counts. The Fisher matrix standard deviations

Age, Months	Kaplan-Meier reliability	Greenwood std. Deviation	Max. likelihood reliability	Fisher matrix std. Deviation
1	1	0	1	0.000266
2	0.9996	0.00012	0.9997	0.000376
3	0.9977	0.00033	0.9965	0.001592
4	0.9946	0.00054	0.9965	TBD
5	0.9902	0.00076	0.9930	TBD
6	0.9847	0.00099	0.9780	TBD
7	0.9779	0.00128	0.9780	TBD
8	0.9701	0.00166	0.9780	TBD
9	0.9611	0.00209	0.9780	TBD
Entropy	0.2473		0.1748

Some people compare the entropy (Shannon information) in the alternative estimates [Bjorkman, Du et al.]. Entropy measures the information contained in a reliability function. Myron Tribus [https://en.wikipedia.org/wiki/Myron_Tribus/] advocated maximum-entropy reliability estimation after teaching thermodynamics and entropy to UCLA engineers like me. “The principle of MaxEnt states that the most probable distribution function is the one that maximizes the entropy given testable information,” [Du et al.]. (I believe the authors mean “testable information” to represent assumptions about the failure rate functions such as constant, bathtub, increasing, etc.) Maximum entropy is “a methodology to determine test value by estimating the amount of uncertainty reduction a particular test is expected to provide using Shannon’s Information Entropy as a basis for the estimate,” [Bjorkman] Entropy also measures the information in alternative data: grouped returns counts (Nevada table), sample or population, vs. population ships and returns counts.

The entropies at the bottom of table 2 are the sums of g(j)ln(g(j)), where g(j) is the discrete probability density function corresponding to the Kaplan-Meier or maximum likelihood reliability function estimates in table 2. There is more entropy (information) in the Kaplan-Meier estimate than in the maximum likelihood estimate. How much is the additional information worth, and how does additional information compare with variance reductions?

Empirical Comparison of Estimators and Their Uncertainty

Consider comparing estimators’ using broom chart input data. Broom chart input data is data from all periods as it accumulated: first period, first two periods,…, all but latest period, all periods. Comparing broom chart estimators’ standard deviation and entropy shows the effects of change with time and shows whether the reliabilities of later cohorts differ from earlier. The standard deviations of broom chart input data sets are not the same as the (square roots of) Cramer-Rao bounds, but they show the evolution of information as it accumulated. Figures 2 and 3 show the reliability function broom charts from Nevada table of grouped failure counts and from ships and returns counts.

Figure 2. Broom chart of Kaplan-Meier reliability estimates from successively earlier data sets; i.e. smaller Nevada tables with fewer time periods.

Figure 3. Broom chart of nonparametric maximum likelihood reliability estimates from successively earlier ships and returns counts data sets. Longer lines are from larger, more complete data sets, including more recent data.

Table 3. Standard deviations of reliability estimates from broom chart data sets. Estimates disagree, with more entropy (information) in the maximum likelihood estimates for some of the older ages.

Age, months	Kaplan-Meier	Max. Likelihood
1	0	0
2	7.7E-05	1.2E-16
3	6E-05	0.00012
4	3E-05	1.2E-16
5	8.4E-05	0
6	0.00011	0.00354
7	0.00012	0.00634
8	0.00504	0.00015
9	0.00836	0

Table 4. Broom chart reliability entropy estimates from broom chart data sets. Entropy decreases as data sets become smaller and earlier. The MLE entropy estimates don’t decrease as much the K-M entropy.

	All	All-last	All-2	Etc.				First 2
K-M entropy	0.2954	0.2869	0.2246	0.1415	0.10134	0.0665	0.0384	0.0182
MLE entropy	0.1748	0.10334	0.1042	0.1546	0.1231	0.0404	0.0205	0.0221

Uncertainty Quantification Includes Extrapolation

Dick Mensing suggested that I make nonparametric reliability and actuarial failure rate estimates by least squares from ships and returns counts. I.e., minimize the sums of squared differences between observed returns and actuarial hindcasts of returns, SUM[a(s)n(t-s), s=1,2,…,t], where a(s) is actuarial failure rate at age s and n(t-s) is the installed base of age t-s. See “Random-Tandem Queues…” for comparison of maximum likelihood and least squares estimators without life data [George 2018]. Least-squares estimates have asymptotic properties similar to Cramer-Rao bounds on variance: Gauss-Markov theorem, martingale central limit theorem, etc.

So I made a nonparametric least squares reliability estimates from the same ships and returns counts data as in table 1. Its reliability estimate (R(t) lse) is close to the Kaplan-Meier estimate. Its entropy is more than the entropy of the maximum likelihood Kaplan-Meier estimate and entropy from ships and returns counts 0.258 (LSE) vs. 0.247 (Kaplan-Meier). Figure 4 shows all three estimators. I use least-squares estimators for renewal processes from periodic ships and replacements, without knowledge of prior replacement counts [https://accendoreliability.com/renewal-process-estimation-without-life-data/, George 2002]

Figure 4. Nonparametric reliability function estimates.

I use regression to extrapolate least-squares actuarial rate estimates, because regression also uses least squares. The uncertainties in regression extrapolations are known: error bands, tolerance limits, prediction limits [https://accendoreliability.com/podcast/sor/sor-807-confidence-and-tolerance-intervals/#more-500434].

Uncertainty in Extrapolations

What if you need to extrapolate estimates to ages beyond the oldest failure in your sample or population? You need to extrapolate actuarial failure rates to quantify the uncertainty in actuarial failure forecasts, for stocking spares and planning maintenance to meet service level or backorder requirements and to plan for retirement or obsolescence.

A few years ago, there were publications about estimating actuarial rates for really old people [https://slidetodoc.com/mortality-trajectories-at-very-old-ages-actuarial-implications-2/]. It’s not easy to find samples, because really old people are scarce. Population ships and returns counts help. When I worked in the automotive aftermarket, parts’ dealers were delighted to have age-based parts’ forecasts for old Volkswagens still in circulation; they were based on population reliability statistics from vehicle installed base by age and monthly parts’ sales.

Reliability and failure rate function estimators quantify uncertainty up to the age(s) of the oldest failures in the sample or population. What is the uncertainty in extrapolation? What is the uncertainty in functions of the estimators such as actuarial forecasts SUM[a(s)n(t-s), s=1,2,…,t+1, t+2,…] beyond age of the oldest failure?

Extrapolate actuarial rates with Excel LINEST() regression and FORECAST.ETS.CONFINT() confidence interval. LINEST() produces a measure “R-squared,” “The coefficient of determination. Compares estimated and actual y-values.” It is the squared correlation between observed and predicted actuarial rates. I computed it for all subsets of estimated actuarial rates from all, all minus first,…,most recent,, and the largest R-squared was for all. FORECAST.ETS.CONFINT() gave an 95% upper confidence limit (UCL) for one-period extrapolation a(11) and the actuarial forecast UCL of 178 on returns in period 11, assuming ships of 3766 in period 11.

Table 5. Least-squares estimates and 95% UCL extrapolation of actuarial rate a(11) and forecast of expected returns in future period 11

Period	Ships	Returns	a(s)	E[Returns]
1	1623	0	8.1E-05	0.13
2	3723	1	0.00027	0.74
3	1319	5	0.0021	4.52
4	3600	12	0.00314	13.57
5	3298	23	0.00394	22.09
6	1333	38	0.00625	37.51
7	1584	55	0.00533	55.8367
8	4508	72	0.00943	71.52
9	4463	96	0.00575	96.05
10	3176	126	0.0192	125.97
11	3766	?	0.00619	180

If you want to forecast returns more than one period ahead, then you need simultaneous confidence bands on the forecast actuarial rates [Hall and Wellner] and future ships. SAS PROC LIFETEST computes confidence bands on Kaplan-Meier reliability function estimates but not on the maximum likelihood estimates from population ships and returns counts, because SAS doesn’t know how to estimate reliability from ships and returns counts.

Recommendations?

Preserve all relevant information in field reliability data. Use it. Test assumptions. Quantify uncertainty as well as you can. There are lots of ways to express uncertainty. I prefer data-based and physics based statistical uncertainty. However, some field reliability uncertainty is human-based, so take human factors into account, quantitively if possible. If you use subjective opinion surveys use https://en.wikipedia.org/wiki/Dempster%E2%80%93Shafer_theory or expert questionnaires and salt the surveys with known answers {George and Mensing].

If you would like the workbook used for preparing this article or if you want the Mathematica workbooks for computing the Fisher information matrixes and standard deviations for maximum likelihood estimators and actuarial forecasts from ships and returns counts, ask pstlarry@yahoo.com.

References

Eileen A. Bjorkman “Test and Evaluation Resource Allocation Using Uncertainty Reduction as a Measure of Test Value,” PhD thesis, George Washington Univ., August 2012

Yi-Mu Du, Yu-Han Ma, Yuan-Fa Wei, and Xuefei Guan “Maximum Entropy Approach to Reliability,” Physical Review E101(1), DOI:10.1103/PhysRevE.101.012106, January 2020

L. L. George and A. Agrawal “Estimation of a Hidden Service Time Distribution for an M/G/Infinity Service System,” Nav. Res. Log. Quart., vol. 20, no. 3, pp. 549-555 , 1973

L. L. George “Renewal Estimation Without Renewal Counts,” INFORMS conference, San Jose, CA, Nov. 2002

L. L. George “Random-Tandem Queues and Reliability Estimation, Without Life Data,” https://drive.google.com/file/d/1Ua8qJ5XeTMbsXKQ2CskGL-MMgslO9OWu/view, Dec. 2019

L. L. George and Richard W. Mensing “Using Subjective Percentiles and Test Data for Estimating Fragility Functions,” Proceedings, Department of Energy Statistical Symposium, Berkeley, CA, Oct. 1980

W. J. Hall and Jon A. Wellner “Confidence Bands for a Survival Curve from Censored Data.” Biometrika, vol. 67, no. 1, pp. 133–43, https://doi.org/10.2307/2335326. Accessed 22 Nov. 2022, 1980

Noel M. Mirasol, “The Output of an M/G/Infinity, Queuing System Is Poisson,” Operations Research, Vol. 11, No. 2, pp. 282-284, Mar.-Apr. 1963

Partric Oscarsson and Örjan Hallberg “ERIVIEW 2000 – A Tool for the Analysis of Field Statistics,” Ericsson Telecom AB, http://home.swipnet.se/~w-78067/ERI2000.PDF, 2000

Bernard Ostle and Richard W. Mensing, Statistics in Research, Iowa State Univ. Press, 1975

Jon A. Wellner “Notes on Greenwood’s Variance Estimator for the Kaplan-Meier Estimator,” https://sites.stat.washington.edu/jaw/COURSES/580s/582/HO/Greenwood.pdf/, 2010

Comments

Larry George says
December 21, 2022 at 12:22 PM
Thanks to Fred for rendering all the symbols and formulas in the article.
Some mistakes were unintentional:
“independent< Poisson increments" should have been "independent, Poisson increments"
"SUM[a(s)]n(t-s), s=1,2,…,t]" should have been "SUM[a(s)*n(t-s), s=1,2,…,t]" two places.
Sorry. MEGO
If you want to original article *.docx, the Excel workbook, or the R-code used in this article, let pstlarry@yahoo.com know.