Semi-Nonparametric Reliability Estimation and Seasonal Forecasts

I estimated actuarial failure rates, made actuarial forecasts, and recommended stock levels for automotive aftermarket stores. I wondered how to account for seasonality in their sales. Time series forecasts account for seasonality but not for age, the force of mortality that actuarial forecasts account for. I finally figured out how to seasonally adjust actuarial forecasts. It’s the same method, David Cox’ “Proportional Hazards” model, used to make “Semi-Parametric” estimates and “Credible Reliability Predictions”.

Does AI require a lot of data to fit high order polynomials?

A coworker at Lawrence Livermore Lab recently wrote to me, “I found a description of how AI is being developed to make forecasts about health care, consumer buying trends, etc. It’s all based on inputting data to develop probability distribution functions. AI companies are starting to run out of data so they’re buying raw data from YouTube and other sites in order to “create” enough data for their probability distributions. Do you know anything about this? It sure sounds like that’s tailor-made for probability risk analysis and basic event probability development. Thoughts?”

I use periodic ships and returns counts, periodic case and death counts, or other beginning and end-event count data, by cohort. That data is sufficient to make nonparametric reliability function estimates, without lifetime data. Ships and returns counts are contained in or computed from data required by generally accepted accounting principles.

Actuarial rates, a(t)=-d ln(R(t))/dt, are derived from reliability function estimates, R(t). The actuarial demand forecast for auto parts is ∑[a(s)n(t-s)], s=0,1,…,t, where n(t-s) is the number of vehicles that entered service in period t-s, and so are of age s at time t, in the neighborhoods of auto parts stores. Actuarial forecasts account for reliability based on the ages of the vehicles that use the parts [George 2004]. I recommended stores’ spare parts stock levels for Triad Computer Systems. If your auto parts store didn’t have what you needed in the late 1990s, it could have been my fault.
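The forecast sum ∑[a(s)n(t-s)] can be sketched in a few lines. This is only an illustration: the rates and vehicle counts below are made up, the function name is hypothetical, and n(k) is read as the number of vehicles that entered service in period k (an assumption about the indexing).

```python
def actuarial_forecast(rates, entered, t):
    """Return the sum over s = 0..t of a(s) * n(t - s).

    rates[s]   -- actuarial (age-specific) failure rate a(s)
    entered[k] -- n(k), vehicles that entered service in period k (assumed meaning)
    """
    return sum(rates[s] * entered[t - s] for s in range(t + 1))


# Hypothetical illustration data, not taken from the article
rates = [0.00, 0.02, 0.05, 0.10]   # a(0), a(1), a(2), a(3)
entered = [100, 120, 90, 110]      # n(0), n(1), n(2), n(3)

forecast = actuarial_forecast(rates, entered, 3)
print(round(forecast, 2))
```

Each term pairs the failure rate at age s with the vehicles that are age s in period t, which is what distinguishes an actuarial forecast from a time series extrapolation of past demand.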

Even without accounting for seasonality, Triad’s actuarial forecasts and stock recommendations were so good that Triad was bought by a competitor who laid us off to save money. They contracted www.smartcorp.com to make time series forecasts, which may account for seasonality, but not wearout, the force of mortality or the ages of the vehicles.

Will AI Account for Seasonality?

I played with neural networks while working for Lawrence Livermore Lab (LLNL). I helped a chemist doing experiments on safing the W79 nuclear artillery shell. He had run several hundred test combinations without concluding what chemistry, quantity, and timing to use. I used an Ivakhnenko polynomial, www.gmdh.net, to model his test results and design subsequent experiments. I also used it for Marvin Windows to recommend sealant chemistry based on some tests [George 2016]. In both cases I used a FORTRAN program [Farlow] to fit the Ivakhnenko polynomial to the test results.

Neural network models are essentially high-order polynomials such as the Ivakhnenko polynomial. Neural network models are what AI programs use to fit data. My LLNL experience led me to found the Artificial Stupidity Research Institute (ASRI), which seeks and exposes examples of lazy approximate solutions to problems that could be solved mathematically, statistically, or deterministically based on physics, reality, and available data. ASRI also attempts to expose misguided or misleading standards that are stupid, lazy, or contain errors and faulty recommendations.

Epicor (www.epicor.com) is a successor to Triad Computer Systems, where I used reliability statistics to estimate auto parts’ actuarial (age-specific) failure rates, etc. Since I was laid off by the automotive aftermarket company, www.predii.com got a contract from Epicor to use AI on “unstructured” automotive data to derive “service insights”, without accounting for vehicle age or mileage. Neither time-series forecasts nor AI accounts for both age-specific failure rates and seasonality.

“Do the Best You Can With Available Data?” (Accendo Reliability) is a recent article attempting to counter the AI greed for data. People aren’t using available data required by generally accepted accounting principles to do reliability statistics. AI and its greed for data is an alternative to statistics and work! AI could account for seasonality, if you program it. Here is an easier way.

Figure 1. Monthly frequency of four Nigerian refinery pumps, 2008-2015

Seasonally Adjusted Actuarial Forecasts?

It’s the same method used to make a “Credible Reliability Prediction” [George 2023]. Seasonal actuarial (age-specific) rate estimation uses Cox’ PH model on season and month shipped to estimate actuarial failure rate functions, a(t; Z(cohort, season)) = ao(t)Exp[βZ]. Vector Z = {Z1, Z2} contains the cohort period and the forecast season, and vector β contains the regression coefficients.

David Cox’ proportional hazards model assumes failure rate functions for related products’ or parts’ lifetimes could be proportional with factor Exp[βZ] where Z accounts for product or part differences: β is a regression coefficient and ao(t) is an underlying failure rate function.

I used Cox’ proportional hazards model to make credible reliability predictions from previous generations of parts, their failure rate functions, and the ratios of MTBF(new)/MTBF(old). This is because generations of parts have proportional failure rate functions, and the only knowledge available during design of a new product is the failure rate functions of older generations of similar parts and MTBF(new) based on MIL-HDBK-217 or another MTBF prediction [George 2023].

What are Z variables and Proportionality Coefficients?

Suppose you have data in the form of a Nevada table (table 1): periodic cohort sales, ships, or installed base counts, and failure counts, grouped by cohort. The Kaplan-Meier nonparametric reliability function estimate is the maximum likelihood estimator using failure counts grouped by age, regardless of which cohort the failure came from [Kaplan and Meier]. The K-M estimator doesn’t account for cohort variations or seasonality of failure counts.

Table 1. Nevada table of ships and grouped failure counts by cohort: 18 months.

Cohort	Ships	Period 1	Period 2	Period 3	Period 4	Period 5	Period 6	Etc.
1	47	1	3	7	8	13	5
2	41		4	3	4	7	6
3	45			2	4	9	6
4	39				1	6	4
5	43					2	6
6	41						0
Etc.
Sums	711	1	7	12	17	37	28
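As a rough illustration of pooling failure counts by age regardless of cohort, the sketch below computes age-specific rates and a reliability function from only the six cohorts and six periods printed in Table 1. Because the full table covers 18 months, these numbers will not match the published Kaplan-Meier estimates. The function names are illustrative, and treating the period in which a cohort ships as its age 1 is an assumption.

```python
ships = [47, 41, 45, 39, 43, 41]
failures = [  # rows: cohorts 1..6; columns: periods 1..6 (0 before a cohort ships)
    [1, 3, 7, 8, 13, 5],
    [0, 4, 3, 4, 7, 6],
    [0, 0, 2, 4, 9, 6],
    [0, 0, 0, 1, 6, 4],
    [0, 0, 0, 0, 2, 6],
    [0, 0, 0, 0, 0, 0],
]


def pooled_rates(ships, failures):
    """Age-specific failure rates: pooled failures at each age / pooled units at risk."""
    rates = []
    for age in range(len(failures[0])):      # 0-based age index (age 1 is index 0)
        failed = at_risk = 0
        for cohort, row in enumerate(failures):
            period = cohort + age            # 0-based period when this cohort reaches this age
            if period >= len(row):
                continue                     # this age is not yet observed for this cohort
            at_risk += ships[cohort] - sum(row[cohort:period])
            failed += row[period]
        rates.append(failed / at_risk)
    return rates


def reliability(rates):
    """Reliability R(t) as the running product of (1 - a(s)) for s = 1..t."""
    survival, curve = 1.0, []
    for rate in rates:
        survival *= 1.0 - rate
        curve.append(survival)
    return curve


rates = pooled_rates(ships, failures)
curve = reliability(rates)
print([round(a, 4) for a in rates])
print([round(r, 4) for r in curve])
```

The pooling step is where cohort identity is discarded: every unit at a given age counts the same, whichever month it shipped.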

Table 2. Kaplan-Meier reliability R(t) and failure rate function a(t) estimates, t = 1,2,…,18

Age, t	K-M R(t)	K-M a(t)
1	0.9648	0.0352
2	0.8556	0.1132
3	0.6927	0.1905
4	0.5295	0.2356
5	0.3524	0.3344
6	0.2140	0.3927
Etc.
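The a(t) column of Table 2 is consistent, to within rounding, with the discrete-time rate a(t) = 1 - R(t)/R(t-1) with R(0) = 1, the discrete counterpart of a(t) = -d ln(R(t))/dt. The short sketch below recomputes it from the six R(t) values shown; the function name is illustrative.

```python
km_reliability = [0.9648, 0.8556, 0.6927, 0.5295, 0.3524, 0.2140]  # Table 2, ages 1..6


def actuarial_rates(reliability):
    """Discrete actuarial rates a(t) = 1 - R(t)/R(t-1), taking R(0) = 1."""
    rates = []
    previous = 1.0
    for current in reliability:
        rates.append(1.0 - current / previous)
        previous = current
    return rates


rates = actuarial_rates(km_reliability)
for age, rate in enumerate(rates, start=1):
    print(age, round(rate, 4))
```

Small differences in the fourth decimal place come from starting with R(t) values that are already rounded.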

Why not make the maximum likelihood failure rate function estimates from each cohort’s failures? Table 3 and figure 2 show variation around the K-M estimator that may be due to cohort and season. Figure 2 shows cohort failure rate functions are proportional to each other and to the Kaplan-Meier estimator.

Table 3. Cohort failure rate function estimates derived from each cohort row of table 1.

Period/Age	1	2	3	4	5	6	Etc.
1	0.036	0.062	0.173	0.177	0.372	0.344	Etc.
2	0.073	0.063	0.136	0.235	0.308	0.294
3	0.040	0.097	0.200	0.231	0.292	0.294
4	0.026	0.297	0.156	0.250	0.333	0.214
5	0.027	0.293	0.171	0.207	0.348	0.333
6	0.022	0.085	0.163	0.222	0.286	0.350
Etc.

Figure 2. Broom chart of cohort reliability function estimates. Fat orange line is Kaplan-Meier estimate.

Derive the proportional hazards model for cohort actuarial rate function estimates, a(t; Z(cohort), Z(season)) = ao(t)Exp[Z1*β1+Z2*β2]. Z(cohort)=Z1 is the cohort age, date, or period number, and Z(season)=Z2 is the season index number. Z1 accounts for cohort variations that may not be due to season (table 4). Use least squares to minimize the squared differences between the observed cohort failure counts and the actuarial estimates, as functions of ao(t), β1, and β2. If there is periodicity among the cohort failure rate functions, β1 will be non-negligible in magnitude. If β2 is non-negligible, you could combine cohort data within seasons for estimating cohort actuarial rates; if you find failure seasonality and have more than one year’s data, you could combine data from the same seasons.

Notice that the cohorts in the first column have Z1 values 1,2,3,1,2,3,…, because there is apparent quarterly repetition. The magnitude of β1 indicates quarterly periodicity. Annual seasonality in failures is apparent because of the magnitude of β2. According to time series FORECAST.ETS, there is no apparent seasonality in period failures, the bottom row of table 1.
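A minimal sketch of the least squares objective might look like the following. It is not the article's spreadsheet. It assumes the expected failure count in each cell is (units at risk) × ao(age) × Exp[β1*Z1+β2*Z2], with Z1 cycling 1,2,3 by cohort and Z2 cycling 1,…,12 by calendar period, and it uses only the six cohorts printed in Table 1. The function name ph_sse is hypothetical.

```python
import math

ships = [47, 41, 45, 39, 43, 41]
failures = [  # rows: cohorts 1..6; columns: periods 1..6 (0 before a cohort ships)
    [1, 3, 7, 8, 13, 5],
    [0, 4, 3, 4, 7, 6],
    [0, 0, 2, 4, 9, 6],
    [0, 0, 0, 1, 6, 4],
    [0, 0, 0, 0, 2, 6],
    [0, 0, 0, 0, 0, 0],
]


def ph_sse(baseline, beta1, beta2):
    """Sum of squared errors between observed and model failure counts.

    baseline[k] is ao(t) at age k + 1; beta1 and beta2 are the coefficients.
    """
    sse = 0.0
    for cohort, row in enumerate(failures):
        at_risk = ships[cohort]
        z1 = cohort % 3 + 1                  # assumed cohort index: 1, 2, 3, 1, 2, 3, ...
        for period in range(cohort, len(row)):
            age = period - cohort            # 0-based age index
            z2 = period % 12 + 1             # assumed season index of the calendar period
            expected = at_risk * baseline[age] * math.exp(beta1 * z1 + beta2 * z2)
            sse += (row[period] - expected) ** 2
            at_risk -= row[period]           # failed units leave the risk set
    return sse


# Example evaluation: Kaplan-Meier rates as baseline, no cohort or season effect
km_rates = [0.0352, 0.1132, 0.1905, 0.2356, 0.3344, 0.3927]
print(round(ph_sse(km_rates, 0.0, 0.0), 1))
```

Any numerical optimizer can then search over the baseline values and the two coefficients for the smallest SSE. With both coefficients at zero the model reduces to a single failure rate function shared by all cohorts.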

Table 4. Least squares estimates of proportional hazards model: ao(t), and coefficients β1 and β2. Seasonality period is 1,2,…,12.

β1	0.3140		SSE->	284
β2	0.0234
Cohort	ao(t)	Period1	2	3	4	5	6	Etc.
1	0.022	0.202	7.321	0.298	0.084	0.296	0.441
2	0.054	3.357	0.800	0.401	3.272	1.879	2.051
3	0.065	0.146	3.902	3.935	3.585	0.358	1.686
1	0.150	0.200	0.310	0.355	0.488	0.464	0.240
2	0.157	20.903	1.181	0.325	0.048	0.158	2.174
3	0.133	0.066	1.209	1.547	5.124	0.598	0.774
Etc.	0.268	0.061	1.928	1.637	2.044	0.770	0.236

Figure 3. Actuarial failure rate estimates: npmle is from ships and returns counts, K-M is from Kaplan-Meier reliability, and ao(t)Exp(ZBeta) is adjusted for season and cohort

Given seasonal actuarial rate function estimates, including cohort periodicity, make period 19 actuarial forecasts, specific to a season, using the cohorts’ actuarial rate estimates, with Z2 corresponding to season 7 = 19 mod 12.
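To make the seasonal adjustment concrete, the sketch below computes the proportionality factor Exp[Z1*β1+Z2*β2] for period 19 from the Table 4 coefficients and applies it to a baseline. The index formulas and function names are assumptions, and reading the first six entries of Table 4's ao(t) column as ages 1 to 6 is also an assumption made only for illustration.

```python
import math

BETA1, BETA2 = 0.3140, 0.0234      # least squares estimates from Table 4


def ph_multiplier(z1, z2):
    """Proportional hazards factor Exp[Z1*beta1 + Z2*beta2]."""
    return math.exp(BETA1 * z1 + BETA2 * z2)


def adjusted_rates(baseline, z1, z2):
    """Scale every baseline rate ao(t) by the same factor."""
    factor = ph_multiplier(z1, z2)
    return [rate * factor for rate in baseline]


period = 19
z1 = (period - 1) % 3 + 1          # 1, the quarterly index (19 mod 3)
z2 = (period - 1) % 12 + 1         # 7, the season index (19 mod 12)
multiplier = ph_multiplier(z1, z2)

# First six ao(t) values as listed in Table 4 (assumed to be ages 1..6)
baseline = [0.022, 0.054, 0.065, 0.150, 0.157, 0.133]
print(z1, z2, round(multiplier, 4))
print([round(a, 4) for a in adjusted_rates(baseline, z1, z2)])
```

Because the factor does not depend on age, the adjusted failure rate function keeps the shape of ao(t) and only shifts its level, which is the proportionality the model assumes.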

Table 4 shows observed period failure count sums vs. Excel’s FORECAST.ETS time series hindcasts and actuarial Kaplan-Meier hindcasts. Excel’s FORECAST.ETS includes seasonality, if detected. (ETS stands for Exponential Triple Smoothing.) Kaplan-Meier actuarial hindcasts (forecasts of past failures), “K-M”, seem OK: SSE=285. Notice the table is folded to show all 19 periods.
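One way to form an actuarial hindcast, offered here as an assumption rather than the article's exact calculation, is to multiply each cohort's surviving units by the Kaplan-Meier rate for its current age and sum over cohorts. Using the six cohorts of Table 1 and the rounded rates of Table 2, this lands within about 0.3 of the K-M hindcasts for periods 1 to 6.

```python
ships = [47, 41, 45, 39, 43, 41]
failures = [  # rows: cohorts 1..6; columns: periods 1..6 (0 before a cohort ships)
    [1, 3, 7, 8, 13, 5],
    [0, 4, 3, 4, 7, 6],
    [0, 0, 2, 4, 9, 6],
    [0, 0, 0, 1, 6, 4],
    [0, 0, 0, 0, 2, 6],
    [0, 0, 0, 0, 0, 0],
]
km_rates = [0.0352, 0.1132, 0.1905, 0.2356, 0.3344, 0.3927]  # Table 2, ages 1..6


def hindcast(period):
    """Expected failures in a 1-based period, summed over cohorts shipped so far."""
    total = 0.0
    for cohort in range(period):             # 0-based cohorts shipped by this period
        age = period - cohort                # 1-based age of the cohort in this period
        survivors = ships[cohort] - sum(failures[cohort][:period - 1])
        total += survivors * km_rates[age - 1]
    return total


hindcasts = [hindcast(p) for p in range(1, 7)]
print([round(h, 1) for h in hindcasts])
```

Periods beyond 6 need the cohorts and ages that Table 1 abbreviates as “Etc.”, so they are not attempted here.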

Table 4. Observed grouped failure counts and hindcasts. Time series FORECAST.ETS hindcasts, “FCST.ETS”, lag: SSE=3334.

Period	1	2	3	4	5	6	7	8	9	10
Observed	1	7	12	17	37	28	33	40	44	47
FCST.ETS	8.79	10.45	12.11	13.77	15.43	17.09	18.75	20.41	22.07	23.72
K-M	1.6	6.6	14.0	21.2	29.7	33.1	38.8	41.6	43.0	43.6
Continued
Period	11	12	13	14	15	16	17	18	19
Observed	46	44	41	32	44	31	29	37	????
FCST.ETS	25.38	27.04	28.70	30.36	32.02	33.68	35.34	37.00	38.66
K-M	42.9	41.6	37.7	36.5	37.0	31.9	33.0	35.2	34.12
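The sums of squared errors can be checked directly from the rows of the hindcast table. In the sketch below the FCST.ETS row reproduces the quoted SSE of about 3334. The rounded K-M row gives roughly 266, somewhat below the quoted 285, possibly because the quoted figure comes from unrounded values or a different summation (that explanation is a guess).

```python
observed = [1, 7, 12, 17, 37, 28, 33, 40, 44, 47,
            46, 44, 41, 32, 44, 31, 29, 37]
fcst_ets = [8.79, 10.45, 12.11, 13.77, 15.43, 17.09, 18.75, 20.41, 22.07, 23.72,
            25.38, 27.04, 28.70, 30.36, 32.02, 33.68, 35.34, 37.00]
km = [1.6, 6.6, 14.0, 21.2, 29.7, 33.1, 38.8, 41.6, 43.0, 43.6,
      42.9, 41.6, 37.7, 36.5, 37.0, 31.9, 33.0, 35.2]


def sse(actual, predicted):
    """Sum of squared differences over periods 1..18."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


sse_ets = sse(observed, fcst_ets)
sse_km = sse(observed, km)
print(round(sse_ets, 1), round(sse_km, 1))
```

Either way, the time series hindcasts miss by an order of magnitude more than the actuarial ones, mostly in periods 5 through 13 where the linear trend lags the observed counts.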

Table 5 shows alternative period 19 forecasts. FCST.ETS extrapolates the period returns in the bottom row of table 1. Kaplan-Meier uses failure counts grouped by age from table 1 without regard to cohort. Cohort 7 actuarial rates are used to forecast period 19 returns, because 7 = 19 mod 12. The average of cohorts 1, 4, 7, and 10 averages out cohort effects. The semi-nonparametric forecast uses the cohort 1 and season 7 Z-values to incorporate seasonality.

Table 5. Forecasts for period 19 without and with seasonal adjustments Z1= 19 mod 3, Z2= 19 mod 12 = 7, or both. All except FCST.ETS are actuarial forecasts.

Method	Forecast
FCST.ETS	38.7
Kaplan-Meier	35.3
Cohort 7	39.31
Cohorts 1, 4, 7, 10 average	35.91
Seasoned Semi-NP	38.08

Conclusions? Recommendations?

Why hasn’t seasonal actuarial forecasting been done? I learned seasonal time series forecasts in school. Why don’t time series forecasts account for age-specific failure rates? Time series extrapolation is easier. Why not use the Kaplan-Meier estimator and ignore cohort variations? It’s in most statistical software along with the faulty Greenwood variance estimator [George 2024].

Want semi-nonparametric estimates and seasonal actuarial forecasts? Send grouped lifetime or failure count data in the form of a Nevada table, or send observed or censored lifetimes in the form of inputs to SAS, MiniTab, ReliaSoft, Weibull, JMP, or your favorite nonparametric estimation software (pstlarry@yahoo.com).

Did you notice that the semi-nonparametric estimation used cohort ships and failure counts? The Kaplan-Meier estimator uses failure counts grouped by age. It ignores cohort variability and gives different reliability estimates and actuarial forecasts. Why not use all the information in the data?

The seasonal actuarial forecasting method is for dead-forever lifetimes or grouped failure counts, not for periodic renewal counts that do not indicate numbers of preceding failures. I have an idea of how to deal with renewals, and I am looking for data to try it.

References

David Cox, Proportional hazards model – Wikipedia

Stanley J. Farlow, “A FORTRAN Program for the GMDH Algorithm,” Chapter 15 of Self-Organizing Methods in Modeling, CRC Press, 1984

Laurence L George, “Actuarial Forecasts for the Automotive Aftermarket,” SAE Transactions, Vol. 113, Section 5, J. Materials and Manufacturing, pp. 697-701, 2004

L. L. George, “What Were You Thinking? PH-GMDH Sealant Reliability Model,” Test Engineering and Management, Vol. 78, No. 5, Oct.-Nov., 2016

L. L. George, Credible Reliability Prediction, 2nd Edition, Accendo Reliability, 2023

L. L. George, “What if Ships Cohorts Were Random?” Weekly Update, Accendo Reliability, Jan. 2024

E. L. Kaplan and P. Meier, “Nonparametric Estimation from Incomplete Observations,” J. Amer. Statist. Assoc., Vol. 53, No. 282, pp. 457–481, doi:10.2307/2281868, JSTOR 2281868, 1958
