Bivariate Reliability Estimates from Survey Data

Isn’t it enough to estimate the age-specific field reliability functions for each of our products and their service parts? Of course we quantify uncertainties in estimates: sample uncertainties and population uncertainties due to changes or evolution. That’s information to forecast service requirements, recommend spares, optimize diagnostics, plan maintenance, warranty reserves, recalls, etc. What else could we possibly need or do?

My wife found that beautiful picture in Norwegian news. Reuters 3/17/2016 reported “there was a short in the distribution box which led to a Tesla Model S fire while it charged. But exactly why it happened, is not possible to determine, according to Tesla.”

It was an isolated incident where a Model S caught fire while it used a Supercharger. The cause was a short circuit in the distribution box in the car. Superchargers were turned off immediately when the short circuit was discovered. No one was injured in the fire. Our investigation confirmed that this was an isolated incident, but due to the damage to the car, we could not definitely identify the exact cause of the short circuit.

The damage to the car was a consequence of the fire department's decision to let the car burn down, according to Tesla. Tesla feels confident that the fire was a one-time event. Tesla vehicles have safely used Tesla Supercharger stations over 2.5 million times, in addition to the roughly 35 million successful charging sessions at large.

Tesla said that, as a precaution, it will nevertheless release a software update for the Model S within a few weeks to provide extra safety during charging. The update will include a diagnostic that prevents charging if there is a potential short circuit, it was reported.

Why Estimate Field Reliability?

Consumer bills of rights entitle consumers to product information, which I interpret to include reliability information. I have not had much luck getting product field reliability data from companies. Their reliability people may not realize that ships and returns counts are statistically sufficient to make nonparametric estimates of age-specific field reliability and failure rate functions. What else could be done?

http://www.pluginamerica.org surveyed electric car owners. I used their Tesla Model S battery, charger, and drive-unit field data to make nonparametric estimates of age-specific field reliability, including bivariate reliability, to dispel some reliability myths (constant failure rate, Weibull, and independence) and to demonstrate how bivariate data could be used to plan opportunistic maintenance, stock spares, recommend “Customers who bought ‘XXXX’ also bought ‘YYYY’”, plan repair kits, recognize common cause problems, do multivariate risk analyses, estimate warranty costs, and ???.

Cohort reliabilities differ and are not necessarily improving, so test hypotheses and quantify dependence (partA, partB) and (TTFF, TBF1, TBF2,..).

Why Estimate Bivariate Reliability?

Walter Shewhart’s rule #1 (paraphrased): “Original data should be presented in a way that will preserve the evidence in the original data for all the predictions assumed to be useful.” Doesn’t that include dependence? For example, warranty cost uncertainty depends on dependence; e.g., Var[SUM(W(i))] is not equal to SUM(Var[W(i)]) unless the W(i) are uncorrelated, where W(.) represents warranty cost for parts i and j or for the same part at different ages.

Why not advance reliability analyses into nonparametric bivariate statistics, without unwarranted assumptions and without life data, and with dependence and censored data?

Data are a Survey Sample

http://www.pluginamerica.org/surveys/batteries/model-s/results.php listed 433 Tesla Model S responses dated from 5/14/2013 to 1/4/2016: 392 US, 26 Canada, 5 Norway, 4 Germany, 1 Belgium, 1 Switzerland. These fields were used: date_submit, date_acquired, batt_swapped, batt_swap_count date, charger_swapped, charger_swap_count, date_charger_swap, drive_unit_swap_count date_, drive_unit_swap odo_, drive_unit_swap date_, drive_unit_swap_1, dist_drive_unit_swap_1 unit_. http://insideevs.com/monthly-plug-in-sales-scorecard/ estimated monthly vehicle sales. In June 2016, I added another ~500 responses. It’s not a random sample, and it’s length-biased. Some replacements were themselves replaced, but no age at 1st swap was noted, except for drive units.

Table 1. Survey Response Rate January 2016

Year acquired	US Sales estimate	Sample	Response Rate
2012	2650	85	3.21%
2013	17650	254	1.44%
2014	17500	56	0.32%
2015	25700	39	0.15%
Total	63500	434	0.68%
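The response rates in Table 1 are simply sample counts divided by the sales estimates. A minimal sketch that recomputes them (the dictionary names are illustrative, not from the workbook) shows how sharply the rate falls for later cohorts, which is one reason the sample is not representative:

```python
# US sales estimates and survey sample counts by year acquired (Table 1).
sales = {2012: 2650, 2013: 17650, 2014: 17500, 2015: 25700}
sample = {2012: 85, 2013: 254, 2014: 56, 2015: 39}

# Response rate per cohort and overall.
rates = {year: sample[year] / sales[year] for year in sales}
total_rate = sum(sample.values()) / sum(sales.values())

for year, rate in rates.items():
    print(f"{year}: {rate:.2%}")
print(f"Total: {total_rate:.2%}")
```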

Methods

The Excel workbook contains Kaplan-Meier nonparametric maximum likelihood estimates (npmle) of reliability for battery, charger, and drive-unit times-to-failure (TTFs, aka “swaps”), and for the drive-unit times-to-first-failure (TTFFs) that the survey reported. TBF1 = time between 1st and 2nd swaps. Greenwood’s standard deviation of the reliability estimate is usable for a confidence limit that is valid at one age at a time. Weibull was fit to the npmle by least squares. I extrapolated failure rate functions to obtain nonparametric MTBF and standard deviation estimates. I estimated the bivariate npmle of the pdf of TTFF and TBF1 without and with dependence; TTFF is hidden in TTF if 2 swaps were reported. I made nonparametric bivariate reliability estimates of {TTFF, TBF1} (battery), {TTF(battery), TTF(charger)}, and {TTFF, TBF} (drive units).
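For readers without the workbook, the Kaplan-Meier estimate and Greenwood's standard deviation can be sketched in a few lines. This is a minimal illustration, not the workbook's formulas; the function name and the six sample ages are hypothetical.

```python
import math

def kaplan_meier(ages, failed):
    """Return [(age, reliability, greenwood_sd)] at each distinct failure age.

    ages   -- age at replacement, or age at survey date if not replaced
    failed -- True if the part was replaced, False if censored
    """
    pairs = list(zip(ages, failed))
    reliability, var_sum, rows = 1.0, 0.0, []
    for age in sorted({a for a, f in pairs if f}):
        at_risk = sum(1 for a, _ in pairs if a >= age)
        deaths = sum(1 for a, f in pairs if a == age and f)
        reliability *= 1.0 - deaths / at_risk
        if at_risk > deaths:
            # Greenwood: Var[R(t)] ~ R(t)^2 * sum d / (n * (n - d))
            var_sum += deaths / (at_risk * (at_risk - deaths))
        rows.append((age, reliability, reliability * math.sqrt(var_sum)))
    return rows

# Hypothetical ages in months; False marks a unit still running at the survey.
rows = kaplan_meier([3, 5, 5, 8, 12, 12], [True, False, True, False, True, False])
for age, rel, sd in rows:
    print(f"age {age}: R = {rel:.3f}, Greenwood sd = {sd:.3f}")
```

The standard deviation applies to one age at a time; it is not a simultaneous band over all ages.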

Spreadsheets in workbook are:

CSV: input to R for drive units’ bivariate distribution {TTFF, TBF}

Batt: Nevada table and BattReliability; Kaplan-Meier, Weibull, extrapolations, and inputs for bivariate distribution

Battnpmle and Battnpmle (2): independent and dependent bivariate npmle reliability estimates {TTFF, TBF1}

Charger (2), ChargerReliability, Drive and DriveReliability: K-M, Weibull…

Bivariate and Bivariate (2): nonparametric distribution {TTF(Battery), TTF(Charger)} , marginal extrapolations, MTBF, stdev, and correlation, conditional on TTFF and TBF1 < 40 months

Correl: correlations (PartA, PartB) conditional on multiple failures, jackknifed

Forecast: actuarial forecasts and warranty cost estimates

Figure 1 is a broom chart that shows battery Kaplan-Meier reliability (time-to-first-failure) estimates from annual cohorts. The chart shows that different cohorts (by year acquired) had different reliabilities. Notice the improvement? The shorter lines from later cohorts lie above the “All June 2016” line. Figure 1 also shows reliability estimates as of January 2016 (36 replacements) and June 2016 (39 replacements).

Figure 1. June 2016 battery TTFF Broom Chart: 39 replacements, 34 TTFFs, 2 each in 2014 and 2015

Figure 1 includes the Weibull reliability estimate fit to all data, the smooth curve that lies in the middle. Would Weibull be a reasonable fit to annual cohorts? Table 2 shows the Weibull parameter estimates from all data in January 2016 and later in June 2016. Scale parameter (alpha) and MTBF estimates are in months, listed as January/June. The January 2016 estimates didn’t seem to recognize infant mortality (beta < 1.0). Weibull infant mortality leads to absurd MTBF overestimates. This instability in Weibull parameter estimates makes me prefer nonparametric statistics.
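A least-squares Weibull fit to a reliability estimate can be sketched as below. The article does not say whether the fit was done on R(t) directly or on a linearized scale; this sketch assumes the common linearization ln(-ln R(t)) = beta ln(t) - beta ln(alpha), and the function name and test ages are hypothetical.

```python
import numpy as np

def fit_weibull_lsq(ages, reliability):
    """Least-squares fit of ln(-ln R) = beta*ln(t) - beta*ln(alpha)."""
    t = np.asarray(ages, float)
    r = np.asarray(reliability, float)
    keep = (t > 0) & (r > 0) & (r < 1)   # transform is undefined at R = 0 or 1
    x = np.log(t[keep])
    y = np.log(-np.log(r[keep]))
    beta, intercept = np.polyfit(x, y, 1)
    alpha = np.exp(-intercept / beta)
    return alpha, beta

# Check on points generated from a known Weibull (alpha = 261, beta = 1.16).
ages = np.array([6.0, 12.0, 24.0, 36.0])
rel = np.exp(-(ages / 261.0) ** 1.16)
alpha_hat, beta_hat = fit_weibull_lsq(ages, rel)
print(alpha_hat, beta_hat)
```

With only a few early-age points, small changes in the reliability estimate move alpha and beta a lot, which is consistent with the instability described above.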

Table 2. Weibull parameters estimated by least-squares fit to the figure 1 Kaplan-Meier nonparametric reliability estimates, by all and by annual cohorts.

Year cohort	All	2012	2013	2014&2015
Alpha	261/608	146/194	325/1103	246
Beta	1.16/0.903	1.21/1.07	1.16/0.83	0.82/0.67& 0.54
Weibull MTBF	247/608	137/189	308/1103	271/608
Nonparametric MTBF	72/185	145/136	56/223	170/374& 240
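The Weibull MTBF follows from the parameters as alpha times Gamma(1 + 1/beta). A short sketch (the function name is illustrative) checks the January 2016 “All” column and shows how the ratio MTBF/alpha grows when beta drops below 1:

```python
import math

def weibull_mtbf(alpha, beta):
    """Mean of a Weibull with scale alpha and shape beta."""
    return alpha * math.gamma(1.0 + 1.0 / beta)

jan_all = weibull_mtbf(261, 1.16)         # Table 2 lists 247 months
ratio_low_beta = weibull_mtbf(1.0, 0.54)  # MTBF/alpha when beta = 0.54

print(f"January 'All' Weibull MTBF: {jan_all:.0f} months")
print(f"MTBF/alpha at beta = 0.54: {ratio_low_beta:.2f}")
```

With beta below 1 the mean exceeds the scale parameter, and the scale parameter itself is extrapolated far beyond the roughly four years of observed ages.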

Bivariate Analyses

Estimate nonparametric bivariate distributions assuming TTFF and TBFs are dependent, to deal with some repeated battery and charger replacements and with pairs of different parts: battery, charger, or drive-unit times-to-failure (Bivariate spreadsheets) [Lin and Ying]. Estimate bivariate distributions of TTFF and TBFs for correlation and forecasting [R, survivalBIV]. “We expect the bivariate estimator to have good efficiency,…when censoring is light” [Lin and Ying], and vice versa. Compute correlation from bivariate distribution estimates, unconditionally. Bivariate graphs were prepared in Mathematica.
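As I understand the Lin and Ying (1993) estimator for two parts that share one censoring age (the survey date), the joint reliability at ages (s, t) is the fraction of pairs observed beyond both ages, divided by the Kaplan-Meier estimate of the probability of still being uncensored at max(s, t). The sketch below is an illustration under that reading, not the workbook's or survivalBIV's code; function names and data are hypothetical.

```python
import numpy as np

def km_step(times, events):
    """Kaplan-Meier survival; returns (event_times, survival_after_each)."""
    times = np.asarray(times, float)
    events = np.asarray(events, bool)
    event_times = np.unique(times[events])
    surv, s = [], 1.0
    for u in event_times:
        at_risk = np.sum(times >= u)
        d = np.sum((times == u) & events)
        s *= 1.0 - d / at_risk
        surv.append(s)
    return event_times, np.array(surv)

def bivariate_reliability(x, y, d1, d2, s, t):
    """Estimate P[T1 > s, T2 > t] under one common censoring age per pair.

    x, y   -- observed ages (failure or censoring) of part 1 and part 2
    d1, d2 -- True where the corresponding age is a failure
    """
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    both_failed = np.asarray(d1, bool) & np.asarray(d2, bool)
    # The censoring age is observed as max(x, y) unless both parts failed first.
    g_times, g_surv = km_step(np.maximum(x, y), ~both_failed)
    k = np.searchsorted(g_times, max(s, t), side="right")
    g = 1.0 if k == 0 else g_surv[k - 1]
    if g <= 0.0:
        return float("nan")   # no information left beyond this age
    return float(np.mean((x > s) & (y > t)) / g)

# Hypothetical pairs: no censoring, then one pair censored at age 4.
r_full = bivariate_reliability([5, 10, 15, 20], [8, 6, 30, 12],
                               [1, 1, 1, 1], [1, 1, 1, 1], 7, 7)
r_cens = bivariate_reliability([4, 10, 15, 20], [4, 6, 30, 12],
                               [0, 1, 1, 1], [0, 1, 1, 1], 7, 7)
print(r_full, r_cens)
```

Without censoring the estimate reduces to the empirical joint survival fraction. This form applies to parallel pairs such as {TTF(battery), TTF(charger)}; serial gap times such as {TTFF, TBF1} need the adjustments in the gap-time references.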

Figure 2. Battery TTFF (R1(t)) and TBF1 (R2(t)) reliability estimates assuming independence (Jan. 2016 computation). “K-M R(t)” is the Kaplan-Meier reliability estimate for both.

Figure 3. *Bivariate* Battery marginal reliability estimates assuming TTFF and TBF1 are *dependent.* [Lin and Ying] Correlation{TTFF, TBF1} = 30% conditional on TTFF and TBF1 < 40 months. TTF is TTFF+TBF1.

Figure 4. Bivariate probability density function of battery TTFF and TBF1.
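A correlation such as the one quoted for Figure 3 can be computed from a discretized bivariate density like the one in Figure 4. The sketch below assumes the density is available as a probability grid over the two ages; restricting the grid to a range (for example, both ages under 40 months) and renormalizing gives the conditional correlation. The grids here are made up for illustration.

```python
import numpy as np

def correlation_from_joint(x_vals, y_vals, prob):
    """Correlation of (X, Y) from a joint probability grid prob[i, j]."""
    x = np.asarray(x_vals, float)
    y = np.asarray(y_vals, float)
    p = np.asarray(prob, float)
    p = p / p.sum()                  # renormalize: conditions on the grid's range
    px, py = p.sum(axis=1), p.sum(axis=0)
    mx, my = px @ x, py @ y
    var_x = px @ (x - mx) ** 2
    var_y = py @ (y - my) ** 2
    cov = (x - mx) @ p @ (y - my)
    return cov / np.sqrt(var_x * var_y)

months = [6, 18, 30]                 # hypothetical grid midpoints
independent = np.outer([0.5, 0.3, 0.2], [0.6, 0.3, 0.1])
dependent = np.array([[0.40, 0.08, 0.02],
                      [0.08, 0.20, 0.02],
                      [0.02, 0.02, 0.16]])
rho_ind = correlation_from_joint(months, months, independent)
rho_dep = correlation_from_joint(months, months, dependent)
print(rho_ind, rho_dep)
```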

Dependence Matters

Dependence matters both between TTFF and TBF1 and between different parts’ times-to-failure (figure 6)! Figure 5 shows different battery reliability estimates for TTFF and TBF1:

P[TTFF > 36 months|independence] is 93.5%, and

P[TBF1 > 12 months|independence] = 80% versus

P[TTFF > 36 months|dependence] is 93.1%, and

P[TBF1 > 32 months|dependence] = 80%.

Figure 5. Battery reliability estimates assuming independence or dependence (“Bivar”)

Figure 6. Marginal reliability function estimates from bivariate distribution of battery and charger TTFFs

Figure 7. Bivariate reliability function estimate of battery and charger TTFF. X-axis is charger and Y-axis is battery; axes’ time units are days. Correlation is 85%.


How to use multivariate reliability?

Dependence could be actionable. Is there cause and effect? Could battery replacements be made independent of charger replacements, of drive-unit replacements? Could opportunistic maintenance help? (Opportunistic maintenance replaces other parts when replacing one part if other parts contribute to failure of the part that failed?)

10% of batteries were replaced within first 3 years (stdev is 1.9%). Isolate batteries from charger or drive-unit failures? ~$10-$12,000 cost?

6% of chargers were replaced within first 3 years (stdev is 1.2%). Isolate from battery or drive-unit replacements? $2300 cost?

16% of batteries OR chargers were replaced within first 2 years (stdev is ?)

26.5% of drive units had a first replacement within first 3 years (stdev is 2.5%). Retail cost ~$15,000 per replacement!

2014 and 2015 battery, charger and drive-unit replacements had infant mortality due to: process defects, eager service, opportunistic maintenance, shotgun diagnostics?

Recommendations

Compare survey and population reliability estimates. Account for multiple replacements using a generalized renewal process, the EM algorithm, and bivariate distributions. Incorporate dependence in warranty cost forecasts: Var[X+Y] = Var[X]+Var[Y]+2Cov[X,Y].
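To see how much the covariance term can matter, the sketch below compares the standard deviation of a two-part warranty cost under independence with the same quantity at a correlation of 0.85, the value reported for battery and charger in Figure 7. The cost standard deviations (3 and 4, in arbitrary units) are hypothetical.

```python
import math

def sd_of_sum(sd_x, sd_y, corr):
    """Standard deviation of X + Y: sqrt(Var[X] + Var[Y] + 2 Cov[X, Y])."""
    return math.sqrt(sd_x ** 2 + sd_y ** 2 + 2.0 * corr * sd_x * sd_y)

sd_independent = sd_of_sum(3.0, 4.0, 0.0)
sd_dependent = sd_of_sum(3.0, 4.0, 0.85)
print(f"independent: {sd_independent:.2f}, correlated: {sd_dependent:.2f}")
```

Ignoring positive correlation understates the spread of total warranty cost, and therefore understates the reserve needed for a given confidence level.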

Do what-if analyses. Evaluate fixes’ effects on warranty costs. Equate marginal returns for fixes of different parts. Plan optimal opportunistic maintenance for correlated parts.

I have not included all the derived field reliability information about battery, charger, and drive unit and all pairs. If you want more, ask pstlarry@yahoo.com, ask for the original presentation, or send population field data if you want information about your products and their service parts.

References

Lin, D. Y. and Zhiliang Ying, “A simple nonparametric estimator of a bivariate survival function under univariate censoring,” Biometrika (1993), 80, 3, pp. 573-81

Lin, D. Y. et al., “Nonparametric estimation of the gap time distributions for serial events with censored data,” Biometrika (1999), 86, 1, pp. 59-70

Moreira, Ana and Luis Meira-Machado, “survivalBIV: Estimation of the Bivariate Distribution Function for Sequentially Ordered Events Under Univariate Censoring,” J. Statistical Software, Vol. 46, No. 13, March 2012