Azmat Siddiqi suggested a certification in reliability statistics in 2022. Azmat believes in knowing and using the reliability statistical information in test, installed base, failures, and service data. Thanks Azmat.
I propose Certification in Reliability Statistics to recognize statistics knowledge, work experience, and applications. Certification in Reliability Statistics should provide assurance to employers, contractors, and collaborators that reliability statistics are estimated and used to the best extent with available data, including uncertainty quantification, with or without life data.
“Reliability is the probability that an item will perform a required function without failure [according to customers] under stated conditions [in the field, in the hands of customers, not in a lab] for a stated period of time [operating or calendar time depending on use of reliability information].” [O’Connor and Kleyner] [Brackets contain my comments.] Reliability is not MTBF, a KPI, availability, or any other single-valued number. It is a probability function of time or age. How can you do reliability engineering without estimates of the reliabilities of products and their parts?
What do Reliability Engineers Do?
A typical reliability job advertisement reads, “…is currently looking for a Reliability Engineer with a BSEE background.” The state of the art in reliability statistics education is pretty elementary. The Accreditation Board for Engineering and Technology (ABET) requires one undergraduate engineering statistics course.
The ASQ offers a Reliability Engineering Certification (CRE). The CRE exam covers material for elementary reliability statistics and applications. I complained the CRE exam was boring and repetitious. A person responsible for the exam content explained, “The CRE exam has to be pretty elementary so people could pass and receive the certificate. The exam attempts to assure a minimal level of qualification.” The Certificate in Reliability Statistics represents the maximal level of reliability performance.
Reliability engineering standards have become armchair exercises (ACM, AFR, Availability, DFR, FMEA, FMMEA, FMECA, FORM, FRACAS, FTA, LDA, PoF, RCM, RCA, RGA (Duane-AMSAA Crow), MTBF prediction, MCF, MTTR, RBD, etc.). Reliability statistics software is based on simplifying assumptions (FTA, MTBF prediction, availability, RBD simulation, Weibull,…). Reliability standards, guides, and computer programs do familiar reliability tasks and elementary statistical estimation and simulation.
Some reliability engineers are lucky enough to have field life data (a sample?) for their company’s products, typically right censored. This allows use of the Kaplan-Meier nonparametric reliability estimator. However, the commonly used Greenwood asymptotic variance estimator errs badly for finite samples. Lifetime data is nice, but can you justify tracking products and parts, or even a sample of them, by name or serial number from shipment to return or failure, from birth to death?
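For readers who do have right-censored lifetimes, the Kaplan-Meier estimate and Greenwood's variance approximation can be sketched in a few lines of Python. This is a minimal illustration under the usual textbook conventions, not production code; the function name kaplan_meier and the six sample ages are made up for the example.

```python
from collections import Counter

def kaplan_meier(times, failed):
    """Return [(age, reliability, greenwood_variance)] at each distinct failure age.

    times  -- observed ages (age at failure, or age at censoring)
    failed -- True if the unit failed at that age, False if right censored
    """
    deaths = Counter(t for t, f in zip(times, failed) if f)
    removed = Counter(times)      # failures plus censorings leaving the risk set
    at_risk = len(times)
    reliability, greenwood_sum = 1.0, 0.0
    result = []
    for age in sorted(removed):
        d = deaths.get(age, 0)
        if d > 0:
            reliability *= 1.0 - d / at_risk
            if at_risk > d:       # the Greenwood term is undefined when all at risk fail
                greenwood_sum += d / (at_risk * (at_risk - d))
            result.append((age, reliability, reliability ** 2 * greenwood_sum))
        at_risk -= removed[age]
    return result

# Hypothetical sample: six units, two of them right censored (ages 3 and 6)
ages = [2, 3, 3, 5, 6, 8]
is_failure = [True, True, False, True, False, True]
km = kaplan_meier(ages, is_failure)
for age, r, var in km:
    print(f"age {age}: R = {r:.4f}, Greenwood variance = {var:.5f}")
```

Greenwood's formula is an asymptotic approximation, so with only six units the variances printed here should be read with the caution expressed above.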
Lifetime data may be required by some government agencies: the Air Force tracks engines and major engine modules, the FAA used to require tracking approximately 75 commercial aircraft parts [Dickstein], and the FDA requires tracking implantable medical devices by serial number. I told the USAF AMC, FAA, and FDA that ships and returns counts are statistically sufficient for all products and service parts and are population data [George and Agrawal].
Traditionally, reliability engineers make MTBF estimates from reliability block diagrams, parts’ failure rates, and fudge factors representing physics of failures, environments, etc. [IEEE 1413.1 and many others]. This tradition has been extended to estimating Weibull reliability parameters, without life data, from parts’ MTBFs and their variances [Chen et al.].
What Do Certified Reliability Statisticians Do?
They use all available reliability data, including field data, what really happens, for the sake of credibility. They estimate and use reliability and failure rate functions, without unwarranted assumptions, with or without lifetime data. They preserve all relevant information in available data. [Walter Shewhart’s rule number one.] If products and parts are not tracked by serial number, they use ships and returns counts from data required by GAAP (Generally Accepted Accounting Principles). Ninety-one percent of poll respondents believe lifetime data is required, by tracking products or parts by name and serial number! [George, “Poll…” June 2023]
Population ships (cohorts, installed base by age) and returns (complaints, repairs, failures, inventory, spares sales, etc.) data is required by GAAP [Generally Accepted Accounting Principles]. Ships and returns counts are free, but you may have to work! Ships’ counts are in revenue = prices*sales. Returns’ counts are in service and warranty costs or spares sales. Convert product installed base into parts’ installed base by age using BoMs (Bills of Materials) and “gozinto” theory [George, 2022].
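The bookkeeping in the paragraph above can be sketched as follows. This is a simplified illustration that assumes a single-level BoM, a constant unit price, and no spares already in the field; it is not the full gozinto calculation of [George, 2022], and the product names, quantities, and function names are hypothetical.

```python
def ships_from_revenue(revenue, unit_price):
    """Recover a ships count from revenue = price * sales (assumes one constant price)."""
    return round(revenue / unit_price)

def part_ships_by_period(product_ships, units_per_product):
    """Convert product ships by period into one part's ships by period.

    product_ships     -- {product: [ships in period 0, 1, ...]}
    units_per_product -- {product: number of this part in each product}, from the BoM
    """
    periods = max(len(ships) for ships in product_ships.values())
    part_ships = [0] * periods
    for product, ships in product_ships.items():
        quantity = units_per_product.get(product, 0)
        for period, count in enumerate(ships):
            part_ships[period] += quantity * count
    return part_ships

# Hypothetical data: a seal used twice in pump_A and once in pump_B
product_ships = {"pump_A": [100, 120, 90], "pump_B": [50, 60, 70]}
seals_per_product = {"pump_A": 2, "pump_B": 1}
seal_ships = part_ships_by_period(product_ships, seals_per_product)
print(seal_ships)                       # [250, 300, 250]
print(ships_from_revenue(50000, 250))   # 200
```

At the end of period T, the cohort shipped in period s has age T - s, which gives the part's installed base by age before returns are subtracted.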
Reliability Engineering Statisticians:
- Collect test and field reliability data for all products and service parts and organize it into age-specific cohorts’ data. (Include the machines that produce the products or parts from your factories or vendors’ factories if appropriate.)
- Estimate field reliability and failure rate functions for all products and their service parts, without unwarranted assumptions, (with or without lifetime data) and quantify the (sample or population) uncertainty or variance in the estimates.
- Feed that information back to design, process, installation, training, maintenance, and accounting people in support of their activities. Help improve product and systems safety, reliability, inventories, diagnostics, maintenance, and management.
- Apply Statistical Reliability Process (SRP) control to field reliability, using broom charts and Kullback-Leibler divergence, as it evolves over time and product life cycle [George 2023].
- Make credible reliability predictions for new products and quantify reliability (not MTBF!) growth, based on population field data. Generations of products use the same or similar parts, are produced using the same processes, shipped and installed the same ways, and used in the same environments by the same customers. Why not use past reliability information?
- Justify the costs of alternative or supplementary data with bang per buck (E.g., sample of life data vs. free population ships and returns counts required by GAAP).
- Support service organizations and customers with reliability- and risk-based diagnostics.
- Forecast demands for service, spares, replacements, and quantify demand distributions induced by randomness and uncertainties.
- Quantify attrition and support end-of-life product plans.
- Seek other applications of reliability statistics.
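One plausible way to compute the Kullback-Leibler divergence mentioned in the process-control item above is to compare the failure-age distribution implied by the latest reliability estimate with the one implied by a baseline estimate. This is only a sketch: the direction of the divergence (latest relative to baseline), the treatment of survivors as one extra category, and the sample numbers are assumptions, not the procedure of [George 2023].

```python
import math

def failure_pmf(reliability):
    """Turn R(1), ..., R(T) into P(fail in age 1), ..., P(fail in age T), P(survive past T)."""
    pmf, previous = [], 1.0
    for r in reliability:
        pmf.append(previous - r)
        previous = r
    pmf.append(previous)          # probability of surviving beyond the last age
    return pmf

def kl_divergence(p, q, eps=1e-12):
    """Sum of p * log(p / q) over categories with p > 0; eps guards against q = 0."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical reliability estimates at ages 1, 2, 3 from two successive periods
baseline = failure_pmf([0.98, 0.95, 0.90])
latest = failure_pmf([0.97, 0.92, 0.85])
divergence = kl_divergence(latest, baseline)
print(f"KL(latest || baseline) = {divergence:.5f}")
```

A divergence near zero says the latest estimate agrees with the baseline; deciding how large a value signals a change needs a threshold, which this sketch does not supply.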
Generations of products use same or similar parts, are designed by the same designers, produced by the same processes, shipped and installed the same ways, sold to the same customers, and used in the same environments. Why not use information from the past too?
If your products or parts have warranties or maintenance, use nonparametric reliability function estimates! Popular statistical reliability functions do not accommodate glitches in failure rate functions. WEAP (Warranty Expiration Anticipation Phenomena) causes returns around warranty expiration, because of hoarding, improper sales, or sell-through and return times, as well as failures. PM (preventive maintenance) causes returns around maintenance times, because maintenance may detect or even cause failures. Attrition could cause decreasing failure rate as products are retired.
If necessary, estimate multivariate reliability functions to capture dependence among different parts’ failures or among different failure modes for a product or part. Kits are parts collected to deal with known product failure modes. People who buy part A may need part B too.
Where Does the Reliability Statistics Knowledge Come From?
Applied statistical technologies include but are not limited to: nonparametric estimates of parts’ and products’ reliability statistics by failure mode; credible reliability predictions; failure modes and effects reliability diagnostics; budget-constrained reliability apportionment; planning, operation and analysis of reliability testing based on field reliability; process control with broom charts; extracting ships and returns counts from data bases and converting them into parts’ installed base by age. All of this and more is needed to apply reliability statistics to failure analysis, design and performance improvement and reliability program management over the entire product life cycle.
Certification in Reliability Statistics is not available from ANSI, ISO, ASME, ASQ, SAE, IEEE, or other organizations’ standards or guides. This certification exceeds recognized national and international credentialing industry standards for a program’s development, implementation, and maintenance, because it incorporates reality in reliability statistics.
What are the Requirements for Certification in Reliability Statistics?
The requirement is to know and use nonparametric reliability estimates and uncertainty for all products and service parts, with or without life data and without unwarranted assumption. No exam is required, because so many alternative methods and data lead to field reliability statistics knowledge:
- Lifetime data, including censored or suspended lifetimes, for a sample or population
- Cohorts (installed base by age) and grouped failure counts (for Kaplan-Meier estimator)
- Population ships and returns counts
Try for data by failure mode too, and estimate multivariate reliability functions.
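To show why ships and returns counts carry reliability information, here is a deliberately simple sketch. It assumes dead-forever products (each return is a unit's first and only failure) and matches expected returns, returns[t] = SUM[ships[s]*f(t-s); s=0,…,t], to the observed counts by solving for f one age at a time. This is a moment-matching deconvolution for illustration only, not the nonparametric maximum likelihood estimator of [George and Agrawal] or [Felthauser and George, 2021]. With real, noisy counts it can be unstable, and the function names and data are hypothetical.

```python
def failure_pmf_from_ships_and_returns(ships, returns):
    """Solve returns[t] = sum over s of ships[s] * f[t - s] for f, age by age.

    ships[s]   -- units shipped in period s
    returns[t] -- units returned (failed) in period t, all cohorts combined
    f[a]       -- estimated probability that a unit fails at age a periods
    """
    if ships[0] <= 0:
        raise ValueError("the first period must have ships")
    pmf = []
    for t in range(len(returns)):
        # returns in period t already explained by cohorts shipped after period 0
        explained = sum(ships[s] * pmf[t - s] for s in range(1, t + 1))
        pmf.append(max(0.0, (returns[t] - explained) / ships[0]))
    return pmf

def reliability_from_pmf(pmf):
    """R(a) = 1 - cumulative probability of failure through age a."""
    reliability, survive = [], 1.0
    for p in pmf:
        survive -= p
        reliability.append(survive)
    return reliability

# Hypothetical counts built from f = [0.01, 0.03, 0.05], so the recovery is exact
ships = [100, 200, 150]
returns = [1.0, 5.0, 12.5]
estimated_pmf = failure_pmf_from_ships_and_returns(ships, returns)
estimated_reliability = reliability_from_pmf(estimated_pmf)
print(estimated_pmf)
print(estimated_reliability)
```

No serial numbers appear anywhere in the input: period totals of ships and returns are enough for this estimate.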
If you feel the need for an exam more in depth than the ASQ Certified Reliability Engineer questions, see
https://drive.google.com/file/d/1y63XBPOC_Dx9d95V_Od_hbJlvHRhz9T8/view/ for a version of the Minnesota Cognitive Ability exam redesigned for reliability statistics.
Submit some evidence of the following or equivalent:
- Nonparametric reliability or failure rate estimates for products and their service parts, incorporating test and past field information. (Alternatively, submit evidence that traditional reliability functions are justified by physics as well as test and field data.)
- Quantification of the uncertainty in those estimates. (E.g., the variance-covariance matrix of age-specific reliability function estimates at all ages within useful life, if a normal or lognormal distribution is a valid asymptotic approximation.)
- Application of field reliability estimates to reliability prediction, reliability (not MTBF) growth, safety and risk analyses, inventory, service and maintenance, diagnostics, plus attrition and product retirement planning.
- Other applications as needed or beneficial to customers and the environment.
If you want help obtaining this evidence, please let me know.
What Training and Help is Available?
The following sections review the level of statistics available in reliability training, certifications, standards, software, and training programs. They are not enough to retrain an engineer into a reliability engineer who learns and uses the field reliability statistics of his or her products and service parts.
What should we do to remedy this? What are the consequences of current statistical state of the art in reliability engineering? Is Certification in Reliability Statistics helpful? The American Statistical Association thinks so [https://www.amstat.org/your-career/accreditation]. The American Statistical Association “accreditation” requires a list of degrees and coursework + work experience for “professional” certification.
Compare Other Certifications with Certification in Reliability Statistics?
“The ASQ Certified Reliability Engineer (CRE) is a professional who understands the principles of performance evaluation and prediction to improve products/systems safety, reliability and maintainability. This body of knowledge (BOK) and applied technologies include, but are not limited to, design review and control; prediction, estimation, and apportionment methodology; FMEA; the planning, operation and analysis of reliability testing and field failures, including mathematical modeling; understanding human factors in reliability; and the ability to develop and administer reliability information systems for failure analysis, design and performance improvement and reliability program management over the entire product life cycle.” [https://asq.org/cert/]
ASQ and SMRP spun off corporations to manage their certification businesses to protect their 501(c)(3) non-profit status. ASQ certifications are licenses to make money.
ASQ Certified Reliability Engineer (CRE): [https://asq.org/cert/reliability-engineer/] and the Body of Knowledge, https://asq.org/training/catalog/topics/reliability, describe reliability engineering, including reliability engineering practices. The CRE exam questions could be answered with a spreadsheet or simple statistics software. E.g., if MTBF is 122 hours and MTTR is 3 hours, what is availability? The answer is supposed to be 122/(122+3) = 97.6%. None involve nonparametric reliability estimation. ASQ sets a low bar.
SMRP Certified Maintenance and Reliability Professional: BoK, https://smrp.org/Certification/Certification-Offerings [ANSI]
https://www.lce.com/Reliability-Engineering-Certification-1721.html (not ASQ CRE!)
https://thinkclemson.com/programs/reliability-engineering-certification/ (Risk-Based Asset Mgt, Excellence, RCA, Predictive Maintenance courses)
Others claim reliability qualifications: Intel Edge AI certification, Berkeley ExecEd: AI Business strategies…, Northwestern/Kellogg AI Applications, MIT xPRO AI Certificate, and many more. All these certifications have at most a low level of statistical content.
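To show how little computation the CRE availability question quoted above requires, here it is in Python. The function name is made up; the formula is the usual steady-state availability, MTBF/(MTBF + MTTR).

```python
def steady_state_availability(mtbf_hours, mttr_hours):
    """Long-run fraction of time a repairable item is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

availability = steady_state_availability(122, 3)
print(f"{availability:.1%}")   # 97.6%
```

The result is a single number; it says nothing about how reliability changes with age.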
Reliability Statistics Education?
I was a drafting-board engineer for the Saturn S-IV moon rocket booster. I asked why I had to list MIL-STD-105D (sampling QC) [MIL-STD-105, Wikipedia] on every drawing. Ben Epstein was one of its authors. I was inspired to become a reliability engineer by Ben Epstein’s reliability course in the UC Berkeley IE&OR department. I took statistics courses for a minor. The UC Berkeley Statistics Department had great lecturers! My reliability career has been a result of my first thesis and the statistics I learned at UC Berkeley.
In the 1960s, the RAND Corporation adapted actuarial methods for forecasting engine demands based on the flying hour program (mission plans and flight durations). Actuarial rates are estimated from engine hours tracked by engine serial number and aircraft tail number. I learned actuarial methods while working for the US Air Force Logistics Command.
I used my first thesis to apply actuarial methods to all products and their service parts, without lifetime data. During the Gulf War, there was a great shortage of AVDS-1790 tank diesel engines, because the actuarial methods didn’t deal with renewal process engine replacement counts [George, 2021]. I tried to show RAND, US Army TACOM, and AFIT (Air Force Institute of Technology).
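The gap between a first-failure forecast and a renewal forecast can be illustrated with a small sketch. It assumes discrete ages, immediate replacement, and replacements that have the same failure distribution as the originals; the failure probabilities, ships counts, and function names are hypothetical, and this is textbook renewal theory rather than the method of [George, 2021].

```python
def renewal_rate(first_failure_pmf):
    """Discrete renewal equation: m[t] = f[t] + sum of f[u] * m[t - u] for u = 1..t-1.

    first_failure_pmf[a] = P(first failure at age a), with first_failure_pmf[0] = 0.
    m[t] is the expected number of replacements at age t per original unit.
    """
    f = first_failure_pmf
    m = [0.0] * len(f)
    for t in range(1, len(f)):
        m[t] = f[t] + sum(f[u] * m[t - u] for u in range(1, t))
    return m

def expected_demand(rate_by_age, ships, t):
    """Expected demand in period t: sum over cohorts s of rate_by_age[t - s] * ships[s]."""
    return sum(rate_by_age[t - s] * ships[s]
               for s in range(t + 1) if t - s < len(rate_by_age))

f = [0.0, 0.1, 0.2, 0.3]        # hypothetical first-failure probabilities by age
ships = [100, 100, 100, 100]    # hypothetical ships by period
m = renewal_rate(f)

first_failures = expected_demand(f, ships, 3)   # counts only first failures
with_renewals = expected_demand(m, ships, 3)    # also counts failures of replacements
print(f"{first_failures:.1f} vs {with_renewals:.1f}")
```

In this made-up case the renewal forecast for period 3 is about 65 units against 60 for first failures alone, and the gap widens as the installed base ages.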
“Survival function” (biostatistics) is the same as a reliability function. Biostatisticians haven’t learned to estimate survival functions without lifetime data. The Lifetime Data Analysis journal, “An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data”, does not accept my articles, because my articles estimate nonparametric lifetime distribution functions from cases, infections, recoveries, and death counts.
No commercial statistical software estimates reliability or survival functions without lifetime data:
- Stata “survival analysis”
- SAS PROC LIFETEST, PROC RELIABILITY estimation from life data (right-censored, Weibull, MCF, parametric recurrence data, Kaplan-Meier from grouped life data. Nonparametric analyses in other PROCs require lifetime data.)
- JMP (predecessor MacSpin had nice 2-D projection for visualizing bivariate distributions)
- SPSS includes Survival Analysis, from life data
- R packages: survival, survminer, gwasurvivr, KMsurv (Klein and Moeschberger, not Kaplan-Meier), relsurv, survcomp, rpart, WeibullPP, SurvFit, and SurvPH require life data. NPMLE [Felthauser and George, 2021] and RenNPLSE [George, March 2021] do not!
- ReliaSoft: Weibull++, RBD,
- Mathematica: “EstimatedDistribution” and “SurvivalModelFit” functions,…
- The Kaplan-Meier nonparametric reliability estimator uses grouped life data, failure counts by installed base cohorts. Greenwood’s asymptotic variance estimator is recommended, but it can be way off for finite population data [George, 2023].
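The grouped-data form of the Kaplan-Meier estimator described in the last item can be sketched as follows. It assumes failure counts are known by cohort and by age interval, and that survivors of each cohort are right censored at the end of that cohort's observation window; the function name and counts are hypothetical.

```python
def kaplan_meier_from_cohorts(ships, failures):
    """Kaplan-Meier reliability by age from cohort sizes and grouped failure counts.

    ships[s]       -- units shipped in cohort s
    failures[s][a] -- failures from cohort s during its a-th age interval (0-based);
                      len(failures[s]) is the number of intervals observed so far
    """
    max_age = max(len(row) for row in failures)
    survivors = list(ships)
    reliability, r = [], 1.0
    for a in range(max_age):
        at_risk = deaths = 0
        for s, row in enumerate(failures):
            if a < len(row):              # cohort s has been observed at this age
                at_risk += survivors[s]
                deaths += row[a]
                survivors[s] -= row[a]
        if at_risk > 0:
            r *= 1.0 - deaths / at_risk
        reliability.append(r)
    return reliability

# Hypothetical data: three monthly cohorts; the oldest has three months of history
ships = [100, 120, 90]
failures = [[2, 5, 9], [3, 6], [1]]
grouped_km = kaplan_meier_from_cohorts(ships, failures)
print([round(r, 4) for r in grouped_km])
```

Here the failure counts are still identified by cohort, which is more information than period totals of ships and returns provide.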
Publications, Journals, and Standards that DO NOT Publish Reliability Statistics Without Lifetime Data?
IEEE Trans. on Reliability (Ralph Evans, editor, rejected publications on field reliability saying, “Field data are garbage.”)
International Journal of Statistics and Reliability Engineering
Journal of Reliability and Statistical Studies
Safety and Reliability Journal
International Journal of Safety and Reliability
International Journal of Reliability, Risk, and Safety
Life Cycle Reliability and Safety Engineering
JRSS (Journal of the Royal Statistical Society) Series B
Reliability Engineering and System Safety
Naval Research Logistics (Quarterly) (Published our article on how to estimate the service-time distribution in an M/G/infinity self-service system in 1973 [George and Agrawal]. That is the basis for estimating reliability or survival functions for dead-forever products or humans.) NRL rejected the sequel.
ASQ Journal of Quality Technology
SIAM Review
NIST Engineering Statistics Handbook, chapter 8 “Assessing Product Reliability”
MIL-HDBK-217 Part Stress
MIL-HDBK-217 Parts Count
Telcordia SR-332
217Plus
217Plus Part Count
MIL-HDBK-338B
ASTM E3159-21
ISO 3534-1
NSWC Mechanical
ANSI/VITA 51.1 Part Stress
ANSI/VITA 51.1 Parts Count
China’s GJB/Z 299C Part Stress
References
Shao-kuan Chen, Tin-kin Ho, and Bao-hua Mao, “Component Reliability Estimations without Field Data”, Hong Kong Institution of Engineers, Transactions, vol. 14, No. 3, pp. 10-17, 2007
Jason Dickstein, “The Parts Traceability Puzzle,” Aviation Maintenance Magazine (avm-mag.com), April 2013
Ray Harkins and Mark Fiedeldey, “Reliability Engineering Statistics,” Udemy course, https://accendoreliability.com/reliability-engineering-statistics/
Fred Schenkelberg, “Fundamentals of Reliability-Related Standards,” podcast, https://accendoreliability.com/podcast/the-reliability-fm-network/fundamentals-reliability-related-standards/, 2018
Patrick D. T. O’Connor and Andre Kleyner, Practical Reliability Engineering, Fifth Edition. John Wiley & Sons, Ltd., 2012
References by George
L. L. George and Avinash Agrawal, “Estimation of a Hidden Service Distribution of an M/G/Infinity Service System,” Naval Research Logistics Quarterly, Vol. 20, No. 3, pp. 549-555, September 1973
Mark Felthauser and Larry George, “Npmle From Ships and Returns Counts by Max. Likelihood using Optimx,” LinkedIn, R-Forum, March 2021
“Estimate renewal pdf from ships and returns counts,” R-Script, March 2021
“Gozinto Theory and Parts’ Installed Base,” Weekly Update, Accendo Reliability, June 2022
“Renewal Process Estimation, Without Life Data,” Weekly Update, https://accendoreliability.com/renewal-process-estimation-without-life-data/#more-443057/, July 2021
“Covariance of the Kaplan-Meier Estimator,” Weekly Update, https://accendoreliability.com/covariance-of-the-kaplan-meier-estimators/#more-509950, June 2023
“Poll: Is Life Data Required…?” Weekly Update, https://accendoreliability.com/poll-is-life-data-required/#more-517690/, June 2023
“Statistical Reliability Control,” Weekly Update, Accendo Reliability, June 2023
André-Michel Ferrari says
This is an excellent initiative! Thanks for continuing to push it forward, Larry. Reliability Statistics is an important specialty that needs a solid foundation of knowledge and a certification that crowns the knowledge. I for one would gladly strive to obtain this certification.
Larry George says
I hope no one is offended by the proposed Certification in Reliability Statistics. It is a tall challenge.
Triad Systems Corp. made millions of dollars selling auto parts’ demand forecasts and stock level recommendations to the automotive aftermarket. Triad was bought by a competitor because it did so well. It’s now part of http://www.epicor.com, and uses http://www.smartcorp.com and AI to make time series forecasts instead of reliability-based forecasts and stock level recommendations.
Did I apologize if your auto parts store didn’t have the part you needed in the 1990s? It could have been my fault.
The next article will explain how to estimate actuarial rates and forecasts from periodic ships and returns counts, for dead-forever parts or products, not recurrent processes. At my job interview, the New Products manager described his regression model SUM[b(t)*n(t)] of parts’ demand. It suffered from autocorrelation. I explained that the regression coefficients b(t) were actuarial rates and how to estimate them. He asked me, “What if the parts’ demands were not the first but for repairable systems?” I panicked over the weekend and programmed it for renewal processes, SUM[d(t-s)*n(s); s=0,1,2,…,t]. There’s an article on that at http://www.accendoreliability.com. Let me know and I’ll send it.
Subhadip Sengupta says
The ASQ CRE exam used to evaluate reliability statistics, and it was quite critical as a pass criterion. I took it in 2012 in two attempts (I failed the first one by 10 marks). Not sure why the current structure is not evaluating at that level. Probably to increase the pass rate and revenue. I have observed ASQ CREs nowadays with significant gaps in reliability statistics knowledge, limited by commercial software knowledge.
ASQ should enhance the statistics knowledge test.
Tammy says
This is an amazing piece. Makes one yearn to understand Reliability based on pure statistical data.
The link to the version of the Minnesota Cognitive Ability exam redesigned for reliability statistics might be broken. It would be nice to see what is available to benchmark current knowledge
Larry George says
Yes, I want to know field reliability using population data, because field reliability is what really happens and using population data eliminates sample uncertainty about the past.
Sorry, the link to the “Cognitive Reli-Ability Test” didn’t work. Try this link:
https://docs.google.com/presentation/d/1qASXws1NRuzWnQOYMUrALQqQJt7UK8MPdVX7vVt1c3Q/edit#slide=id.p1/
It’s in my Google Drive files, and supposed to be open to all readers. It was intended for your amusement, but I got carried away and asked some real questions for your consideration. At the end of the Cognitive Reli-Ability slide deck are one user’s answers and my answers, opinions, and attempted explanations. Email pstlarry@yahoo.com if you want more information about the Test questions.