When nuclear power plants were being built, companies had quality assurance programs and US Nuclear Regulatory Commission risk standards to meet. Now the nuclear industry faces obsolescence. Qualifying replacement parts and replacing analog instrumentation and controls with digital systems generate some reliability testing work, and NASA is soliciting unmanned nuclear power plants for the moon and Mars. Nevertheless, the demand for nuclear engineers is decreasing. Fortunately, the nuclear industry spawned risk analyses useful in other industries.
Many risk analyses represent failures with stress-strength probability models [http://reliawiki.org/index.php/Stress-Strength_Analysis]. Seismic risk analysis models failure as earthquake stress X > component strength Y [WASH-1400, 10 CFR Part 50, and others]. If stress X and strength Y are independent normal random variables, then the difference X–Y has a normal distribution with mean E[X]–E[Y] and variance Var[X]+Var[Y], so a simple model for seismic reliability is 1–P[X > Y] = 1–Φ((E[X]–E[Y])/√(Var[X]+Var[Y])), where Φ(.) denotes the standard normal cumulative distribution function.
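Here is a minimal sketch of that computation in Python with scipy; the means and variances are hypothetical illustration values, not from any plant.

```python
# Reliability = 1 - P[stress X > strength Y] for independent normal X and Y.
# All parameter values are hypothetical.
from scipy.stats import norm

EX, VarX = 1000.0, 100.0**2   # stress mean and variance (assumed)
EY, VarY = 1500.0, 200.0**2   # strength mean and variance (assumed)

# X - Y ~ Normal(EX - EY, VarX + VarY), so P[X > Y] = Phi((EX - EY)/sqrt(VarX + VarY))
pf = norm.cdf((EX - EY) / (VarX + VarY) ** 0.5)
print(f"P[failure] = {pf:.4g}, reliability = {1.0 - pf:.4g}")
```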
Multivariate models are needed when stress and strength variables are dependent, when there are multiple failure modes, and when safety systems obscure how a failure such as core melt could occur. Designers try to avoid “single-point failures,” in which one component could cause system failure. Usually more than one safety system must fail to cause reactor core damage. [George, Dec. 1998]
A simple multivariate failure model handles dependence. If X and Y have a joint normal distribution, then the difference X–Y has a univariate normal distribution with mean E[X]–E[Y]; because X and Y are dependent (correlated), its variance is Var[X]+Var[Y]–2Cov[X,Y]. Reliability is then 1–Φ((E[X]–E[Y])/√(Var[X]+Var[Y]–2Cov[X,Y])). This model also works for lognormally distributed variables, because P[X > Y] = P[ln X > ln Y] and the logarithms are jointly normal.
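Extending the sketch above, the dependent case only changes the variance term, and lognormal variables reduce to it after taking logarithms; the correlation below is a made-up value.

```python
# Correlated normal stress X and strength Y: Var[X - Y] = VarX + VarY - 2*Cov(X, Y).
# For lognormal X and Y, apply the same formula to ln X and ln Y,
# since P[X > Y] = P[ln X > ln Y].  All parameter values are hypothetical.
from scipy.stats import norm

EX, VarX = 1000.0, 100.0**2
EY, VarY = 1500.0, 200.0**2
covXY = 0.5 * 100.0 * 200.0   # correlation 0.5 (assumed)

pf = norm.cdf((EX - EY) / (VarX + VarY - 2.0 * covXY) ** 0.5)
print(f"P[failure] = {pf:.4g}, reliability = {1.0 - pf:.4g}")
```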
In multiple-component models, system failure is represented by a “structure function” g(stress, strength) that defines which combinations of component failures cause system failure. Reliability is 1–P[g(stress, strength) = failure], where stress and strength are multivariate random vectors.
Fault tree analysis describes system failure in terms of “cut sets,” intersections of component failure or success events [see Howard Lambert’s downloadable references, https://ftaassociates.com/]. If any cut set occurs, then the system fails. The structure function g(stress, strength) could be the union of cut sets, each an intersection of “basic events” of the form stress > strength; if all basic events in a cut set occur, the system fails. P[cut set] = P[ALL basic events occur], where a basic event is usually earthquake stress > component strength-at-failure. Earthquake response stresses are dependent due to their common origin, and a component’s strength-at-failure (“fragility”) may also be dependent on other components’ fragilities and perhaps on stresses.
This structure-function cut-set representation is convenient because, if all cut sets were independent, then multivariate P[System failure] = P[C1 OR C2 OR … OR Ck] = 1–(1–P[C1])*(1–P[C2])*…*(1–P[Ck]). However, cut sets may share some basic events, so they’re not independent. The “minimal cut-set upper bound” says P[System failure] ≤ 1–(1–P[C1])*(1–P[C2])*…*(1–P[Ck]). A cut set is “minimal” if removal of any one basic event makes it no longer a cut set (it no longer causes system failure).
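A sketch of the upper bound, with assumed cut-set probabilities:

```python
# Minimal cut-set upper bound: P[system failure] <= 1 - prod(1 - P[Ci]);
# exact if the cut sets are independent.  Probabilities are hypothetical.
import math

cut_set_probs = [0.01, 0.005, 0.02]   # P[C1], P[C2], P[C3] (assumed)
upper_bound = 1.0 - math.prod(1.0 - p for p in cut_set_probs)
print(f"minimal cut-set upper bound = {upper_bound:.4g}")
```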
Cut-set probability computations are multivariate integrals of joint probabilities of X–Y > 0 or X–Y ≤ 0 for vector elements. The vector X–Y has a joint normal distribution with mean vector E[X]–E[Y] and covariance matrix ΣX+ΣY–Cov(X,Y)–Cov(Y,X), where ΣX and ΣY are the covariance matrices of X and Y, and Cov(X,Y) is the cross-covariance of X and Y, if any. (Don’t use principal components analysis to transform the vector X–Y to independent elements, because the limits of integration become weird.)
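One way to evaluate a cut-set probability is with a multivariate normal CDF routine such as scipy’s; the mean vector and covariance matrix below are hypothetical.

```python
# P[cut set] = P[all basic events] = P[Z > 0 componentwise], Z = X - Y ~ MVN(mu, Sigma).
# Since P[Z > 0] = P[-Z <= 0], evaluate the MVN CDF at 0 with mean -mu.
# All numbers are hypothetical.
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([-500.0, -300.0])          # E[X] - E[Y] per component
Sigma = np.array([[50000.0, 15000.0],    # covariance of X - Y (positive definite)
                  [15000.0, 60000.0]])

p_cut_set = multivariate_normal(mean=-mu, cov=Sigma).cdf(np.zeros(2))
print(f"P[cut set] = {p_cut_set:.4g}")
```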
Table 1 shows failure probabilities of series and parallel systems under alternative population assumptions. The discrete-strength results occur when strength is one of several discrete values and stress has a joint normal distribution with means 1000 and 6000, standard deviations 100 and 1000, and no correlation. The normal-strength results occur when both stress and strength have joint normal distributions, with the same normally distributed stress and the Table 2 strength parameters.
Table 1. Failure probabilities Pf

| Strength model | Pf, series | Pf, parallel |
| --- | --- | --- |
| Discrete strength | .1283 | .0014 |
| Normal strength | .1373 | .00623 |
Input real or subjective data?
Determine multivariate distributions from data, if you can. Preserve individual observations of multiple variables in vector form for estimating multivariate distributions. If normal distributions are reasonable, input the means, variances, and covariances of strengths and stresses, plus Cov(strength, stress) or their correlations. If lognormal distributions fit better, transform the parameters to multivariate normal by taking logarithms. If neither normal nor lognormal fits, a copula version fits any marginal distributions and covariances. (Copula: “the formula that ruined Wall Street.” See Peter Hoadley’s financial software, www.hoadley.net/options/develtoolscopulas.htm.)
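To illustrate the copula idea (not Hoadley’s implementation), here is a Gaussian-copula sketch that imposes an assumed correlation on arbitrary marginals, here lognormal stress and Weibull strength with made-up parameters.

```python
# Gaussian copula sketch: sample correlated uniforms from a multivariate normal,
# then map them through any marginal inverse CDFs.  All parameters are hypothetical.
import numpy as np
from scipy.stats import norm, lognorm, weibull_min

rng = np.random.default_rng(1)
rho = 0.6                                  # assumed stress-strength correlation
corr = np.array([[1.0, rho], [rho, 1.0]])

z = rng.multivariate_normal(np.zeros(2), corr, size=100_000)
u = norm.cdf(z)                            # correlated uniforms (the copula)

stress = lognorm.ppf(u[:, 0], s=0.5, scale=1000.0)        # lognormal marginal
strength = weibull_min.ppf(u[:, 1], c=3.0, scale=2000.0)  # Weibull marginal
print(f"P[stress > strength] = {np.mean(stress > strength):.4g} (Monte Carlo)")
```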
I found the marginal distributions of seismic stresses at component locations to be approximately lognormal. It was easy to compute pairwise correlations of the natural logarithms of the component stresses. But pairwise correlations do not a correlation matrix make! You need a positive definite correlation matrix to do multivariate seismic risk analysis. Let me know if you want to find the nearest (in Frobenius distance) positive definite correlation matrix to your matrix of pairwise correlations.
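As a rough sketch of that repair, here is eigenvalue clipping, a simple approximation; Higham’s alternating-projections algorithm finds the true nearest (Frobenius) correlation matrix.

```python
# Repair a non-positive-definite matrix of pairwise correlations by clipping
# negative eigenvalues and rescaling to unit diagonal (an approximation).
import numpy as np

def near_correlation(R, eps=1e-8):
    vals, vecs = np.linalg.eigh((R + R.T) / 2.0)  # symmetrize and decompose
    vals = np.clip(vals, eps, None)               # clip negative eigenvalues
    A = vecs @ np.diag(vals) @ vecs.T
    d = np.sqrt(np.diag(A))
    return A / np.outer(d, d)                     # rescale to unit diagonal

R = np.array([[1.0, 0.9, -0.9],                   # inconsistent pairwise correlations
              [0.9, 1.0, 0.9],
              [-0.9, 0.9, 1.0]])
print(np.linalg.eigvalsh(near_correlation(R)))    # now all eigenvalues > 0
```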
Correlations are required for seismic risk analyses because stresses and strengths-at-failures are dependent. Post-earthquake inspections record damage and provide data for estimating seismic fragility functions and the correlations of strengths-at-failures. Earthquake responses are vibrations, so fragility functions could be, and have been, estimated from tests such as shaker-table data. However, testing individual components or structures will never quantify dependence or even correlation unless multiple units are tested simultaneously on the same shaker table. In the early 1980s, Lawrence Livermore National Laboratory (LLNL) did a multivariate seismic risk analysis of the Zion, Illinois nuclear power plant. LLNL had no data on seismic strengths-at-failures. Management said it would cost too much to run tests or get data from earthquakes, so LLNL paid “experts” for subjective opinions on fragility function parameters [NUREG/CR-3558]. I salted the expert-opinion questionnaire with known strength parameters from “fragility” distribution data [Kececioglu et al.]. They were experts in mechanical or civil engineering but not statistics. In the risk analysis, we assumed no correlation among strengths-at-failures and were later criticized for that assumption.
Unfortunately, correlation is not within the range of human intuition. Correlations can, however, be elicited from subjective opinions on conditional probability distributions of strengths-at-failures [George and Mensing].
Since the LLNL expert-opinion questionnaire, people have observed and collected earthquake damage data and estimated peak ground accelerations (PGAs) in the neighborhoods [Anagnos, PG&E and SCE substations; SQUG, nuclear power plant (or similar) components]. These earthquake damage data, along with local PGAs, are statistically sufficient to estimate seismic fragility functions and correlations, and to quantify sample uncertainty in the estimates [George (2015)]. The estimates differ substantially from the published parameter estimates based on subjective opinions and shaker-table data. It’s time information derived from earthquake damage data replaced subjective opinions. Published parameter estimates underestimate standard deviations and compensate for uncertainty with “probabilistic risk analysis” (variance inflation), randomizing the standard deviation of the mean or median strength while totally ignoring sample uncertainty.
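For illustration, here is a maximum-likelihood sketch that fits a lognormal fragility function, P[fail | PGA = a] = Φ((ln(a)–ln(θ))/β), to made-up binary damage observations.

```python
# MLE of lognormal fragility parameters (median capacity theta, log-sd beta)
# from binary damage observations.  The data below are invented.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

pga = np.array([0.1, 0.15, 0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0])  # g
failed = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])                    # 1 = damaged

def neg_log_lik(params):
    log_theta, log_beta = params                 # unconstrained parameterization
    p = norm.cdf((np.log(pga) - log_theta) / np.exp(log_beta))
    p = np.clip(p, 1e-12, 1 - 1e-12)             # guard against log(0)
    return -np.sum(failed * np.log(p) + (1 - failed) * np.log(1 - p))

fit = minimize(neg_log_lik, x0=[np.log(0.5), np.log(0.4)], method="Nelder-Mead")
theta, beta = np.exp(fit.x)
print(f"median capacity theta = {theta:.3g} g, log-sd beta = {beta:.3g}")
```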
Methods
The early 1980s LLNL Seismic Safety Margins Research Program (SSMRP) “…methodology is implemented by three computer programs: HAZARD, which assesses the seismic hazard at a given site, SMACS, which computes in-structure and subsystem seismic responses, and SEISIM, which calculates system failure probabilities and radioactive release probabilities, given (1) the [component earthquake] response results of SMACS, (2) a set of event trees, (3) a family of fault trees, (4) a set of structural and component fragility descriptions, and (5) a curve describing the local seismic hazard.” I wrote the SEISIM computer program, and later a spreadsheet-and-VBA version of it for AREVA. Computing multivariate P[g(stress, strength) = failure] requires numerical integration [George and Wells (1981), George (1998)] or simulation [George (1982)].
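A simulation sketch, assuming jointly normal stresses and strengths and a hypothetical two-cut-set structure function (this is not SEISIM’s algorithm):

```python
# Monte Carlo estimate of P[g(stress, strength) = failure] for dependent,
# jointly normal components.  Structure function and parameters are hypothetical.
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([-400.0, -400.0, -600.0])          # E[X - Y] per component
Sigma = np.array([[40000.0, 20000.0, 10000.0],
                  [20000.0, 40000.0, 10000.0],
                  [10000.0, 10000.0, 90000.0]])

Z = rng.multivariate_normal(mu, Sigma, size=500_000)  # Z = stress - strength
basic = Z > 0.0                                       # basic events: stress > strength

# g(.): system fails if cut set {1, 2} occurs (both components fail) OR cut set {3}
system_fails = (basic[:, 0] & basic[:, 1]) | basic[:, 2]
print(f"P[system failure] = {system_fails.mean():.4g} (Monte Carlo)")
```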
SEISIM, “Code for Seismic Probabilistic Risk Assessment,” is still available for free [https://rsicc.ornl.gov/codes/psr/psr4/psr-453.html]. It computes upper bounds on the probabilities of unions of cut sets representing system failures, and the sensitivity (derivatives) to inputs representing randomness and uncertainty. It does not compute risk: the expected costs of failures, including the costs of variance and uncertainties.
The NRC is still funding articles on the assumption that seismic fragility functions have lognormal distributions [Dasgupta 2017]. Most seismic risk analyses express fragility in terms of earthquake peak ground acceleration, which avoids the necessity of computing earthquake responses at component locations and of validating component fragilities by testing them on shaker tables. Nevertheless, that parameterization facilitates the use of earthquake data in which the PGA is known along with the survival or failure of power station, distribution, and switchyard components.
Uncertainty and Probabilistic Risk Assessment (PRA)
Dick Mensing, a real statistician, told me, “Larry, when you give an estimate, you have to give some indication of its uncertainty.” To deal with uncertainty, we did sensitivity analyses; i.e., we computed the derivatives of event-probability estimates with respect to distribution parameters: means, variances, and correlations. Sensitivity analyses could be used for prioritization and resource allocation according to bang-per-buck, (dP[failure event]/dParameter)*(dParameter/d$$$), where $$$ could be spent to reduce the probability of failure events. However, that didn’t happen.
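For the simple normal stress-strength model, that derivative is available in closed form; the cost-effectiveness figure below is an invented placeholder.

```python
# Sensitivity of failure probability to mean strength, and a bang-per-buck ratio.
# Pf = Phi((EX - EY)/sd), so dPf/dEY = -phi((EX - EY)/sd)/sd.  Values are hypothetical.
from scipy.stats import norm

EX, EY = 1000.0, 1500.0
sd = (100.0**2 + 200.0**2) ** 0.5             # sqrt(VarX + VarY)
z = (EX - EY) / sd

dPf_dEY = -norm.pdf(z) / sd                   # derivative of failure probability
dEY_per_dollar = 0.01                         # assumed strength gain per dollar spent
print(f"dPf/dEY = {dPf_dEY:.3g}, Pf reduction per $ = {-dPf_dEY * dEY_per_dollar:.3g}")
```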
SSMRP colleagues, nuclear industry consultants, the NRC [https://www.nrc.gov/about-nrc/regulatory/risk-informed/pra.html], and the International Atomic Energy Agency (IAEA) have done “Probabilistic Risk Assessments” to cover model uncertainty, lack of data, and mathematically convenient assumptions. Someone suggested randomizing the means of distributions to represent uncertainty. My reaction was, “You what?” Uncertainty analyses don’t compensate for laziness, unused data, or the desire to exaggerate risk when nothing would be done.
PRA randomizes stress and strength distribution parameters to represent model uncertainty (uncertainty about model assumptions), instead of getting data to reduce modeling uncertainty or evaluating alternative models. PRA is useful for obtaining funding.
Typical PRAs randomize means but not variances, and certainly not covariances. Commercial software is available to randomize model parameters, assuming you can program the models. NASA and the SAE G-11 PRA subcommittee are working toward PRA standards for potential application to aircraft design. Would you fly in an aircraft designed using PRA?
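To show what randomizing a mean does, here is a sketch that draws the mean strength from an assumed uncertainty distribution and reports the resulting spread of failure probabilities.

```python
# "Randomizing the means": draw the mean strength from an uncertainty distribution,
# recompute P[failure] for each draw, report the spread.  Values are hypothetical.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
EX = 1000.0
sd = (100.0**2 + 200.0**2) ** 0.5                    # sqrt(VarX + VarY)

EY_draws = rng.normal(1500.0, 150.0, size=10_000)    # uncertain mean strength
pf = norm.cdf((EX - EY_draws) / sd)
print(f"P[failure]: median {np.median(pf):.3g}, "
      f"5th-95th percentiles {np.percentile(pf, 5):.3g}-{np.percentile(pf, 95):.3g}")
```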
An overview of PSHA (Probabilistic Seismic Hazard Analysis) says, “…to the benefit of just about no one, its simplicity is deeply veiled by user-hostile notation, antonymous jargon [an antonym has a meaning opposite to that of another word], and proprietary software.” [Hanks and Cornell] Their article describes PSHA statistically, so that the analysis can be understood by statisticians and perhaps others. PRA inspired http://pstlarry.home.comcast.net/Urc.htm, an article describing the absurdity of reducing uncertainty as an inverse function of dollars paid to consultants.
A lot of time, paper, and taxpayer money has been spent avoiding the necessity for multivariate analyses. Nuclear industry risk analysts prefer to avoid multivariate dependence in P[cut set] and P[g(stress, strength) = failure] computations. EPRI (ASG) PRA recommends stress or strength correlation = 1.0. The NUREG/CR-6268 method focuses on combinations of common causes represented by a univariate failure probability per cut set. One suggestion is a mixture of independent and simultaneous failures: P[cut set] = (1–α)P[X > Y]^k + αP[X > Y] [Fleming and Mikschi]. What if correlated X1 = … = Xk = X or Y1 = … = Yk = Y for all k? Then Φ(z1,…,zk; m, Σ) = Φ(z; m, σ²) with probability α of equal random variables; for identically distributed Zj = Xj–Yj with common pairwise correlation ρ, the variance of their average is ((k–1)ρ+1)σ²/k. This also works for linear Z1 = a+bZ2 and affine transformations. Then P[cut set] = (1–α)Φ(z1,…,zk; m, Σ) + αΦ(z; m, σ²), or perhaps (1–α)Φ(z; m, σ²)^k + αΦ(z; m, σ²) if Z1,…,Zk are independent conditional on being unequal.
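A sketch of that mixture computation, with assumed α and single-component failure probability:

```python
# Mixture of independent and simultaneous failures:
# P[cut set] = (1 - alpha) * P[X > Y]**k + alpha * P[X > Y].  Values are hypothetical.
p_single = 0.01   # P[X > Y] for one component (assumed)
alpha = 0.1       # probability that components fail together (assumed)
k = 3             # number of components in the cut set

p_cut = (1 - alpha) * p_single**k + alpha * p_single
print(f"P[cut set] = {p_cut:.4g}")
```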
What to do when nuclear engineers become obsolete?
They could become reliability engineers specializing in stress > strength reliability and risk analyses (as long as they do constructive sensitivity and uncertainty analyses)! Analysis with alternative scenarios is common in financial risk evaluations.
Nuclear engineers could use multivariate stress-strength risk analyses for:
- Tornado damage, dam failures: stress > component strength,
- supply chain or inventory demand > supply: supply chains cannot cope with demands; invoice fill requires all demanded items,
- electrical load > generating or distribution network capacity: loss of load occurs if not all customers are served (LOLP = Loss-Of-Load Probability),
- financial consumption > income,
- automotive and aircraft safety and reliability, [Lambert]
- portfolio stress testing,
- COVID-19 cases exceed available PPE, ventilators and beds.
I will try to help and provide the software on request: pstlarry@yahoo.com.
References
10 CFR Part 50, Appendix B, Quality Assurance Criteria for Nuclear Power Plants and Fuel Reprocessing Plants, US NRC
IAEA TECDOC 1922, “Reliability Data for Research Reactor Probabilistic Safety Assessment,” Vienna 2020
Anagnos, Thalia, “Development of an Electrical Substation Equipment Performance Database for Evaluation of Equipment Fragilities,” April 1999, San Jose State Univ. and PG&E
Anonymous, “Uncertainty and Resource Allocation in the URC,” Applied Probability Newsletter, Ops. Res. Soc. of America, summer 1983, page 169 of https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxmaWVsZHJlbGlhYmlsaXR5fGd4OjZjNWViODczZTkzZmE2Mjg/
Biswajit Dasgupta, “Evaluation of Methods Used to Calculate Seismic Fragility Curves”, US NRC, May 2017
Karl Fleming and Thomas J. Mikschi, “Technical Issues in the Treatment of Dependence in Seismic Risk Analysis,” NEA/CSNI/R(99)28, 1999
L. L. George and R. Mensing, “Using Subjective Percentiles and Test Data for Estimating Fragility Functions,” Proceedings, Department of Energy Statistical Symposium, Berkeley, CA, Oct. 1980
L. L. George and J. E. Wells, “The Reliability of Systems of Dependent Components,” ASQC Quality Congress, San Francisco, Calif., May 1981
L. L. George, “Probability Computational Methods in SSMRP,” UCID-18686, Lawrence Livermore National Laboratory, Livermore, Calif.; also “Statistical Sensitivity Analysis Methods in SSMRP”
L. L. George, “Super-Mann-Whitney Simulation of System Reliability,” Winter Simulation Conference, San Diego, CA, 1982
L. L. George, “Multivariate Mechanical Reliability,” ASQ Reliability Review, 1998
L. L. George, “Seismic Fragility Function Estimation,” Test Engineering and Management, Vol. 77, No. 4, pp. 16-21, Aug.-Sept. 2015
IAEA Safety Standards: “Seismic Design of Nuclear Installations Draft Safety Guide”, No. DS 490, circa 2016, lists a dozen!
D. B. Kececioglu, R. E. Smith, and E. A. Selstad, “Distribution of Strength in Sample Fatigue and Associated Reliability,” Ann. Reliability and Maintainability Conference, Detroit, MI, pp. 659-672, July 1970
H. Lambert, “Use of Fault Tree Analysis for Automotive Reliability and Safety Analysis”, UCRL-JC-154905, 2003, https://ftaassociates.com/
NRC standards: WASH-1400, The Reactor Safety Study: The Introduction of Risk Assessment to the Regulation of Nuclear Reactors (NUREG/KM-0010), 1975; NUREG/CR-2015, Vols. 1 and 2, “Seismic Safety Margins Research Program,” 1981; “Seismic Safety Margins Research Program: Equipment Fragility Data Base” (NUREG/CR-2680, UCRL-53038, Rev. 1), 1983; NUREG/CR-3558
Larry George says
Additional reference: “Probabilistic Risk Assessment Procedures Guide for NASA Managers and Practitioners,” NASA, 2002. I wonder if the Webb space telescope will work?