Tolerance Intervals for Normal Distribution Based Set of Data

A tolerance interval is not the same as a confidence interval. For a parameter such as a mean or standard deviation, we can calculate a range of values that contains the true parameter with a stated confidence. That range is a confidence interval, and it concerns a parameter.

A tolerance interval applies to the individual readings, not the statistics. The interval contains a certain proportion of the values within the distribution of individual data points. The endpoints are tolerance limits.

Another confusion is with engineering tolerances or tolerance limits.

These are the engineer’s desired range of manufactured output. For example, the flange width may have a plus/minus specification on the drawing. That specification may or may not be based on a statistical evaluation of the part variation, and may possibly include the calculation of a statistical tolerance interval.

Given a set of data that we either know or strongly suspect is normally distributed, we can estimate the range of values that will contain a specific fraction of the individual values, with a specified confidence. Because it uses the distributional assumption, this range is tighter than the bound from the Chebyshev inequality, which doesn’t assume an underlying distribution.
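To give a rough sense of what the normality assumption buys, here is a minimal sketch comparing the number of standard deviations needed to cover 90% of values under the Chebyshev inequality with the number needed for a normal distribution. The function names are illustrative only, and the comparison assumes the mean and standard deviation are known exactly, which ignores the sampling uncertainty that a tolerance interval accounts for.

```python
from math import sqrt
from statistics import NormalDist


def chebyshev_multiplier(p):
    """Smallest k for which Chebyshev guarantees at least p within k std devs."""
    # Chebyshev: P(|X - mu| < k * sigma) >= 1 - 1 / k**2, so k = 1 / sqrt(1 - p)
    return 1.0 / sqrt(1.0 - p)


def normal_multiplier(p):
    """k for which a normal distribution has proportion p within k std devs."""
    # The central proportion p leaves (1 - p) / 2 in each tail.
    return NormalDist().inv_cdf((1.0 + p) / 2.0)


p = 0.90
print(f"Chebyshev multiplier: {chebyshev_multiplier(p):.3f}")
print(f"Normal multiplier:    {normal_multiplier(p):.3f}")
```

For 90% coverage the Chebyshev multiplier is roughly twice the normal one, which is the price of making no distributional assumption.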

Of course, there are tolerance interval calculations for other distributions and percentages, yet that is the subject of other articles.

Formula for two-sided normal distribution tolerance interval

There has been work to create viable estimates for tolerance intervals for some time, starting with Wald and Wolfowitz (1946) and continuing, at last check, with Odeh and Owen (1980). The NIST Engineering Statistics Handbook discusses tolerance intervals for a normal distribution and includes the following formula from Howe (1969).

$$ \large\displaystyle {{X}_{UL}}=\bar{X}\pm {{k}_{2}}s$$

Where X̄ and s are the sample mean and standard deviation, and
$- {{k}_{2}} -$ is a factor which provides the approximation based on the proportion of the normal distribution desired to be contained within the tolerance interval and the confidence level for the estimate. The $- {{k}_{2}} -$ value calculation, also provided by Howe (1969), is

$$ \large\displaystyle {{k}_{2}}=\sqrt{\frac{\upsilon \left( 1+\tfrac{1}{N} \right)z_{{\left( 1-p \right)}/{2}\;}^{2}}{\chi _{1-\gamma ,\text{ }\upsilon }^{2}}}$$

Where N is the sample size,
$- \chi _{1-\gamma ,\text{ }\upsilon }^{2} -$ is the critical value of the chi-squared distribution with
$- \upsilon -$ degrees of freedom that is exceeded with probability $- \gamma -$, and
$- z_{{\left( 1-p \right)}/{2}\;}^{2} -$ is the square of the critical value of the normal distribution at the cumulative probability $- {\left( 1-p \right)}/{2}\; -$.
p is the proportion of values that fall within the interval, and
$- \gamma -$ is the probability of that occurring (the confidence).

$- \upsilon -$ is the degrees of freedom used to estimate the standard deviation, s. Generally this is N-1.
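As a minimal sketch, the formula for the factor can be written as a short Python function. The name `howe_k2` and its argument names are illustrative, not from any library. The normal critical value comes from the standard library, while the chi-squared critical value is assumed to be looked up in a table and passed in, since the standard library has no chi-squared quantile function.

```python
from math import sqrt
from statistics import NormalDist


def howe_k2(n, p, chi2_lower, nu=None):
    """Two-sided normal tolerance factor k2 using Howe's approximation.

    n          -- sample size N
    p          -- proportion of the population the interval should contain
    chi2_lower -- chi-squared critical value exceeded with probability gamma
                  (the lower 1 - gamma quantile), assumed to come from a table
    nu         -- degrees of freedom used to estimate s; defaults to n - 1
    """
    if nu is None:
        nu = n - 1
    # Critical value at cumulative probability (1 - p) / 2.
    # It is negative, but only its square enters the formula.
    z = NormalDist().inv_cdf((1 - p) / 2)
    return sqrt(nu * (1 + 1 / n) * z**2 / chi2_lower)
```

The tolerance limits are then the sample mean plus and minus `howe_k2(...)` times the sample standard deviation.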

An example problem

Let’s say we have asked a capacitor vendor to create components that are approximately 5mm tall. We receive an initial batch of components and measure the height of 25 from the lot. We want to determine the interval that contains 90% of the capacitor heights with 99% confidence.

Based on the sample measurements we estimate the mean as 4.95mm and the standard deviation as 0.23mm.

Next we need to calculate the $- {{k}_{2}} -$ factor. Let’s work through the following five steps:

Calculate the normal distribution cumulative probability

$$ \large\displaystyle {\left( 1-p \right)}/{2}\;={\left( 1-0.9 \right)}/{2}\;=0.05$$

Calculate the $- \upsilon -$ degrees of freedom

$$ \large\displaystyle \upsilon =N-1=25-1=24$$

Look up the critical value for the normal distribution (note: later this value is squared)

$$ \large\displaystyle {{z}_{{\left( 1-p \right)}/{2}\;}}={{z}_{0.05}}=-1.645$$

Look up the lower critical value for the chi-squared distribution. Note: depending on your table you may need to use 0.99 to find the lower critical value. We are only excluding 0.01 of the lower tail in this case.

$$ \large\displaystyle\chi _{1-\gamma ,\text{ }\upsilon }^{2}=\chi _{0.01,\text{ 24}}^{2}=10.856$$
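If no table is at hand, the lower critical value can be computed numerically. The sketch below is one possible approach using only the Python standard library: it evaluates the chi-squared cumulative distribution through the series for the regularized lower incomplete gamma function and inverts it by bisection. The function names `chi2_cdf` and `chi2_ppf` are illustrative, and a statistics package would normally be the better choice in practice.

```python
from math import exp, lgamma, log


def chi2_cdf(x, nu):
    """Chi-squared CDF via the series for the regularized lower
    incomplete gamma function P(nu / 2, x / 2)."""
    if x <= 0:
        return 0.0
    a, t = nu / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    n = 0
    while term > total * 1e-15:
        n += 1
        term *= t / (a + n)  # next term: t**n / (a * (a+1) * ... * (a+n))
        total += term
    return total * exp(a * log(t) - t - lgamma(a))


def chi2_ppf(prob, nu):
    """Value x with lower-tail (cumulative) probability prob, by bisection."""
    lo, hi = 0.0, nu + 10.0 * (2.0 * nu) ** 0.5 + 10.0  # bracket far above the mean
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, nu) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Lower-tail area 0.01 is the same point as upper-tail area 0.99.
print(round(chi2_ppf(0.01, 24), 3))
```

The argument here is the lower-tail area, so 0.01 is passed directly. A table indexed by upper-tail area lists the same value under 0.99, which is the convention difference noted above.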

Calculate $-{{k}_{2}} -$. Note: some books contain tables for a range of probabilities and confidences.

$$ \large\displaystyle {{k}_{2}}=\sqrt{\frac{\upsilon \left( 1+\tfrac{1}{N} \right)z_{{\left( 1-p \right)}/{2}\;}^{2}}{\chi _{1-\gamma ,\text{ }\upsilon }^{2}}}=\sqrt{\frac{24\left( 1+\tfrac{1}{25} \right){{\left( -1.645 \right)}^{2}}}{10.856}}=2.49$$

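Multiplying the factor by the sample standard deviation gives 2.49 × 0.23mm ≈ 0.57mm, and the limits are the sample mean plus and minus that amount. Thus the 99% confidence 90% tolerance interval ranges from 4.38mm to 5.52mm.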
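One way to see what "90% of the values with 99% confidence" means is a small simulation. The sketch below assumes, purely for illustration, that the true population is normal with mean 4.95mm and standard deviation 0.23mm (in reality these are only our sample estimates). It repeatedly draws 25 readings, builds the interval from the sample mean and standard deviation with the factor 2.49, and counts how often that interval really does contain at least 90% of the population. The function name, trial count, and seed are arbitrary choices.

```python
import random
from statistics import NormalDist, mean, stdev


def coverage_confidence(k2, n, p, trials=2000, seed=1):
    """Fraction of simulated samples whose interval x_bar +/- k2 * s
    contains at least proportion p of the assumed normal population."""
    rng = random.Random(seed)
    population = NormalDist(mu=4.95, sigma=0.23)  # assumed "true" values
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(population.mean, population.stdev) for _ in range(n)]
        x_bar, s = mean(sample), stdev(sample)
        lower, upper = x_bar - k2 * s, x_bar + k2 * s
        # Exact share of the population lying inside this sample's interval
        covered = population.cdf(upper) - population.cdf(lower)
        if covered >= p:
            hits += 1
    return hits / trials


print(coverage_confidence(k2=2.49, n=25, p=0.90))
```

The printed fraction should land near 0.99, with some simulation noise. It is the interval that varies from sample to sample, and the confidence describes how often the procedure succeeds.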

With more samples, this interval may narrow slightly, yet the decision now lies with the engineering team to determine if the range of capacitor heights is acceptable or not for the application in question.
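To explore how the factor responds to sample size, the sketch below recomputes it for several values of N at the same 90% proportion and 99% confidence. Because the Python standard library has no chi-squared quantile, it substitutes the Wilson-Hilferty approximation for the table lookup. That swap is an assumption of this sketch, not part of Howe's method, and it makes the N = 25 result differ slightly from the tabulated calculation above. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile_wh(prob, nu):
    """Wilson-Hilferty approximation to the chi-squared quantile
    with lower-tail probability prob and nu degrees of freedom."""
    z = NormalDist().inv_cdf(prob)
    c = 2.0 / (9.0 * nu)
    return nu * (1.0 - c + z * sqrt(c)) ** 3


def approx_k2(n, p=0.90, gamma=0.99):
    """Howe's k2 with the chi-squared value approximated, not tabulated."""
    nu = n - 1
    z = NormalDist().inv_cdf((1 - p) / 2)
    chi2_lower = chi2_quantile_wh(1 - gamma, nu)
    return sqrt(nu * (1 + 1 / n) * z**2 / chi2_lower)


for n in (25, 50, 100, 500):
    print(n, round(approx_k2(n), 2))
```

The factor shrinks as N grows, but it stays above the normal critical value of 1.645 that would apply if the mean and standard deviation were known exactly.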

—

Wald, A., and J. Wolfowitz. 1946. Tolerance limits for a normal distribution. Annals of Mathematical Statistics 17: 208-215.

Odeh, Robert E., and D. B. Owen. 1980. Tables for Normal Tolerance Limits, Sampling Plans, and Screening. New York: M. Dekker.

Howe, W. G. 1969. Two-sided tolerance limits for normal populations, some improvements. Journal of the American Statistical Association 64 (326): 610-620.


Comments

Matti Vuotila says
January 15, 2018 at 10:23 PM
Nice to have a good, simple explanation and example on these basic statistics things, thanks.
But, hmmm …
I suppose that we just ended on having the k2 of value 2.49. When this is multiplied by the s (0.23) we have 0.573 and the range would become from 4.38 to 5,52 mm instead of 2.46mm to 7.44mm.
maybe ?
- Fred Schenkelberg says
  January 16, 2018 at 2:32 PM
  You are right and thanks for pointing out the error in the final calculation. I added/subtracted the k2 value rather than k2 times the standard deviation, which is the correct quantity.
  Thanks for pointing out the error and your careful reading.
  Cheers,
  Fred
  PS: I’ve updated the example results
Anthony says
September 30, 2022 at 6:49 AM
Hi Fred,
In summary, Tolerance Interval is combination of Confidence Interval + Proportion.
But what is the logic of K2?
Is tolerance Interval kind of similar to Reliability?
Thanks,
Anthony
- Fred Schenkelberg says
  September 30, 2022 at 7:04 AM
  Hi Anthony,
  good question – keep in mind that a tolerance interval is related to individual values within the distribution. The interval is the range over which the actual (unknown true) values exist with a specific confidence.
  Reliability is a function of the distribution which provides the probability of survival over a specified duration (often from time zero).
  You could calculate the tolerance interval for a Weibull distribution for example and it would be a bit bigger than the nominal plot we would typically do… keep in mind that when we fit a distribution to data, it is an estimate of the actual distribution that the data comes from (and it remains unknown). The tolerance interval is where the actual individual points of the actual, unknown, distribution may occur.
  Tolerance intervals also do not only apply to distributions, one may calculate a tolerance interval about any statistic as well.
  cheers,
  Fred
