Edited by John Healy
There are a number of different methods to calculate confidence intervals for a proportion. The normal approximation method is easy to use and is appropriate in most cases.
The Clopper-Pearson method, also called the exact confidence interval, is described in a separate article.
Other methods are also covered in separate articles.
The normal approximation method is appropriate when both
$$ \large\displaystyle \begin{array}{ll}np&>5\\n\left( 1-p \right)&>5\end{array}$$
Where n is the number of items in the sample
And, p is the proportion of ‘successes’ over n.
If the data does not meet this set of criteria then do not use the method.
Successes are generally defined by convention or convenience. For example, when determining the proportion of functional (flawless) bolts in a batch, we can inspect and count the good bolts. Thus the ratio of good bolts (successes) over the total number of bolts is a proportion. We could also define a ‘success’ as a faulty bolt, thus determining the proportion of faulty bolts or a defective proportion.
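The np and n(1-p) criteria can be checked directly from the counts. The sketch below is only an illustration: the function name `normal_approx_ok` and the `threshold` parameter are assumptions made for this example, not part of any library.

```python
def normal_approx_ok(successes: int, n: int, threshold: float = 5.0) -> bool:
    """Return True when both n*p and n*(1-p) are greater than the threshold.

    Because p = successes / n, the product n*p is just the number of
    successes and n*(1-p) is the number of failures, so the counts are
    compared directly to avoid floating point rounding.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    failures = n - successes
    return successes > threshold and failures > threshold


if __name__ == "__main__":
    print(normal_approx_ok(58, 100))  # 58 successes, 42 failures
    print(normal_approx_ok(3, 100))   # only 3 successes
```

If the function returns False, the normal approximation method should not be used for that sample.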
Normal Approximation Method Formula
The formula for a two-sided confidence interval is
$$ \large\displaystyle p\pm {{z}_{\tfrac{\alpha }{2}}}\sqrt{\frac{p\left( 1-p \right)}{n}}$$
Where $- \alpha -$ is one minus the confidence level
And the z value is the standard normal value from the z-table that leaves the corresponding area in the tail. For example, for a two-sided 95% confidence interval, $- \alpha/2=0.05/2=0.025 -$, thus the z value $- {{z}_{\alpha/2}} -$ is 1.96.
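Instead of reading a z-table, the z value can be computed from the inverse of the standard normal cumulative distribution. The following is a minimal sketch using Python's standard library `statistics.NormalDist`; the helper name `z_value` and its `two_sided` flag are assumptions made for this illustration.

```python
from statistics import NormalDist


def z_value(confidence: float, two_sided: bool = True) -> float:
    """Standard normal value that leaves alpha/2 (two-sided) or alpha
    (one-sided) in the upper tail, where alpha = 1 - confidence."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    alpha = 1.0 - confidence
    tail = alpha / 2.0 if two_sided else alpha
    # inv_cdf takes the area to the LEFT of z, so pass 1 - tail.
    return NormalDist().inv_cdf(1.0 - tail)


if __name__ == "__main__":
    print(round(z_value(0.95), 2))                    # two-sided 95%
    print(round(z_value(0.95, two_sided=False), 3))   # one-sided 95%
```

For 95% confidence this gives approximately 1.96 for the two-sided case and approximately 1.645 for the one-sided case.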
For a one-sided lower confidence interval use
$$ \large\displaystyle p-{{z}_{\alpha }}\sqrt{\frac{p\left( 1-p \right)}{n}}$$
And for a one-sided upper confidence interval use
$$ \large\displaystyle p+{{z}_{\alpha }}\sqrt{\frac{p\left( 1-p \right)}{n}}$$
Note the use of $- \alpha -$ instead of $- \alpha /2-$.
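The one-sided formulas can be sketched in a few lines. This is an illustration under the assumptions of the normal approximation method; the function name `one_sided_bounds` is hypothetical, and the z value comes from the standard library rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_bounds(p: float, n: int, confidence: float = 0.95):
    """Return (lower_bound, upper_bound) for the two one-sided intervals.

    Each bound is a separate one-sided interval: the whole alpha sits in
    a single tail, so z_alpha is used rather than z_alpha/2.
    """
    if n <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("need n > 0 and 0 <= p <= 1")
    z_alpha = NormalDist().inv_cdf(confidence)  # area to the left = 1 - alpha
    margin = z_alpha * sqrt(p * (1.0 - p) / n)
    return p - margin, p + margin


if __name__ == "__main__":
    lower, upper = one_sided_bounds(0.58, 100, confidence=0.95)
    print(f"one-sided lower bound: {lower:.3f}")
    print(f"one-sided upper bound: {upper:.3f}")
```

Report only the bound that matches the question being asked; the pair is returned together here purely for convenience.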
An Example
Let’s say we have a sample of 100 bolts (n = 100) from a mixed bin of thousands of bolts containing grade 2 and grade 5 bolts. Through inspection, we find that 58 are grade 2 and 42 are grade 5. What is the estimated proportion of grade 2 bolts with a 95% confidence interval?
Check the assumption of $- np>5 -$ and $- n\left( 1-p \right)>5 -$ first.
$$ \large\displaystyle \begin{array}{*{35}{l}}
np=100\left( .58 \right)=58&>5 \\
n\left( 1-p \right)=100\left( 1-.58 \right)=42&>5 \\
\end{array}$$
Thus both criteria are met so we can use the normal approximation method.
For a two-sided 95% confidence interval the area under the tail of the normal distribution is $- \alpha/2=0.05/2=0.025 -$ and we use the standard normal table to find the z value. In this case, it is 1.96.
And, we know the sample provided a proportion of 0.58 grade 2 bolts (58 of 100). Thus we can calculate the confidence interval with
$$ \large\displaystyle \begin{array}{l}p\pm {{z}_{\tfrac{\alpha }{2}}}\sqrt{\frac{p\left( 1-p \right)}{n}}\\0.58\pm 1.96\sqrt{\frac{0.58\left( 1-0.58 \right)}{100}}\\0.58\pm 0.097\end{array}$$
Or, based on the sample we expect with 95% confidence that the actual proportion of grade 2 bolts in the large bin of thousands of bolts is somewhere between 0.48 and 0.68.
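The whole two-sided calculation for the bolt sample (58 grade 2 bolts out of 100) can be run as a short script. This is a sketch rather than a definitive implementation: the function name `proportion_ci` is an assumption for illustration, and it refuses to compute an interval when the np and n(1-p) criteria are not met.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes: int, n: int, confidence: float = 0.95):
    """Two-sided normal approximation confidence interval for a proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # n*p is the success count and n*(1-p) is the failure count.
    if successes <= 5 or (n - successes) <= 5:
        raise ValueError("np and n(1-p) must both exceed 5")
    p = successes / n
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z for alpha/2 in each tail
    margin = z * sqrt(p * (1.0 - p) / n)
    return p - margin, p + margin


if __name__ == "__main__":
    low, high = proportion_ci(58, 100, confidence=0.95)
    print(f"95% confidence interval: {low:.3f} to {high:.3f}")
```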
Related:
Tolerance Intervals for Normal Distribution Based Set of Data (article)
Hypothesis Tests for Proportion (article)
Two Proportions Hypothesis Testing (article)