This is part of a short series on the common life data distributions.
The binomial distribution is discrete. This short article focuses on 4 formulas of the binomial distribution.
It has the essential formulas that you may find useful when answering specific questions. Knowing a distribution's parameters, along with the right formulas, provides a quick means to answer a wide range of reliability related questions.
Assumptions
The binomial distribution is useful for a count variable when the following conditions apply:
- There is a fixed number, n, of observations
- The observations are independent
- The outcome of each observation is either success or failure
- The probability of success, p, is the same for each observation
The binomial distribution describes the count variable which is the result of n Bernoulli trials. The successes are not ordered and may occur at any point in the n trials, hence the use of combinations and not permutations. This assumes replacement, or essentially resetting the situation, such that the probability, p, remains constant.
If sampling occurs without replacement, consider using the hypergeometric distribution instead.
Parameters
The number of trials, n, is a fixed positive integer, n ∈ { 1, 2, 3, … }
The probability of success, p, also known as the Bernoulli probability parameter, is likewise fixed and ranges 0 ≤ p ≤ 1
The count of successes, k, is a random variable and is count data, k ∈ { 0, 1, 2, …, n }
Probability Density Function (PDF)
Because the distribution is discrete, this function is strictly a probability mass function. For k ∈ { 0, 1, 2, …, n } the formula is:
$$ \displaystyle\large f\left( k \right)=\left( \begin{array}{l}n\\k\end{array} \right){{p}^{k}}{{\left( 1-p \right)}^{n-k}}$$
A plot of the PDF provides a histogram-like view of the probability of each possible count of successes.
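As a minimal sketch of the formula above, the following Python evaluates f(k) for every possible count. The function name `binom_pmf` and the values n = 10 and p = 0.1 are illustrative assumptions, not taken from the article.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0
    # comb(n, k) is the number of combinations "n choose k"
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example values: 10 trials, 0.1 probability of success per trial
n, p = 10, 0.1
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# Print the histogram-like table of probabilities
for k, prob in enumerate(pmf):
    print(f"k={k:2d}  f(k)={prob:.4f}")
```

The probabilities over k = 0, 1, …, n sum to 1, which is a quick sanity check on any implementation.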
Cumulative Distribution Function (CDF)
F(k) is the cumulative probability of k or fewer successes. It is very handy when estimating the proportion of units that will fail over a warranty period, for example, if each trial represents one unit exposed to the stresses of a warranty period.
$$ \displaystyle\large F\left( k \right)=\sum\limits_{j=0}^{k}{\frac{n!}{j!\left( n-j \right)!}{{p}^{j}}{{\left( 1-p \right)}^{n-j}}}$$
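The sum above is straightforward to evaluate directly. The sketch below is one possible way to do so in Python; the name `binom_cdf` and the example values (20 trials, p = 0.05) are assumptions chosen only for illustration.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k): sum of the binomial probabilities for j = 0..k."""
    if k < 0:
        return 0.0
    k = min(k, n)  # counts above n add no probability
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


# Hypothetical example: 20 trials, 0.05 probability of success per trial
n, p = 20, 0.05
for k in range(4):
    print(f"F({k}) = {binom_cdf(k, n, p):.4f}")
```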
The binomial CDF is a tedious set of calculations and, without the benefit of modern computing power, has traditionally been estimated using Poisson or normal distribution approximations.
If n ≥ 20 and p ≤ 0.005, or if n ≥ 100 and np ≤ 10, you may use the Poisson distribution with μ = np
$$ \displaystyle\large F\left( k \right)\cong {{e}^{-\mu }}\sum\limits_{j=0}^{k}{\frac{{{\mu }^{j}}}{j!}}$$
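To get a feel for how close the Poisson approximation is, the sketch below compares it with the exact binomial sum. The helper names and the case n = 200, p = 0.01, k = 3 (so n ≥ 100 and np = 2) are illustrative assumptions.

```python
from math import comb, exp, factorial


def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact binomial P(X <= k)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def poisson_cdf(k: int, mu: float) -> float:
    """Poisson P(X <= k) with mean mu."""
    return exp(-mu) * sum(mu**j / factorial(j) for j in range(k + 1))


# Hypothetical case that satisfies n >= 100 and np <= 10
n, p, k = 200, 0.01, 3
exact = binom_cdf(k, n, p)
approx = poisson_cdf(k, n * p)  # mu = np
print(f"exact={exact:.5f}  poisson={approx:.5f}  diff={abs(exact - approx):.5f}")
```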
If np ≥ 10 and np(1-p) ≥ 10, then the normal distribution provides a suitable approximation:
$$ \displaystyle\large F\left( k \right)\cong \Phi \left( \frac{k+0.5-np}{\sqrt{np\left( 1-p \right)}} \right)$$
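The normal approximation, including the 0.5 continuity correction, can be checked the same way. This is a sketch using the standard library's `statistics.NormalDist` for Φ; the case n = 100, p = 0.3, k = 25 (np = 30, np(1-p) = 21) is an assumed example.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact binomial P(X <= k)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def normal_approx_cdf(k: int, n: int, p: float) -> float:
    """Normal approximation to P(X <= k) with a 0.5 continuity correction."""
    z = (k + 0.5 - n * p) / sqrt(n * p * (1 - p))
    return NormalDist().cdf(z)  # standard normal CDF, i.e. Phi(z)


# Hypothetical case that satisfies np >= 10 and np(1-p) >= 10
n, p, k = 100, 0.3, 25
exact = binom_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact={exact:.4f}  normal={approx:.4f}  diff={abs(exact - approx):.4f}")
```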
Reliability Function
R(k) is the probability of more than k successes, the complement of the CDF. Instead of looking for the proportion that will fail, the reliability function determines the proportion that are expected to survive.
$$ \displaystyle\large \begin{array}{l}R\left( k \right)=1-\sum\limits_{j=0}^{k}{\frac{n!}{j!\left( n-j \right)!}{{p}^{j}}{{\left( 1-p \right)}^{n-j}}}\\R\left( k \right)=\sum\limits_{j=k+1}^{n}{\frac{n!}{j!\left( n-j \right)!}{{p}^{j}}{{\left( 1-p \right)}^{n-j}}}\end{array}$$
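The two forms of R(k) above should give the same number. The following sketch computes both and compares them; the function names and the values n = 12, p = 0.25 are hypothetical.

```python
from math import comb


def binom_pmf(j: int, n: int, p: float) -> float:
    """Probability of exactly j successes in n trials."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)


def reliability_complement(k: int, n: int, p: float) -> float:
    """First form: R(k) = 1 - F(k)."""
    return 1.0 - sum(binom_pmf(j, n, p) for j in range(k + 1))


def reliability_tail(k: int, n: int, p: float) -> float:
    """Second form: R(k) = sum of probabilities for j = k+1..n."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1, n + 1))


# Hypothetical example values
n, p = 12, 0.25
for k in range(n + 1):
    a = reliability_complement(k, n, p)
    b = reliability_tail(k, n, p)
    print(f"k={k:2d}  1-F(k)={a:.6f}  tail sum={b:.6f}")
```

When R(k) is very small, the tail sum is usually the better choice numerically, because subtracting a value close to 1 from 1 loses precision.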
Hazard Rate
The hazard rate is the conditional probability of exactly k successes given that at least k successes occur, h(k) = f(k) / [f(k) + R(k)].
$$ \displaystyle\large \begin{array}{l}h\left( k \right)={{\left[ 1+\frac{{{\left( 1+\theta \right)}^{n}}-\sum\limits_{j=0}^{k}{\left( \begin{array}{l}n\\j\end{array} \right){{\theta }^{j}}}}{\left( \begin{array}{l}n\\k\end{array} \right){{\theta }^{k}}} \right]}^{-1}}\\\text{where}\\\theta =\frac{p}{1-p}\end{array}$$
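One way to check the hazard rate is to compute it twice: once from the θ form, with the binomial coefficient inside the sum indexed by j so that the full sum over j = 0..n equals (1 + θ)^n, and once as f(k) / P(X ≥ k). The sketch below does both, assuming 0 < p < 1; the function names and the values n = 10, p = 0.2 are illustrative.

```python
from math import comb


def hazard_theta(k: int, n: int, p: float) -> float:
    """Hazard rate from the theta form, theta = p / (1 - p), 0 < p < 1."""
    theta = p / (1 - p)
    head = sum(comb(n, j) * theta**j for j in range(k + 1))
    tail = (1 + theta) ** n - head  # proportional to P(X > k)
    return 1.0 / (1.0 + tail / (comb(n, k) * theta**k))


def hazard_direct(k: int, n: int, p: float) -> float:
    """Hazard rate as f(k) / P(X >= k)."""
    def pmf(j: int) -> float:
        return comb(n, j) * p**j * (1 - p) ** (n - j)

    return pmf(k) / sum(pmf(j) for j in range(k, n + 1))


# Hypothetical example values
n, p = 10, 0.2
for k in range(n + 1):
    print(f"k={k:2d}  theta form={hazard_theta(k, n, p):.6f}  "
          f"direct={hazard_direct(k, n, p):.6f}")
```

At k = n the hazard rate is 1, since no larger count is possible.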