As with the Markov inequality, we can extract useful information from a list of values, say time-to-failure data. Unlike Markov, Chebyshev's inequality does not require the values to be non-negative, although life data rarely contains negative values anyway.
Being short on time is a common situation for reliability engineers. Suppose we have only the mean, the standard deviation, and the number of values in a list, and we need to say something about the data, such as the number or fraction of values above a specific value.
If the mean of a list of numbers is M and the standard deviation of the list is SD, then for every positive number k, the fraction of numbers in the list that are k × SD or farther from M is ≤ 1 / k².
This implies that not too many of the values in the list can be too far away from the mean. The result does not assume a particular distribution, and it holds even when the histogram of the data has a very irregular shape.
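To see the claim at work on an irregular list, here is a minimal Python sketch. The list `times` is made up for illustration (it is not real failure data), and the helper name `fraction_beyond` is hypothetical. The sketch assumes the population standard deviation (dividing by n), which is the form the inequality for a list of numbers uses.

```python
import statistics

def fraction_beyond(values, k):
    """Fraction of values that lie k or more population SDs from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (divide by n)
    far = [v for v in values if abs(v - mean) >= k * sd]
    return len(far) / len(values)

# Hypothetical, deliberately lopsided times to failure (days); not real data.
times = [5, 7, 8, 9, 10, 11, 12, 14, 15, 300]

for k in (1.5, 2, 3):
    observed = fraction_beyond(times, k)
    bound = 1 / k**2
    print(f"k={k}: observed fraction {observed:.2f}, Chebyshev bound {bound:.2f}")
```

The observed fraction should never exceed the bound, however skewed the list is. The bound is often far from tight, which is the price of making no distributional assumption.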
A quick calculation shows that the fraction of values 2 or more standard deviations from the mean is ≤ 1 / k² = 1 / 4, so at most 25% of the list values can be 2 or more SD from the mean. For 3 SD the bound is 1 / 9, or about 11.11%, and so on.
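The same quick calculation can be tabulated for several values of k. This is only a sketch; the function name `chebyshev_bound` is an assumption for illustration, and the result is capped at 1 because for k below 1 the raw value 1 / k² exceeds 100% and says nothing useful.

```python
def chebyshev_bound(k):
    """Upper bound on the fraction of values k or more SDs from the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A fraction cannot exceed 1, so cap the bound for k < 1.
    return min(1.0, 1.0 / k**2)

for k in (1, 2, 3, 4):
    print(f"{k} SD: at most {chebyshev_bound(k):.2%} of values")
```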
Example Problem
Let’s say we have time to failure data and have the mean and standard deviation only. The mean is 140 days with an SD of 30. What fraction of times to failure is between 90 and 190 days?
Solution
We can find a lower bound, not an exact answer, using Chebyshev's inequality. The range is plus or minus 50 days from the mean, which is 50 / 30 = 1.6667 standard deviations. Thus we are interested in 1 minus the fraction that is 1.6667 SD or farther from the mean.
1 / (1.6667)² = 0.36 = 36%
100% – 36% = 64%, meaning at least 64% of values are within 50 days of the mean.
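The arithmetic of this example can be reproduced with a short Python sketch. The function name `fraction_within_lower_bound` is hypothetical. It assumes the mean lies inside the interval, and for an interval that is not centered on the mean it falls back to the shorter distance, which still gives a valid (if more conservative) lower bound.

```python
def fraction_within_lower_bound(mean, sd, low, high):
    """Chebyshev lower bound on the fraction of values between low and high.

    Assumes low < mean < high. Uses the shorter distance to the mean,
    so the bound remains valid for an off-center interval.
    """
    k = min(mean - low, high - mean) / sd
    # The bound cannot be negative; for k <= 1 it is simply 0.
    return max(0.0, 1.0 - 1.0 / k**2)

# Values from the example: mean 140 days, SD 30 days, range 90 to 190 days.
bound = fraction_within_lower_bound(140, 30, 90, 190)
print(f"At least {bound:.0%} of times to failure are within the range")
```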
This is handy when the data does not follow a commonly used life distribution, like lognormal or Weibull, yet we need to estimate information from the data. Just another handy tool from the world of statistics. Enjoy.