A common assumption when comparing three or more normal population means is that the populations have similar (ideally equal) variances.
ANOVA and some DOE analysis results rely on the underlying data having similar variances. If this assumption is not true, the conclusions suggested by the ANOVA or DOE may be misleading.
It doesn’t take long to check.
Test for homogeneity of variance
In 1937, M. S. Bartlett published the paper “Properties of Sufficiency and Statistical Tests.”
The process to check if samples drawn from possibly different populations have equal variances later became known as Bartlett’s Test.
This is particularly useful when comparing population means and making the assumption the related variances are equal.
Bartlett’s Test follows the structure of a hypothesis test: set up the null and alternative hypotheses, calculate the test statistic, and compare it to a critical value to reach a conclusion.
Note: Bartlett’s test is not robust to departures from normality, so test for normality first. A more robust test for homogeneity of variance to consider is Levene’s Test, which is less sensitive to non-normal data.
Bartlett’s Test step-by-step
Hypotheses
The null hypothesis is that all k population variances being compared are equal.
$$ \displaystyle\large {{H}_{0}}:\sigma _{1}^{2}=\sigma _{2}^{2}=\ldots =\sigma _{k}^{2}$$
The alternative hypothesis is the population variances are not all equal. This means at least one is not equal to the others. The test does not explicitly determine which one is different, just that at least one is different.
A box plot may help identify the offending sample variance.
Test statistic
Draw a sample of size n_i from the i-th population, where i = 1, 2, …, k. Calculate the sample variance, s_i², from each sample.
Later we will need the degrees of freedom, υ_i. The i-th sample has υ_i = n_i – 1 degrees of freedom. The overall degrees of freedom for this test is
$$ \displaystyle\large \upsilon =\sum\limits_{i=1}^{k}{{{\upsilon }_{i}}}$$
The combined sample variance is
$$ \displaystyle\large {{s}^{2}}=\frac{\sum\limits_{i=1}^{k}{{{\upsilon }_{i}}s_{i}^{2}}}{\upsilon }$$
Finally, we can calculate the test statistic, M, which Bartlett determined as
$$ \displaystyle\large M=\upsilon \ln {{s}^{2}}-\sum\limits_{i=1}^{k}{{{\upsilon }_{i}}\ln s_{i}^{2}.}$$
Bartlett recognized that the test statistic is biased and suggested dividing M by
$$ \displaystyle\large C=1+\frac{1}{3\left( k-1 \right)}\left[ \left( \sum\limits_{i=1}^{k}{\frac{1}{{{\upsilon }_{i}}}} \right)-\frac{1}{\upsilon } \right].$$
Thus, use the corrected test statistic M / C.
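The calculation above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation. The function name `bartlett_statistic` and the sample data are made up for this example.

```python
import math
from statistics import variance


def bartlett_statistic(samples):
    """Return (M, C, M / C) for a list of samples, each a list of numbers.

    Hypothetical helper for illustration; it follows the formulas in the text.
    """
    k = len(samples)
    if k < 2:
        raise ValueError("need at least two samples")
    dof = [len(s) - 1 for s in samples]      # v_i = n_i - 1
    if min(dof) < 1:
        raise ValueError("each sample needs at least two observations")
    s2 = [variance(s) for s in samples]      # sample variances s_i^2
    v = sum(dof)                             # overall degrees of freedom
    pooled = sum(vi * si for vi, si in zip(dof, s2)) / v   # combined variance s^2
    # M = v ln(s^2) - sum(v_i ln(s_i^2)); a zero sample variance makes log() fail
    m = v * math.log(pooled) - sum(vi * math.log(si) for vi, si in zip(dof, s2))
    # Bartlett's correction factor C
    c = 1 + (sum(1 / vi for vi in dof) - 1 / v) / (3 * (k - 1))
    return m, c, m / c


# Made-up data: three samples of five readings each
groups = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [1, 2, 3, 4, 5]]
m, c, corrected = bartlett_statistic(groups)
print(f"M = {m:.4f}, C = {c:.4f}, M/C = {corrected:.4f}")
```

If SciPy is available, `scipy.stats.bartlett` computes the corrected statistic and a p-value directly. The sketch is only meant to make each term in the formulas visible.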
Critical value
Bartlett showed that the corrected statistic M/C is approximately distributed as chi-square (χ²) with k – 1 degrees of freedom. This approximation is suitable when each sample size, n_i, is at least 5.
The critical value is
$$ \displaystyle\large \chi _{1-\alpha ,k-1}^{2}$$
with 1 – α confidence and k – 1 degrees of freedom.
Conclusion
If M/C is greater than the critical value, reject the null hypothesis at the α significance level and conclude that at least one population variance is different from the others.
If M/C is less than or equal to the critical value, conclude there is insufficient evidence to reject the null hypothesis.
This doesn’t mean the variances are actually equal; we just do not have enough evidence to show that at least one is different.
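The decision step can also be sketched in Python. Rather than looking up a critical value in a table, this sketch computes the upper-tail chi-square probability (the p-value) for M/C and rejects when it falls below α, which is equivalent to M/C exceeding the critical value. The helper names `chi2_sf` and `bartlett_decision` are made up for this example, and the tail probability uses a standard recurrence for integer degrees of freedom so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-half)                 # closed form for df = 2
    else:
        a = 0.5
        q = math.erfc(math.sqrt(half))      # closed form for df = 1
    # Step up two degrees of freedom at a time:
    # Q(a + 1, z) = Q(a, z) + z^a e^(-z) / Gamma(a + 1), with z = x / 2
    while 2 * a < df:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1))
        a += 1.0
    return q


def bartlett_decision(corrected_stat, k, alpha=0.05):
    """Return (p_value, reject) for the corrected statistic M / C and k groups."""
    p_value = chi2_sf(corrected_stat, k - 1)
    return p_value, p_value < alpha


# Illustrative value only: M/C = 7.2 from k = 3 groups
p, reject = bartlett_decision(7.2, k=3)
print(f"p-value = {p:.4f}, reject H0: {reject}")
```

With SciPy installed, `scipy.stats.chi2.ppf(1 - alpha, k - 1)` returns the critical value itself, so M/C can be compared against it directly.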
Usman Rayyanu Dabai says
Thanks for this lit info. It helps a lot.
I would be very grateful if you could get back to me on how to use the Bartlett procedure when the sample size is extremely large.
Thanks to you, in anticipation.
Fred Schenkelberg says
Hi Usman,
Glad to hear the article is useful. For large sample sizes, no change that I know of to the procedure. What is it you are having trouble with when checking large populations? Be sure to check for normality first.
Cheers,
Fred
Ramses says
Is Bartlett’s test used as a replacement for ANOVA or are they normally used together?
Fred Schenkelberg says
If there is any doubt, or as a general practice, use Bartlett’s test to check that the variances are homogeneous (or similar) before continuing the analysis using DOE or ANOVA. Bartlett’s test is not a substitute for ANOVA; it is one method to check one of the common assumptions made when using ANOVA.
Cheers,
Fred
Ruth Palsson says
Hello,
How robust is this test for sample sizes of four groups of 10?
Thanks,
Ruth
Fred Schenkelberg says
Hi Ruth,
I do not know. Try running a simulation using variations in values to see if the technique consistently picks up on the differences or not.
cheers,
Fred
Ruth Palsson says
Thanks Fred,
Ruth
Jimmy bredal Wolfsen says
Can you refer me to some rigid calculation showing that M is biased and M/C is unbiased? I have an assignment where I need to describe the theory behind Bartlett’s test and the so-called adjustment factor, which I assume is similar to the correction.
Thanks Jimmy
Fred Schenkelberg says
Hi Jimmy,
Sorry, I am not aware of any such rigid calculations.
cheers,
Fred