Here’s an overview of Levene’s Test, a non-parametric test used to evaluate whether a set of samples has the same variance. If the variances are equal, the samples exhibit homogeneity of variances.
Some statistical tests assume equal variances across samples, such as analysis of variance and many types of hypothesis tests. Equal variance is also assumed in statistical process control when determining stability (often done with range (R) or standard deviation (S) charts).
The Bartlett Test is also useful for testing homogeneity of variances, yet it relies on the underlying data being normally distributed. If there is evidence the data are normal, the Bartlett Test performs better.
The F test is another common method to determine if variances are equal, and it likewise relies on the data coming from a normal distribution.
Thus, if the data is not normally distributed, then Levene’s Test is the one to use.
The sample sizes do not need to be equal, which is convenient.
Test Process
- The null hypothesis is
$$ \large\displaystyle {{H}_{o}}:\sigma _{1}^{2}=\sigma _{2}^{2}=\ldots =\sigma _{k}^{2}$$
- The alternative hypothesis is
$$ \large\displaystyle {{H}_{a}}:\sigma _{i}^{2}\ne \sigma _{j}^{2}$$
for at least one pair (i, j).
- Given a variable, Y, with a sample of size N divided into k subgroups, where Ni is the sample size of the ith subgroup, the Levene test statistic, W, is defined as
$$ \large\displaystyle W=\frac{\left( N-k \right)}{\left( k-1 \right)}\frac{\sum\nolimits_{i=1}^{k}{{{N}_{i}}{{\left( {{{\bar{Z}}}_{i.}}-{{{\bar{Z}}}_{..}} \right)}^{2}}}}{\sum\nolimits_{i=1}^{k}{\sum\nolimits_{j=1}^{{{N}_{i}}}{{{\left( {{Z}_{ij}}-{{{\bar{Z}}}_{i.}} \right)}^{2}}}}}$$
Where the basis for the Zij term could be mean, median or trimmed mean as follows:
$$ \large\displaystyle {{Z}_{ij}}=\left| {{Y}_{ij}}-{{{\bar{Y}}}_{i.}} \right|$$
Where Ȳi. is the mean of the i-th subgroup.
$$ \large\displaystyle {{Z}_{ij}}=\left| {{Y}_{ij}}-{{{\tilde{Y}}}_{i.}} \right|$$
Where $- {{{\tilde{Y}}}_{i.}} -$ is the median of the i-th subgroup.
$$ \large\displaystyle {{Z}_{ij}}=\left| {{Y}_{ij}}-\bar{Y}_{i.}^{‘} \right|$$
Where Ȳ′i. is the 10% trimmed mean of the i-th subgroup.
$- {{{\bar{Z}}}_{i.}} -$ are the group means of the $- {{{Z}}_{ij}} -$ and $- {{{\bar{Z}}}_{..}} -$ is the overall mean of the $- {{{Z}}_{ij}} -$.
The original paper by Levene used means. Later developments by Brown and Forsythe (1974) extended the test to use the median, which is best when the underlying data follow a skewed distribution. The trimmed mean approach is best when the underlying data follow a heavy-tailed distribution (such as the Cauchy). The mean approach is best when the data are symmetric and moderate-tailed.
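These three variants correspond to the `center` parameter of `scipy.stats.levene`, which defaults to the median (Brown-Forsythe) form. As a sketch (the sample data here are generated for illustration only):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
a = rng.normal(0, 1.0, 30)
b = rng.normal(0, 1.5, 40)   # sample sizes need not be equal
c = rng.normal(0, 1.0, 25)

# Brown-Forsythe variant (median-based), robust for skewed data
W, p = stats.levene(a, b, c, center='median')

# Original Levene (mean-based), for symmetric, moderate-tailed data
W_mean, p_mean = stats.levene(a, b, c, center='mean')

# Trimmed-mean variant, for heavy-tailed data
W_trim, p_trim = stats.levene(a, b, c, center='trimmed', proportiontocut=0.1)
```

A small p-value indicates evidence against the null hypothesis of equal variances.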
Critical Region
Reject the null hypothesis of equal variances at a significance level of α if the test statistic, W, is greater than Fα,k-1,N-k, where Fα,k-1,N-k is the upper critical value of the F distribution with k-1 and N-k degrees of freedom.
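The full procedure, computing W directly from the formula above (mean-based Z terms) and comparing it against the F critical value, can be sketched as follows; the helper function name and the sample data are illustrative, not from any particular library:

```python
import numpy as np
from scipy import stats

def levene_W(groups):
    """Levene's W using mean-based Z_ij = |Y_ij - Ybar_i.|."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    Z = [np.abs(g - g.mean()) for g in groups]          # Z_ij per subgroup
    Zbar_i = [z.mean() for z in Z]                      # subgroup means of Z
    Zbar = np.concatenate(Z).mean()                     # overall mean of Z
    num = sum(len(g) * (zi - Zbar) ** 2 for g, zi in zip(groups, Zbar_i))
    den = sum(((z - zi) ** 2).sum() for z, zi in zip(Z, Zbar_i))
    return (N - k) / (k - 1) * num / den

rng = np.random.default_rng(1)
groups = [rng.normal(0, s, n) for s, n in [(1.0, 20), (1.0, 25), (2.0, 30)]]

W = levene_W(groups)
k, N = len(groups), sum(len(g) for g in groups)
alpha = 0.05
F_crit = stats.f.ppf(1 - alpha, k - 1, N - k)  # upper critical value F(alpha, k-1, N-k)
reject = W > F_crit                            # reject H0 if W exceeds the critical value
```

The same decision can be made from the p-value returned by `scipy.stats.levene` with `center='mean'`, which computes this same statistic.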
Related:
Hypothesis Test Selection (article)
Hypothesis Tests for Variance Case I (article)
Two samples variance hypothesis test (article)