A contingency table analysis, such as the chi-squared test of independence, reveals whether two sets of data or groups are independent. It does not reveal the strength of any dependence. The contingency coefficient is a non-parametric measure of association for cross-classified data.
After calculating the chi-squared value from the contingency table exercise, we can use that value to determine the contingency coefficient with the following formula.
$$ \large\displaystyle C=\sqrt{\frac{{{\chi }^{2}}}{n+{{\chi }^{2}}}}$$
where
$$ \large\displaystyle {{\chi }^{2}}=\sum{\frac{{{\left( O-E \right)}^{2}}}{E}}$$
and n is the total sample size. In the sum, O is the observed count and E is the expected count for each cell of the table.
See the article on the $\chi^2$ test of independence for information on the contingency table and calculations.
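As a rough illustration of the two formulas above, here is a minimal Python sketch. The function names and the 2 by 2 table of counts are made up for this example and do not come from the test of independence article. Expected counts are taken as (row total × column total) / n, the usual calculation for a test of independence.

```python
import math

def chi_squared(table):
    """Return (chi-squared, n) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence for cell (i, j)
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (n + chi2))."""
    return math.sqrt(chi2 / (n + chi2))

# Hypothetical observed counts, for illustration only
observed = [[10, 20],
            [20, 10]]

chi2, n = chi_squared(observed)
c = contingency_coefficient(chi2, n)
print(f"chi-squared = {chi2:.3f}, n = {n}, C = {c:.3f}")
# prints: chi-squared = 6.667, n = 60, C = 0.316
```

The sketch uses only the standard library so the arithmetic stays visible; a statistics package would normally handle the chi-squared step.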
If C is zero or very near zero, there is little or no association between the two groups. The closer C is to 1, the stronger the association. Because the chi-squared statistic is never negative, C is never negative either, so it does not indicate the direction of the association.
One disadvantage of the coefficient is that it generally cannot reach 1, even when the two variables are completely dependent on one another. For a square table, you can determine the theoretical maximum value of C from r, the number of rows (which must equal the number of columns):
$$ \large\displaystyle {{C}_{\max }}=\sqrt{\frac{r-1}{r}}$$
For a two by two table, C has a possible maximum of 0.707.
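To see how the ceiling changes with table size, the short Python sketch below evaluates the maximum for a few square tables. The function name `c_max` and the choice of sizes are only for illustration.

```python
import math

def c_max(r):
    """Theoretical maximum of C for a square r-by-r table."""
    return math.sqrt((r - 1) / r)

for r in (2, 3, 6):
    print(f"{r} x {r} table: C_max = {c_max(r):.3f}")
# prints:
# 2 x 2 table: C_max = 0.707
# 3 x 3 table: C_max = 0.816
# 6 x 6 table: C_max = 0.913
```

The ceiling moves toward 1 as the table gets larger, so C values from tables of different sizes are not directly comparable.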
Example
Using the chi-squared value of 7.5 from the test of independence example, which has a sample size of 40, we find
$$ \large\displaystyle C=\sqrt{\frac{7.5}{40+7.5}}=0.397$$
This suggests a modest association between a person’s position and their opinion.
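The arithmetic in this example can be checked with a few lines of Python. The variable names are arbitrary; the values 7.5 and 40 are the ones quoted in the example.

```python
import math

chi2 = 7.5   # chi-squared value from the test of independence example
n = 40       # total sample size

c = math.sqrt(chi2 / (n + chi2))
print(round(c, 3))  # prints 0.397
```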
Related:
Kendall Coefficient of Concordance (article)
Chi-Square Test of Independence (article)
Spearman Rank Correlation Coefficient (article)
Mr. T says
Great article.
This was probably a typo and I’m sure you knew this, but I just had a small correction that “C” is bounded between 0 and 1 instead of -1 and 1. The reason for this is that the way we’ve constructed our test statistic, e.g., (O-E)^2/E, it can never be negative and consequently “C” can’t ever be negative.
This also makes intuitive sense in that the Chi-Squared Test of Independence only really states there’s no association (the null) vs. there’s an association (the alternative) and doesn’t assign the direction of that association.
Otherwise, great piece!
Fred Schenkelberg says
Thanks for the note and the article is updated now
Cheers,
Fred
Bharati says
How do I find the maximum value of the contingency coefficient for a 6*6 contingency table?
Fred Schenkelberg says
Hi Bharati, not sure I understand the question. The value calculated from the data is used to determine if there are dependencies within the dataset. If the data has strong or many dependencies, I suppose the value would be rather high. That is something to check by experimenting with simulated data, I suppose. cheers, Fred