Contingency Coefficient

A contingency table, as in the chi-squared test of independence, reveals whether two sets of data or groups are independent. It does not reveal the strength of any dependence. The contingency coefficient is a non-parametric measure of the strength of association for cross-classified data.

After calculating the chi-squared value from the contingency table exercise, we can use that value to determine the contingency coefficient with the following formula.

$$ \large\displaystyle C=\sqrt{\frac{{{\chi }^{2}}}{n+{{\chi }^{2}}}}$$

where

$$ \large\displaystyle {{\chi }^{2}}=\sum{\frac{{{\left( O-E \right)}^{2}}}{E}}$$

O is the observed count and E is the expected count in each cell of the table, and n is the total sample size.

See the article on the $\chi^2$ test of independence for information on the contingency table and calculations.
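The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `chi_squared` and `contingency_coefficient` and the counts in `table` are made up for the example, and the sketch assumes every row and column total is non-zero so that no expected count is zero.

```python
import math


def chi_squared(observed):
    """Return (chi-squared, n) for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence: (row total * column total) / n
            e = row_totals[i] * col_totals[j] / n
            chi2 += (o - e) ** 2 / e
    return chi2, n


def contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (n + chi2))."""
    return math.sqrt(chi2 / (n + chi2))


# Hypothetical 2x2 table of counts, for illustration only
table = [[10, 20],
         [20, 10]]
chi2, n = chi_squared(table)
c = contingency_coefficient(chi2, n)
print(f"chi-squared = {chi2:.3f}, n = {n}, C = {c:.3f}")
```

Because every term of the chi-squared sum is non-negative, C computed this way can never be negative.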

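If C is zero or very near zero, there is no association between the two groups. The closer C is to 1, the stronger the association. Because the chi-squared value can never be negative, C is bounded between 0 and 1 and does not indicate the direction of the association.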

One disadvantage of the coefficient is that it generally cannot reach 1, even when the two variables are completely dependent on one another. For a square table, you can determine the theoretical maximum value of C from r, the number of rows (which must equal the number of columns):

$$ \large\displaystyle {{C}_{\max }}=\sqrt{\frac{r-1}{r}}$$

For a two by two table, C has a possible maximum of 0.707.
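As a quick check of the maximum for other square tables, the formula can be evaluated directly. The helper name `c_max` is an assumption for this sketch, and it applies only to the square-table case described above.

```python
import math


def c_max(r):
    """Theoretical maximum of C for an r-by-r table: sqrt((r - 1) / r)."""
    if r < 2:
        raise ValueError("a contingency table needs at least 2 rows and columns")
    return math.sqrt((r - 1) / r)


for r in (2, 3, 6):
    print(f"{r} x {r} table: C_max = {c_max(r):.3f}")
# 2 x 2 table: C_max = 0.707
# 3 x 3 table: C_max = 0.816
# 6 x 6 table: C_max = 0.913
```

The maximum approaches 1 as the table grows, so a given C value means something different for a 2 by 2 table than for a 6 by 6 table.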

Example

Using the value of chi-squared from the test of independence example of 7.5, which has a sample size of 40, we find

$$ \large\displaystyle C=\sqrt{\frac{7.5}{40+7.5}}=0.397$$

This suggests a modest association between a person’s position and their opinion.
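The arithmetic in this example can be reproduced with a short Python check. The variable names are illustrative; the values 7.5 and 40 are the ones quoted above.

```python
import math

chi2 = 7.5  # chi-squared value from the test of independence example
n = 40      # total sample size

ratio = chi2 / (n + chi2)  # 7.5 / 47.5
c = math.sqrt(ratio)
print(f"ratio = {ratio:.4f}, C = {c:.3f}")
# ratio = 0.1579, C = 0.397
```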

Kendall Coefficient of Concordance (article)

Chi-Square Test of Independence (article)

Spearman Rank Correlation Coefficient (article)

Comments

Mr. T says
April 1, 2020 at 6:10 AM
Great article.
This was probably a typo and I’m sure you knew this, but I just had a small correction that “C” is bounded between 0 and 1 instead of -1 and 1. The reason for this is that the way we’ve constructed our test statistic, e.g., (O-E)^2/E, it can never be negative and consequently “C” can’t ever be negative.
This also makes intuitive sense in that the Chi-Squared Test of Independence only really states there’s no association (the null) vs. there’s an association (the alternative) and doesn’t assign the direction of that association.
Otherwise, great piece!
- Fred Schenkelberg says
  April 1, 2020 at 9:56 AM
  Thanks for the note and the article is updated now
  Cheers,
  Fred
Bharati says
November 22, 2022 at 12:38 AM
How do I find the maximum value of the contingency coefficient for a 6×6 contingency table?
- Fred Schenkelberg says
  November 22, 2022 at 7:59 PM
  Hi Bharati, not sure I understand the question. The value calculated from the data is used to determine if there are dependencies within the dataset. If the data has strong or many dependencies, I suppose the value would be rather high, something to experiment with using simulated data to check, I suppose. cheers, Fred
