Over the past few weeks, we have explored about 8 different hypothesis test formulas. There are more. So, how do you determine which test to perform? Well, that depends on the question you are trying to answer and the type of data you’re dealing with.
Types of Data
First, consider the type of data you have or will gather for the hypothesis test. If the data is variables data, then you will probably use something based on the normal distribution. There are tests based on other distributions, but these are not commonly explored with the CRE. Be careful to verify your data is normal, or normal enough to work with a t-test, before drawing conclusions. Testing for normality is a subject of another post (to be written).
Sometimes the data is paired, meaning the samples have some connection to each other and differ only in how they are treated. An example is using different materials on the right and left shoes in a pair and asking people to wear the unique pairs while monitoring wear resistance during normal use. The shoes are the same design except for the one change in material, and each person exposes both shoes of the pair to the same stress. The paired t-test is the approach here.
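As a rough illustration of the shoe example, here is a minimal sketch of the paired t statistic using only the Python standard library. The function name and the wear measurements are made up for illustration. The statistic still has to be compared to a critical value from a t table with the returned degrees of freedom.

```python
import math
import statistics

def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for matched samples."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # The paired test works on the within-pair differences.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical wear (mm) for material A (left shoe) and material B (right shoe)
material_a = [1.2, 1.5, 1.1, 1.4, 1.3]
material_b = [1.0, 1.2, 1.0, 1.1, 1.2]
t, df = paired_t_statistic(material_a, material_b)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

Because each person wears both materials, the person-to-person variation cancels out in the differences, which is why the paired form is preferred over a two-sample test here.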
If the data is discrete, then either binomial or Poisson are your friends. Binomial for proportions and on/off or pass/fail (Bernoulli trials) type data. Poisson for counts of defects or faults, for example.
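To make the two discrete cases concrete, here is a small sketch of the binomial and Poisson probability calculations that such tests rest on. The function names and the numbers in the usage lines are assumptions chosen for illustration only.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k failures in n pass/fail trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_upper_tail(k, n, p):
    """Probability of k or more failures in n trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

def poisson_pmf(k, lam):
    """Probability of exactly k defects when lam defects are expected."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical: 10 units on test, claimed failure probability of 0.1
print(binomial_pmf(0, 10, 0.1))         # chance of no failures
print(binomial_upper_tail(3, 10, 0.1))  # chance of 3 or more failures
# Hypothetical: 2 defects expected per board
print(poisson_pmf(0, 2.0))              # chance of a defect-free board
```

A small upper-tail probability for the observed number of failures is evidence against the claimed failure probability.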
Types of Questions
Generally, we are interested in the mean for a comparison to a specification or known value. Sometimes we want to do a comparison of two population means. Another item of interest may be the exploration of changes in the spread of the population (variance).
For a complete study, beyond verifying the distribution of the data (normal, binomial, etc.), we also need to establish whether or not we know the population variance. With two populations, we also need to decide whether or not their variances can be considered equal. In general, when comparing two populations, check the variances first to determine if they are equal or not, then test the means.
For example, for variables data:
If we know the population variance (which is not common, BTW) then we can use the Z-test. When we have to estimate the population variance from the sample, then use the t-test.
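The difference between the two cases shows up in the denominator of the test statistic. The sketch below, with made-up data and hypothetical function names, computes the Z statistic using a known sigma and the t statistic using the sample standard deviation. The two-sided p-value is only computed for the Z case, since the standard library has a normal distribution but no t distribution.

```python
import math
import statistics
from statistics import NormalDist

def z_test(sample, mu0, sigma):
    """One-sample Z-test with known population standard deviation sigma."""
    n = len(sample)
    z = (statistics.mean(sample) - mu0) / (sigma / math.sqrt(n))
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

def t_statistic(sample, mu0):
    """One-sample t statistic; sigma is estimated from the sample."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1)
    t = (statistics.mean(sample) - mu0) / (s / math.sqrt(n))
    return t, n - 1  # compare t to a t table with n - 1 degrees of freedom

# Hypothetical measurements against a specification of 10.0
sample = [10.2, 9.8, 10.1, 10.4, 10.0]
z, p = z_test(sample, mu0=10.0, sigma=0.2)
t, df = t_statistic(sample, mu0=10.0)
print(f"z = {z:.3f}, p = {p:.3f}; t = {t:.3f}, df = {df}")
```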
If we know the population variance (or both population variances) there is no need to test if they are equal or not. When estimating variances from samples we need to check, as the t-test changes slightly depending on whether the variances are equal or not. When estimating the two population variances, we can use the F-test for variances. When comparing a sample variance to a known population variance, the χ² test works well.
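Both variance tests reduce to simple ratios. Here is a minimal sketch with hypothetical function names and data. Putting the larger variance in the numerator of the F ratio is a common table-lookup convention, not something required by the test itself. Critical values still come from F and χ² tables.

```python
import statistics

def f_ratio(sample_1, sample_2):
    """Return (F, df_numerator, df_denominator) for two sample variances."""
    v1 = statistics.variance(sample_1)
    v2 = statistics.variance(sample_2)
    # Convention assumed here: larger sample variance goes on top.
    if v1 >= v2:
        return v1 / v2, len(sample_1) - 1, len(sample_2) - 1
    return v2 / v1, len(sample_2) - 1, len(sample_1) - 1

def chi_square_variance_statistic(sample, sigma0_squared):
    """Return (chi_square, df) comparing a sample variance to a known variance."""
    n = len(sample)
    chi_square = (n - 1) * statistics.variance(sample) / sigma0_squared
    return chi_square, n - 1

# Hypothetical data
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0]
print(f_ratio(a, b))
print(chi_square_variance_statistic(a, sigma0_squared=2.0))
```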
When exploring two normal populations, if the variances are equal use the pooled variance t-test, and when not equal use the unequal variance t-test (of course).
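The two forms differ in how the standard error and degrees of freedom are computed. The sketch below uses hypothetical function names and made-up data. The unequal variance version shown is the Welch form with Welch-Satterthwaite degrees of freedom, which is one common way to handle unequal variances and may differ in detail from the formula in your reference text.

```python
import math
import statistics

def pooled_t(sample_1, sample_2):
    """Two-sample t statistic assuming equal population variances."""
    n1, n2 = len(sample_1), len(sample_2)
    v1, v2 = statistics.variance(sample_1), statistics.variance(sample_2)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    diff = statistics.mean(sample_1) - statistics.mean(sample_2)
    t = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def unequal_variance_t(sample_1, sample_2):
    """Two-sample t statistic without assuming equal variances (Welch)."""
    n1, n2 = len(sample_1), len(sample_2)
    v1, v2 = statistics.variance(sample_1), statistics.variance(sample_2)
    se_squared = v1 / n1 + v2 / n2
    diff = statistics.mean(sample_1) - statistics.mean(sample_2)
    t = diff / math.sqrt(se_squared)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = se_squared**2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Hypothetical data with equal spread and equal sample sizes
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 4.0, 5.0, 6.0]
print(pooled_t(x, y))
print(unequal_variance_t(x, y))
```

With equal sample sizes and equal sample variances, as in this data, the two functions give the same t and the same degrees of freedom. They diverge as the variances or sample sizes become unbalanced.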
Maybe a flow chart would be helpful.
[Figure: flow chart for selecting a hypothesis test]
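The selection rules described above can also be written as a small function. This is only a sketch of the rules stated in this post, not a reproduction of any particular chart. The function name, argument names, and labels are all assumptions made for illustration.

```python
def choose_test(data_type, target="mean", groups=1, paired=False,
                variance_known=False, equal_variances=None):
    """Suggest a test name from the data type and what is known.

    data_type: "variables", "pass_fail", or "count" (labels assumed here)
    target: "mean" or "variance"
    equal_variances: True, False, or None if not yet checked
    """
    if data_type == "pass_fail":
        return "binomial test"
    if data_type == "count":
        return "Poisson test"
    if data_type != "variables":
        raise ValueError(f"unknown data type: {data_type}")
    if target == "variance":
        # Two estimated variances: F-test. One sample vs known variance: chi-square.
        return "F-test for variances" if groups == 2 else "chi-square test for variance"
    if paired:
        return "paired t-test"
    if variance_known:
        return "Z-test"
    if groups == 1:
        return "t-test"
    if equal_variances is None:
        return "F-test for variances, then t-test"
    return "pooled variance t-test" if equal_variances else "unequal variance t-test"

print(choose_test("variables", groups=2, equal_variances=False))
```

All of the variables-data branches assume the data is normal or normal enough, so check that before trusting the suggestion.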
There are many ways to test a hypothesis, and each will depend on the data, the question (hypothesis), and what is known or not known about the population and sample. Done well, you can draw conclusions and make decisions. Done poorly, you will have a number that may or may not be useful.
In the CRE, take care to determine which type of test to use first, then double check your calculations.
Suprasad Amari says
Fred, it’s excellent.
You may also consider providing more details on these tests. I will read your other posts related to this topic.
Thanks
Sup
Fred Schenkelberg says
Hi Sup,
Glad you like the post and thanks for your kind words.
Please do check out the other posts as they typically have a worked out example. I’m also exploring a more interactive approach, yet may have to move the site to make that happen. Still exploring though.
I know you know this material and if you’d like to contribute articles, that would be great. I’m sure those preparing for the CRE exam would appreciate a new voice once in a while.
cheers,
Fred