In hypothesis testing we have two mutually exclusive hypotheses:
- H0 or null hypothesis: the default assumption that nothing unusual is happening, so the data fall in the expected region;
- H1 or alternative hypothesis: the claim that something is happening, so the data fall outside the expected region. This is usually the effect we are looking for.
We test whether something is happening by looking at the data: if the data are unlikely enough under the null hypothesis, we reject it in favor of the alternative. We can make that statement with a chosen level of confidence.
You can use hypothesis testing for:
- Testing whether the data fit a data model. For example, we can check whether the data follow a normal distribution;
- Checking whether something has changed. For instance, after making a change in a process, we want to see whether it is a real improvement;
- Comparing statistics to a hypothesis about the population.
The main steps of a hypothesis test are:
- Formulating a hypothesis about the population;
- Determining the significance level;
- Collecting the sample from the population;
- Calculating the statistics based on the sample;
- Rejecting or failing to reject the null hypothesis based on the acceptance criteria.
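The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the sample is simulated with the standard library's `random` module, and the hypothesized proportion `p0 = 0.5`, the significance level `alpha = 0.05`, and the choice of a normal-approximation z-test for a proportion are all picked for the example rather than taken from the text.

```python
import math
import random
from statistics import NormalDist

# Step 1: formulate the hypothesis about the population.
# H0: the proportion of heads is p0 = 0.5 (assumed for this sketch).
p0 = 0.5

# Step 2: determine the significance level (0.05 is a common convention).
alpha = 0.05

# Step 3: collect the sample. Here we simulate 100 flips of a fair coin.
random.seed(0)  # fixed seed so the sketch is repeatable
n = 100
flips = [random.random() < 0.5 for _ in range(n)]

# Step 4: calculate the statistic from the sample.
heads = sum(flips)
p_hat = heads / n
standard_error = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / standard_error

# Two-sided p-value under the normal approximation.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: reject or fail to reject H0 based on the acceptance criterion.
reject_h0 = p_value <= alpha
print(f"heads={heads}, z={z:.2f}, p-value={p_value:.3f}, reject H0: {reject_h0}")
```

The normal approximation is reasonable for 100 flips. For small samples an exact binomial calculation is usually the safer choice.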
Example: Suppose that you want to test whether a coin is fair. If we make 100 flips and get 87 heads, can we say it's unfair? And if we get 53 heads? Some people would say that 87 is unfair but 53 is only normal variation, but what about 60? A hypothesis test helps to answer this question quantitatively.
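To make the coin question concrete, here is a small sketch that computes an exact two-sided p-value for a given number of heads in 100 flips. The helper name `two_sided_binomial_p_value` is hypothetical. The definition of "two-sided" used here (summing every outcome that is no more likely than the observed one) is one common convention, assumed for this example.

```python
from math import comb

def two_sided_binomial_p_value(heads: int, flips: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability, under H0, of every outcome
    that is no more likely than the observed number of heads."""
    pmf = [comb(flips, k) * p**k * (1 - p)**(flips - k) for k in range(flips + 1)]
    observed = pmf[heads]
    # Small relative tolerance guards against floating-point ties.
    return sum(q for q in pmf if q <= observed * (1 + 1e-9))

for heads in (87, 53, 60):
    p_value = two_sided_binomial_p_value(heads, 100)
    print(f"{heads} heads out of 100: p-value = {p_value:.4g}")
```

Under these assumptions, 87 heads gives a p-value far below any usual significance level, while 53 heads is entirely consistent with a fair coin. The 60-head case lands close to the conventional 0.05 threshold, which is why a quantitative test is more useful than intuition here.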
Using a smaller version of the coin example, with 10 flips instead of 100, you can identify these elements of hypothesis testing:
- Null hypothesis H0: The coin is fair, so the probability of heads is 0.5;
- Alternative hypothesis H1 (also called Ha): The coin is not fair, so the probability of heads isn't 0.5;
- Test statistic: X, the number of heads in 10 flips;
- Null distribution: the probability distribution of X under the null hypothesis, in this case a binomial distribution with 10 trials and a probability of heads of 0.5;
- Rejection region: the values of X that lead us to reject H0. Under the null hypothesis we expect about 5 heads in 10 flips, so the rejection region contains the counts far from 5, for example 0, 1, 2 and 8, 9, 10.
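The elements listed above can be computed directly for the 10-flip case. The sketch below builds the null distribution and measures how often the candidate region 0, 1, 2, 8, 9, 10 would be hit by a fair coin. The variable names are illustrative, and the region is the one suggested in the text, not a prescribed choice.

```python
from math import comb

n = 10   # number of flips
p = 0.5  # probability of heads under H0

# Null distribution: P(X = k) for k = 0..10 under a Binomial(10, 0.5).
null_pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

# Candidate rejection region taken from the text: very few or very many heads.
rejection_region = {0, 1, 2, 8, 9, 10}

# Probability of landing in the region when H0 is true (the significance level).
alpha = sum(null_pmf[k] for k in rejection_region)

for k, prob in null_pmf.items():
    marker = "reject H0" if k in rejection_region else ""
    print(f"{k:2d} heads: {prob:.4f} {marker}")
print(f"P(reject H0 | H0 true) = {alpha:.4f}")
```

This region is hit by a fair coin with probability 112/1024, roughly 0.11, so using it means accepting about an 11% chance of wrongly rejecting H0. A narrower region such as 0, 1, 9, 10 has probability 22/1024, roughly 0.02, which shows how the choice of rejection region and the significance level are tied together.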
We will look in the next chapter at more details of hypothesis testing, such as the types of error you can make when working with it and the confidence level of the result. It is also essential to know the correct type of test to use in each situation.