Before talking about the Central Limit Theorem, we need to introduce the Law of Large Numbers.
If we compute the average of a sample, this average gets closer to the mean of the entire population as the sample size grows. This is because, as the number of observations increases, the random error of any single observation has less influence on the average.
In other words, the average of many independent samples is close to the mean of the distribution of the starting population. This is the Law of Large Numbers (LLN).
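A minimal sketch of the Law of Large Numbers using only Python's standard library (the fair-die population and the sample sizes are our own choices for illustration, not from the text): as n grows, the sample average drifts toward the true mean of 3.5.

```python
import random

random.seed(42)

# Population: rolls of a fair six-sided die, true mean = 3.5.
TRUE_MEAN = 3.5

for n in (10, 100, 10_000):
    sample = [random.randint(1, 6) for _ in range(n)]
    sample_mean = sum(sample) / n
    print(f"n = {n:>6}: sample mean = {sample_mean:.3f} (true mean = {TRUE_MEAN})")
```

With a small n the sample mean can easily land far from 3.5; with n = 10,000 it is typically within a few hundredths of it.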
Suppose we have X1, X2, ..., Xn independent random variables with the same distribution, the same mean m, and the same standard deviation sigma. If we define Xmean = (X1 + X2 + ... + Xn)/n, then Xmean is another random variable where:
- As n grows (roughly n > 30), Xmean gets close to the mean m (Law of Large Numbers);
- As n grows (roughly n > 30), the distribution of Xmean converges to a normal distribution with mean m and standard deviation sigma/sqrt(n) (this is the Central Limit Theorem);
In other words, if we collect a good number of samples, the mean of the sample averages is centered on the mean of the real population. In addition, the distribution of these sample averages is approximately normal.
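The sigma/sqrt(n) behavior can be checked numerically (a sketch using only the standard library; the uniform population and the number of trials are our own assumptions): we draw many samples of size n, average each, and compare the spread of those averages with sigma/sqrt(n).

```python
import math
import random
import statistics

random.seed(0)

n = 30          # size of each sample
trials = 5_000  # number of sample averages we compute

# Uniform(0, 1) population: m = 0.5, sigma = 1/sqrt(12) ≈ 0.2887.
sigma = 1 / math.sqrt(12)

# Each entry is Xmean for one sample of size n.
means = [statistics.fmean(random.random() for _ in range(n))
         for _ in range(trials)]

print(f"mean of the averages: {statistics.fmean(means):.4f}  (m = 0.5)")
print(f"std of the averages : {statistics.stdev(means):.4f}")
print(f"sigma / sqrt(n)     : {sigma / math.sqrt(n):.4f}")
```

The measured standard deviation of the averages matches sigma/sqrt(n) closely, even though the underlying uniform population is not normal at all.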
Example: we toss a coin 20 times and count the number of heads. We know that the starting distribution is binomial with a probability of 0.5. In a first experiment we get only 7 heads; in a second one, 11 heads. We continue up to 30 experiments, and at the end we create a bar chart where every bar counts the number of experiments with one head, two heads, and so on. We will see that this bar chart approximately depicts a normal distribution (2) with a mean of roughly 10, which is near the mean of the initial binomial distribution (1).
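The coin-toss experiment can be simulated directly (a sketch using the standard library; the 20 tosses and probability 0.5 come from the text, but we run 200 experiments instead of 30, our own choice, so the bell shape is clearly visible in the text-mode bar chart):

```python
import random
from collections import Counter

random.seed(1)

TOSSES = 20
EXPERIMENTS = 200

# One experiment = 20 fair-coin tosses; record the number of heads.
heads_counts = [sum(random.random() < 0.5 for _ in range(TOSSES))
                for _ in range(EXPERIMENTS)]

# Bar chart: one bar per possible number of heads (0..20).
counts = Counter(heads_counts)
for heads in range(TOSSES + 1):
    print(f"{heads:2d} heads | {'#' * counts[heads]}")
```

The bars pile up around 10 heads, the mean of the binomial(20, 0.5) distribution, and taper off symmetrically on both sides.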
The Central Limit Theorem implies that, by working on the averages of even small samples, we can use the normal distribution to evaluate the output of almost any process. And with the normal distribution, we have a lot of tools.
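As one example of those tools (our own illustration, with assumed values m = 50, sigma = 8, n = 36): once the CLT lets us treat Xmean as normal with standard deviation sigma/sqrt(n), the normal CDF, available in the standard library as statistics.NormalDist, gives us probabilities about the sample average.

```python
import math
from statistics import NormalDist

# Hypothetical process: mean m = 50, standard deviation sigma = 8.
m, sigma, n = 50, 8, 36

# By the CLT, the average of n = 36 measurements is approximately
# normal with mean m and standard deviation sigma / sqrt(n).
approx = NormalDist(mu=m, sigma=sigma / math.sqrt(n))

# Probability that the sample average falls within +/- 2 of the true mean.
p = approx.cdf(m + 2) - approx.cdf(m - 2)
print(f"P(48 <= Xmean <= 52) ≈ {p:.3f}")
```

Note that no assumption was needed about the shape of the process's own distribution; the CLT is what justifies the normal approximation for the average.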