Before defining what a probability distribution is, and what a normal distribution is, we need to define a few other terms.

In probability, we have:

• Experiment: what we want to test;
• Sample space: the set of all possible outcomes;
• Event: a subset of the sample space;
• Probability function: the function that gives the probability of a certain event.

When all outcomes are equally likely, the basic probability of an event is calculated by dividing the number of outcomes in the event by the total number of outcomes in the sample space.

```Example:
Experiment 1: roll a six with one six-sided die;
Sample space 1: {1, 2, 3, 4, 5, 6};
So P("roll 6") = 1/6, that is about 0.17

Experiment 2: roll a total of six with two six-sided dice;
Sample space 2: {(1,1),(1,2)..(1,6),(2,1),(2,2)..(2,6),..(6,6)}, that is 36 outcomes;

So P("total of 6") = 5/36, that is about 0.14, because the combinations that add up to six when rolling two six-sided dice are: (1,5), (2,4), (3,3), (4,2), and (5,1)
```
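The two-dice example can be checked by listing the whole sample space and counting the outcomes that belong to the event. The following is a minimal sketch, assuming fair dice (all 36 outcomes equally likely); the variable names are only illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for two six-sided dice: all 36 ordered pairs (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

# Event: the two dice add up to six.
event = [outcome for outcome in sample_space if sum(outcome) == 6]

# With equally likely outcomes, P(event) = outcomes in the event / outcomes in the sample space.
probability = Fraction(len(event), len(sample_space))

print(event)               # [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
print(probability)         # 5/36
print(float(probability))  # about 0.139
```

Counting by enumeration only works for small sample spaces like this one, but it makes the definition of "event" as a subset of the sample space concrete.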

But how do we use probability in Lean Six Sigma?

First, you need to know that in most cases you can't analyze all the output of a process, so you need to work on a representative sample of it. When you have this sample, you can:

• Create descriptive statistics to study and better understand the nature of the data;
• Use the correct probability distribution to determine the probability of an event;
• Use that probability to answer questions (like the number of non-conforming units in a process) without analyzing the entire population.

We will see that probability distributions are also useful in other chapters, for example in hypothesis testing.

This chapter is about the normal distribution (also called the bell curve for its shape, or the Gaussian distribution) and how to check whether a distribution is normal. For other distributions, you can read chapter 3.1.2 Classes of Distributions.

In image1 you can see what a normal distribution looks like when we plot it as a histogram.

You can check whether a distribution looks normal by looking at a few points on the graph:

• The peak is in the middle of the distribution, and it sits at the mean.
• From this peak, the frequencies decrease on both sides.
• In a perfect normal distribution, the two sides of the mean are symmetric, but in reality it's ok even if they are not perfectly symmetric (like in image1).

You can also test the normality of a distribution with:

• The chi-square goodness-of-fit hypothesis test, explained in chapter 3.5.8, which applies only to categorical variables;
• The normal probability plot, which plots your sorted data against the corresponding values expected from a normal distribution; if the data is normal, the points fall close to a straight line;
• The Anderson-Darling goodness-of-fit hypothesis test, which is based on the statistic shown in image1.1. To learn how to use it, I suggest studying chapters 3.5.x.
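The idea behind the normal probability plot can be sketched without a plotting library: sort the data, pair each value with the quantile a standard normal distribution would give at the same position, and measure how close the pairs are to a straight line. This is only a rough sketch. The samples are simulated, the helper names `probability_plot_points` and `pearson_r` are hypothetical, and the plotting positions (i - 0.5) / n are one common convention among several. It is not a formal hypothesis test.

```python
import random
from statistics import NormalDist, mean


def probability_plot_points(data):
    """Pair each sorted observation with its theoretical standard-normal quantile."""
    ordered = sorted(data)
    n = len(ordered)
    # Plotting positions (i - 0.5) / n: an assumed convention, others exist.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return quantiles, ordered


def pearson_r(xs, ys):
    """Correlation coefficient: values close to 1 mean the points lie near a straight line."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the simulated samples are reproducible
normal_sample = [rng.gauss(10, 2) for _ in range(200)]      # simulated normal data
skewed_sample = [rng.expovariate(0.5) for _ in range(200)]  # simulated skewed data

r_normal = pearson_r(*probability_plot_points(normal_sample))
r_skewed = pearson_r(*probability_plot_points(skewed_sample))
print(f"normal sample: r = {r_normal:.3f}")
print(f"skewed sample: r = {r_skewed:.3f}")
```

The normal sample should give a correlation very close to 1, while the skewed sample bends away from the line and scores lower. In practice you would draw the plot with your statistical software and judge the straightness visually, or rely on a formal test.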

Other essential features of the normal distribution are:

• It can take various bell shapes, because its appearance depends on the mean and the standard deviation of the values.
• In a perfect one, about 68 percent of the data falls within plus or minus one standard deviation of the mean (in example 1 we have a mean of 10 and a standard deviation of 2, so this area goes from 8 to 12). About 95 percent of the data falls within plus or minus two standard deviations of the mean, and about 99.7 percent falls within plus or minus three standard deviations of the mean.
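These percentages can be reproduced from the cumulative distribution function of a normal distribution. The sketch below uses `NormalDist` from Python's standard library, with the mean of 10 and the standard deviation of 2 from example 1; the dictionary `shares` is just an illustrative name.

```python
from statistics import NormalDist

# Mean 10 and standard deviation 2, as in example 1.
dist = NormalDist(mu=10, sigma=2)

shares = {}
for k in (1, 2, 3):
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    # Share of the data between low and high = CDF(high) - CDF(low).
    shares[k] = dist.cdf(high) - dist.cdf(low)
    print(f"within {k} standard deviation(s): {low:g} to {high:g} -> {shares[k]:.1%}")

# within 1 standard deviation(s): 8 to 12 -> 68.3%
# within 2 standard deviation(s): 6 to 14 -> 95.4%
# within 3 standard deviation(s): 4 to 16 -> 99.7%
```

The rounded figures 68, 95, and 99.7 are the usual shorthand for these values.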

We can see the measures from the second point directly in image2.

But what if we want to know what percentage of observations is lower than a specific number, for example 6? We can use the Z-score formula in image3.
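As a worked example, here is a minimal sketch that assumes the formula in image3 is the usual Z-score definition, z = (x - mean) / standard deviation, and that reuses the mean of 10 and the standard deviation of 2 from example 1. The variable names are only illustrative.

```python
from statistics import NormalDist

mean, std_dev = 10, 2  # assumed: the same values as example 1
x = 6                  # the value we want to compare against

# Z-score: how many standard deviations x lies from the mean.
z = (x - mean) / std_dev

# Share of observations below x = area under the standard normal curve to the left of z.
share_below = NormalDist().cdf(z)

print(f"z = {z}, share below {x}: {share_below:.2%}")
# z = -2.0, share below 6: 2.28%
```

So in a perfectly normal process with these parameters, a little over 2 percent of the observations would be lower than 6. This agrees with the 95 percent rule: about 5 percent of the data lies outside plus or minus two standard deviations, half of it on each side.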