In chapter 4.1, Simple Linear Regression, we looked at the linear relationship between two variables. In this chapter, we look at multiple linear regression, where several independent variables x1, x2, ..., xn affect a dependent variable Y.


In this case, the equation has the form:

Y = a + b1*x1 + b2*x2 + ... + bn*xn

where a is the intercept, b1, ..., bn are the coefficients, and n is the number of x variables.
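
To make the equation concrete, here is a minimal sketch of how a and b1, ..., bn could be estimated by ordinary least squares in Python with NumPy. The function name fit_multiple_regression and the small data set are made up for illustration; the data is generated without noise from Y = 1 + 2*x1 + 3*x2, so the fit should recover those values up to rounding.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Return (a, b) for Y = a + b1*x1 + ... + bn*xn, fitted by least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the intercept a is estimated with the slopes.
    design = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs[0], coeffs[1:]

# Hypothetical, noise-free data generated from Y = 1 + 2*x1 + 3*x2
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]])
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]

a, b = fit_multiple_regression(X, y)
print("intercept a:", round(float(a), 6))
print("coefficients b:", np.round(b, 6))
```

With real data the points will not lie exactly on the fitted equation, so the estimated a and b are only the values that minimize the squared errors.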

Using more than one X can, in some cases, give an equation that represents your data better, but you need to respect some rules:

  • With these tools, Y must be a parametric variable, while the X can be parametric or nominal;
  • Avoid multicollinearity, that is, high correlation among the X: if several X are strongly correlated with each other, you don't need all of them and can use just one;
  • Avoid using too many X;
  • Use enough data: for a simple linear regression about 30 observations is enough, and you can add about 15 observations for each additional variable.
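
Two of these rules can be checked with a few lines of Python. The sketch below is only an illustration: the function names, the 0.8 correlation threshold, and the data are assumptions, and the sample-size helper reads the rule of thumb as 30 observations for the first X plus 15 for each further X, which is one possible interpretation.

```python
import numpy as np

def highly_correlated_pairs(X, threshold=0.8):
    """Return (i, j, r) for each pair of X columns with |r| >= threshold."""
    X = np.asarray(X, dtype=float)
    # rowvar=False: each column is a variable, each row an observation.
    corr = np.atleast_2d(np.corrcoef(X, rowvar=False))
    n_vars = corr.shape[0]
    return [
        (i, j, float(corr[i, j]))
        for i in range(n_vars)
        for j in range(i + 1, n_vars)
        if abs(corr[i, j]) >= threshold
    ]

def rough_minimum_observations(n_predictors):
    """Assumed reading of the rule: 30 for the first X, 15 per extra X."""
    return 30 + 15 * (n_predictors - 1)

# Hypothetical data: x2 is exactly 2 * x1, x3 is uncorrelated with both.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = 2 * x1
x3 = np.array([2.0, -1.0, 0.0, -1.0, 2.0])
X = np.column_stack([x1, x2, x3])

pairs = highly_correlated_pairs(X)
print(pairs)  # only the (x1, x2) pair, columns 0 and 1, is flagged
print(rough_minimum_observations(3))  # 60
```

A flagged pair suggests keeping only one of the two X. The threshold is a judgment call, not a fixed rule.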

You can easily fit and plot a multiple regression with Excel; for this reason, I don't include the formulas here.
