Solved – When is it better to use Multiple Linear Regression instead of Polynomial Regression

Tags: bias-variance tradeoff, model selection, multiple regression, polynomial, regression

In my course I've just learnt Multiple Linear Regression and Polynomial Regression.

Why would you ever use Multiple Linear Regression when Polynomial Regression will always fit the data better?

Best Answer

First of all, note that polynomial regression is a special case of multiple linear regression: the model is still linear in the coefficients, the regressors are just powers of a single variable.
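
To make that concrete, here is a minimal sketch (with simulated data, so the numbers are illustrative only) showing that fitting a cubic polynomial in $x_1$ is nothing more than ordinary least squares on a design matrix whose columns are $x_1$, $x_1^2$, $x_1^3$:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(-2, 2, size=100)
y = 1.5 * x1 - 0.8 * x1**2 + 0.3 * x1**3 + rng.normal(scale=0.5, size=100)

# "Multiple regression" design matrix whose columns are powers of x1.
X = np.column_stack([x1, x1**2, x1**3])

# Solving the linear least-squares problem recovers the polynomial coefficients.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # estimates of beta_1, beta_2, beta_3
```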


Let's consider three models:

Model1:

$Y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \epsilon$

Model2:

$Y = \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_1^3 + \epsilon$

Model3:

$Y = \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_1^3 + \beta_4 x_2 + \beta_5 x_3 + \epsilon$


Of course, Model3 would explain the most variation in the data. However, if you are worried about overfitting, you might prefer Model1 or Model2. Also, if some of the five $\beta$s are not significant, you can exclude them. If there is no non-linearity, go for Model1. If only one explanatory variable has a significant effect, but its relationship with the response is non-linear, go for Model2.
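
One way to see this trade-off in practice is to compare the three candidate models on held-out data. The sketch below uses simulated data (where the true relationship involves $x_1$, $x_1^2$, and $x_2$) and a hypothetical 70/30 split; with your own data the conclusions could differ:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 200
x1, x2, x3 = rng.normal(size=(3, n))
y = 2.0 * x1 + 0.5 * x1**2 + 1.0 * x2 + rng.normal(scale=1.0, size=n)

designs = {
    "Model1": np.column_stack([x1, x2, x3]),
    "Model2": np.column_stack([x1, x1**2, x1**3]),
    "Model3": np.column_stack([x1, x1**2, x1**3, x2, x3]),
}

for name, X in designs.items():
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    fit = LinearRegression().fit(X_tr, y_tr)
    # Test-set R^2 rewards fit but penalizes terms that only memorize noise.
    print(name, "test R^2:", round(fit.score(X_te, y_te), 3))
```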


You can use variable selection to check which variables are relevant for the model. One well-known variable selection algorithm is Boruta, but you can also select variables using information criteria such as AIC and BIC.
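
As a rough sketch of the AIC/BIC route (again on simulated data, and with an intercept added for a realistic fit, which the model equations above omit), you can fit each candidate with statsmodels and compare the criteria; lower values indicate a better balance of fit and complexity. Boruta itself is a separate package and is not shown here:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 200
x1, x2, x3 = rng.normal(size=(3, n))
y = 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

candidates = {
    "Model1": np.column_stack([x1, x2, x3]),
    "Model2": np.column_stack([x1, x1**2, x1**3]),
    "Model3": np.column_stack([x1, x1**2, x1**3, x2, x3]),
}

for name, X in candidates.items():
    res = sm.OLS(y, sm.add_constant(X)).fit()
    print(f"{name}: AIC={res.aic:.1f}, BIC={res.bic:.1f}")
```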
