Adding a quadratic variable to regression

Tags: econometrics, multiple regression, regression

In my textbook I often see a quadratic term in a regression. For example, suppose I have the regression model $\log(\mathrm{wage})=\beta_0+\beta_1\mathrm{educ}+\beta_2\mathrm{exp}+\beta_3\mathrm{exp}^2+\varepsilon$. Why is it so common to include the squared term? Here's a quote from my book:

It often makes sense to add quadratic terms of any significant variables to a model.

Why is that? And if this is true, then when I'm studying endogeneity and a quadratic term isn't in the model, is it likely that the squared term is endogenous if it's omitted?

Best Answer

Often the relationship between $y$ and $x$ is nonlinear, and there are a variety of ways to handle that. One is to add polynomial terms, and the first one to try is usually $x^2$. Before adding it, though, look at a scatterplot of $y$ against $x$, and look at the residuals from the linear model without the quadratic term: a curved pattern in either one suggests the straight line is missing something. In practice, many relationships are fit pretty well by $y \sim b_0 + b_1x + b_2x^2$ (plus any other x variables, of course).
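To make the residual check concrete, here is a minimal sketch in Python using only NumPy. The data are simulated (the coefficients, sample size, and noise level are my assumptions, chosen to mimic a concave wage-experience profile), and `fit_polynomial` is a hypothetical helper written for this illustration, not a library function.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Least-squares fit of y on 1, x, ..., x**degree.

    Returns (coefficients, residuals); coefficients run from the
    intercept up to the highest power.
    """
    X = np.vander(x, degree + 1, increasing=True)  # columns: 1, x, x^2, ...
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return coef, residuals


# Simulated data (assumed values, for illustration only):
# y rises with x at a decreasing rate.
rng = np.random.default_rng(0)
x = rng.uniform(0, 40, size=500)
y = 1.0 + 0.08 * x - 0.0015 * x**2 + rng.normal(0, 0.1, size=500)

coef_lin, res_lin = fit_polynomial(x, y, 1)
coef_quad, res_quad = fit_polynomial(x, y, 2)

# Residual sums of squares for the two nested models.
rss_lin = float(res_lin @ res_lin)
rss_quad = float(res_quad @ res_quad)

# Leftover curvature: residuals of the linear fit are uncorrelated with x
# by construction, but they can still track the centered square of x.
x_centered_sq = (x - x.mean()) ** 2
corr = float(np.corrcoef(res_lin, x_centered_sq)[0, 1])

# Implied turning point of the fitted quadratic, -b1 / (2 * b2).
turning_point = -coef_quad[1] / (2 * coef_quad[2])

print("RSS linear:", rss_lin, "RSS quadratic:", rss_quad)
print("corr(linear residuals, centered x^2):", corr)
print("turning point:", turning_point)
```

A clearly nonzero correlation between the linear-model residuals and the centered square is the numerical counterpart of seeing a curved band in the residual plot. Once the quadratic is in, the fitted slope at a given $x$ is $b_1 + 2b_2x$ rather than a single constant, which is what the squared term buys you.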

It is also possible to add cubic, quartic and even higher order terms, but such models quickly become hard to interpret. Another possibility is to look at a spline regression.
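As a rough illustration of the spline idea, here is a small sketch of a piecewise-linear regression spline built from a truncated-power basis and fit by least squares with NumPy. This is only one simple way to set up a spline regression; the knot locations, the simulated data, and the helper name `linear_spline_basis` are all assumptions made for the example.

```python
import numpy as np


def linear_spline_basis(x, knots):
    """Design matrix for a piecewise-linear regression spline.

    Columns: 1, x, and max(x - k, 0) for each knot k, so the fitted
    curve is continuous and its slope can change at every knot.
    """
    x = np.asarray(x, dtype=float)
    hinges = [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack([np.ones_like(x), x, *hinges])


# Simulated data (assumed for illustration): slope 2.0 up to x = 5,
# then slope 0.2, which no single straight line can follow.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, size=400))
y_true = np.where(x < 5, 2.0 * x, 10.0 + 0.2 * (x - 5))
y = y_true + rng.normal(0, 0.3, size=400)

knots = [2.5, 5.0, 7.5]  # assumed knot locations
X = linear_spline_basis(x, knots)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef

# Plain linear fit for comparison (first two columns: intercept and x).
X_lin = X[:, :2]
coef_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)

rss_spline = float(np.sum((y - fitted) ** 2))
rss_lin = float(np.sum((y - X_lin @ coef_lin) ** 2))

print("RSS linear:", rss_lin, "RSS spline:", rss_spline)
print("slope change at each knot:", coef[2:])
```

Each hinge coefficient is the change in slope at its knot, so the fit stays readable segment by segment, which is the appeal over stacking higher-order polynomial terms.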

I don't really understand your last bit about endogeneity. A variable can be endogenous or exogenous to a model, but that doesn't seem to relate to this ... at least, not in a way that's clear to me.
