Solved – Multicollinearity and the intercept term with categorical variables

categorical data, multicollinearity, multiple regression

We're given a regression equation with two dummy variables that are perfectly collinear: $$ y_i = \beta_1 D1_i + \beta_2 D2_i + e_i,$$ where $D2_i = 1 - D1_i$. Can we estimate this model using least squares?

I thought we couldn't because of multicollinearity, but my teacher said we could because the model doesn't have an intercept term. I don't understand the implication of having or not having an intercept term for whether or not we can run the regression. Here is what my teacher said:

Yes, you can estimate this model using least squares: because there is no intercept term, running this regression won't lead to the problem of multicollinearity. It is equivalent to running two separate regressions of $y$ on $D_1$ and of $y$ on $D_2$, because $D_1$ and $D_2$ are orthogonal.
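
As a quick numerical check of this claim, here is a minimal sketch using numpy (the simulated data, sample size, and group means 3 and 5 are made up purely for illustration):

```python
import numpy as np

# Simulated data: two complementary dummies and made-up group means 3 and 5.
rng = np.random.default_rng(0)
n = 200
D1 = rng.integers(0, 2, size=n)          # dummy for group 1
D2 = 1 - D1                              # complementary dummy: D1 + D2 = 1
y = 3.0 * D1 + 5.0 * D2 + rng.normal(0.0, 1.0, size=n)

# No-intercept design matrix with both dummies as columns.
X = np.column_stack([D1, D2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                          # roughly [3, 5]

# Because D1 and D2 are orthogonal (their elementwise product is zero),
# each coefficient is simply the sample mean of y within its own group.
print(y[D1 == 1].mean(), y[D2 == 1].mean())
```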

Best Answer

Substitute $D2_i = 1 - D1_i$ into the model:

$$\begin{aligned} y_i &= \beta_1 D1_i + \beta_2 D2_i + e_i\\ &= \beta_1 D1_i + \beta_2 (1 - D1_i) + e_i\\ &= \beta_2 + (\beta_1 - \beta_2) D1_i + e_i \end{aligned}$$

In this sense, you will be able to uniquely estimate $\beta_2$ and $\beta_1 - \beta_2$, and hence $\beta_1$. You won't be able to do this if you also specify an intercept term.
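
To make this concrete, a short rank check (again just a sketch with numpy on simulated dummies) shows that the no-intercept design matrix has full column rank, while adding an intercept column, which equals $D1_i + D2_i$, makes the columns linearly dependent:

```python
import numpy as np

rng = np.random.default_rng(0)
D1 = rng.integers(0, 2, size=200)
D2 = 1 - D1

# Without an intercept: two columns, rank 2, so beta_1 and beta_2 are identified.
X_no_intercept = np.column_stack([D1, D2])
print(np.linalg.matrix_rank(X_no_intercept))   # 2

# With an intercept: the constant column equals D1 + D2, so the three columns
# are linearly dependent and the rank stays at 2, meaning the three
# coefficients are not separately estimable (perfect multicollinearity).
X_intercept = np.column_stack([np.ones(len(D1)), D1, D2])
print(np.linalg.matrix_rank(X_intercept))      # still 2
```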

For a more formal discussion, you can search for "estimability"; I think your question is really about estimability rather than multicollinearity.