Solved – Within transformation in fixed effect regression model

fixed-effects-model, panel-data, regression

I am working with a panel data model, in particular the fixed effects (or Least Squares Dummy Variables, LSDV) model.

I have learned that $b_{LSDV}$ can be computed by applying OLS to the usual equation $y=X\beta+D\alpha+\epsilon$, where $D$ is an $NT \times N$ matrix of dummies and $\alpha$ is an $N \times 1$ vector of individual effects.

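Now, I have found that another way to compute $b_{LSDV}$ is to apply the so-called within transformation to the usual model in order to obtain a demeaned version of it, i.e. $M_{[D]}y=M_{[D]}X\beta+M_{[D]}\epsilon$, where $M_{[D]}=I-D(D'D)^{-1}D'$ is the residual-maker matrix of $D$ (so that $M_{[D]}D=0$ and the $D\alpha$ term drops out).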

My question is: what is the difference between the two models? I've read that the second one is the one most used by econometric software; is that true, and if so, why?

Best Answer

The two are equivalent.

The second version uses the Frisch-Waugh-Lovell theorem, which says that you can compute a subset of the coefficients of a regression (here, $\hat\beta$) by (1) regressing $y$ on the other regressors (here, $D$) and saving the residuals (here, the time-demeaned $y$, or $M_{[D]}y$, because regressing on a full set of unit dummies just subtracts each unit's mean over time), then (2) regressing $X$ on $D$ and saving the residuals $M_{[D]}X$, and (3) regressing the first set of residuals on the second, i.e. $M_{[D]}y$ on $M_{[D]}X$.
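If it helps to see the equivalence numerically, here is a minimal sketch in NumPy on simulated data. The panel sizes, the data-generating process, and the helper name `demean_by_unit` are all assumptions made for illustration; nothing here is taken from a particular econometrics package.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, K = 50, 6, 2                      # units, periods, regressors (arbitrary sizes)
unit = np.repeat(np.arange(N), T)       # unit index of each of the N*T rows

# Simulated balanced panel; X is correlated with the individual effects
alpha = rng.normal(size=N)
beta = np.array([1.5, -0.7])
X = rng.normal(size=(N * T, K)) + alpha[unit, None]
y = X @ beta + alpha[unit] + rng.normal(size=N * T)

# Approach 1 (LSDV): OLS of y on [X, D], with D the NT x N dummy matrix
D = np.zeros((N * T, N))
D[np.arange(N * T), unit] = 1.0
coef, *_ = np.linalg.lstsq(np.hstack([X, D]), y, rcond=None)
b_lsdv = coef[:K]                       # first K entries are the slope estimates

# Approach 2 (within): subtract each unit's time average, then OLS
def demean_by_unit(a):
    """Subtract the per-unit mean from every row (works for 1-D and 2-D arrays)."""
    sums = np.zeros((N,) + a.shape[1:])
    np.add.at(sums, unit, a)            # accumulate rows into their unit's slot
    return a - (sums / T)[unit]

b_within, *_ = np.linalg.lstsq(demean_by_unit(X), demean_by_unit(y), rcond=None)

# Demeaning is the same as premultiplying by M_D = I - D (D'D)^{-1} D'
M_D = np.eye(N * T) - D @ np.linalg.solve(D.T @ D, D.T)

print(b_lsdv)
print(b_within)
```

The two printed vectors should agree up to floating-point error, and `M_D @ y` should match `demean_by_unit(y)` in the same sense.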

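The second version is indeed much more widely used, because typical panel data sets may have thousands of panel units, so the first approach would require running a regression with thousands of regressors. That is not a good idea numerically, even with today's fast computers, whereas the within transformation only requires computing unit means and then running a regression with as many regressors as there are columns in $X$.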
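To make the computational point concrete, here is a rough sketch of the within estimator on a larger simulated panel, computed without ever forming $D$. The sizes, the data, and the helper name `unit_means` are illustrative assumptions. The individual effects are recovered afterwards from the standard relation $\hat\alpha_i=\bar y_i-\bar x_i'\hat\beta$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, K = 20_000, 5, 3                  # a dense D would have N*T*N = 2e9 entries (~16 GB as float64)
unit = np.repeat(np.arange(N), T)

# Simulated panel with known coefficients (illustrative only)
alpha = rng.normal(size=N)
beta = np.array([0.5, -1.0, 2.0])
X = rng.normal(size=(N * T, K))
y = X @ beta + alpha[unit] + 0.1 * rng.normal(size=N * T)

counts = np.bincount(unit, minlength=N)  # observations per unit (also fine if unbalanced)

def unit_means(a):
    """Per-unit means of a 1-D or 2-D array, computed column by column."""
    cols = a.reshape(len(a), -1)
    sums = np.column_stack(
        [np.bincount(unit, weights=cols[:, j], minlength=N) for j in range(cols.shape[1])]
    )
    return (sums / counts[:, None]).reshape((N,) + a.shape[1:])

X_bar, y_bar = unit_means(X), unit_means(y)

# Within estimator: OLS on the demeaned data, only K regressors
b_within, *_ = np.linalg.lstsq(X - X_bar[unit], y - y_bar[unit], rcond=None)

# Recover the estimated individual effects without any dummy columns
alpha_hat = y_bar - X_bar @ b_within

print(b_within)
```

The regression here has only three columns, while the equivalent LSDV regression would have 20,003.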