Solved – Package plm random effect residuals


Simply put, I'd like to know how the plm package in R calculates the residuals of a random-effect regression.

I ask because I'm getting some "weird" output. Let me reproduce it here using the Grunfeld data for four firms, as Gujarati does in his Basic Econometrics:

require(plm)
require(foreign)

Grunfeld<-read.dta("Data.dta")
Grunfeld<-pdata.frame(Grunfeld,index = c("id","t"))

grun.re <- plm(Y~X2+X3,data=Grunfeld,model="random",index="id")

#Means by id
X2M<-tapply(Grunfeld$X2,Grunfeld$id,FUN = mean)
X3M<-tapply(Grunfeld$X3,Grunfeld$id,FUN = mean)
YM<-tapply(Grunfeld$Y,Grunfeld$id,FUN = mean)

#Random Effect: Fit the model and then calculate the residuals "by hand"
fit.re<-grun.re$coefficients[1]+grun.re$coefficients[2]*Grunfeld$X2+grun.re$coefficients[3]*Grunfeld$X3
calcResid.re<-(Grunfeld$Y-fit.re)

#Random Effect:
head(cbind(grun.re$residuals,Grunfeld[,11:13],calcResid.re))

  grun.re$residuals   alphaRE       eRE        uRE calcResid.re
1         99.395803 -169.9282 116.23154  -53.69666    -53.69666
2         18.023715 -169.9282  34.85946 -135.06874   -135.06874
3        -39.256625 -169.9282 -22.42089 -192.34909   -192.34908
4         -2.857048 -169.9282  13.97869 -155.94951   -155.94951
5        -28.334107 -169.9282 -11.49837 -181.42656   -181.42656
6          6.475226 -169.9282  23.31096 -146.61723   -146.61723

In this table, uRE is the overall residual of the regression as reported by Stata (which is identical to Gretl's), and calcResid.re is the residual calculated manually from the fitted model. So Stata, Gretl and I all did the same thing. But what does the plm package do?

We can see that calcResid.re and uRE are equal, but the residuals returned by the plm estimation (grun.re$residuals) are completely different.

Here is a link to the dataset and results: https://github.com/rrremedio/shared_folder/blob/master/Data.dta

Best Answer

Thank you, Helix. I hope I am not breaking any rule of etiquette by answering my own question. In fact, this question is related to this one. Still, I will try to give an answer from the econometrician's point of view.

After a long time, I realized that a random effects estimate amounts to running a demeaned regression, as stated in equation 6 of the plm package paper here. However, I find their notation a little "unrelated" to the rest of the paper.

Following Cameron and Trivedi, Microeconometrics: Methods and Applications, the feasible GLS estimator can be implemented by running OLS on the demeaned equation. This is Cameron & Trivedi's equation 21.43 (the same as the one cited above):

$$ y_{it}-\widehat{\theta}\,\overline{y}_{i}=(1-\widehat{\theta})\mu+(x_{it}-\widehat{\theta}\,\overline{x}_{i})'\beta+\upsilon_{it}$$

Where:

$$\widehat{\theta}=1-\frac{\sigma_\epsilon}{(T\sigma_\alpha^2+\sigma_\epsilon^2)^{1/2}}$$
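
To make the formula concrete, here is a minimal R sketch of it. The function name `theta_hat` and the input values are made up for illustration; they are not estimates from the Grunfeld data.

```r
# theta = 1 - sigma_eps / sqrt(T * sigma_alpha^2 + sigma_eps^2)
# (hypothetical helper; n_periods plays the role of T)
theta_hat <- function(sigma_eps, sigma_alpha, n_periods) {
  1 - sigma_eps / sqrt(n_periods * sigma_alpha^2 + sigma_eps^2)
}

theta_hat(sigma_eps = 1, sigma_alpha = 0, n_periods = 20)  # 0
theta_hat(sigma_eps = 1, sigma_alpha = 1, n_periods = 3)   # 0.5
```

With no individual variance, theta is 0 and nothing is subtracted (pooled OLS). As the individual variance or T grows, theta approaches 1 and the transformation approaches the full within (fixed effects) demeaning.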

The plm package calculates theta and stores it in the regression object.
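
As far as I can tell, the estimate sits in the `ercomp` element of the fitted object. A minimal sketch, assuming the Grunfeld data bundled with plm (10 firms, 20 years, columns `inv`, `value`, `capital`) rather than the four-firm Data.dta used in the question:

```r
library(plm)

# Assumption: plm's own copy of the Grunfeld data, not the question's Data.dta
data("Grunfeld", package = "plm")

re <- plm(inv ~ value + capital, data = Grunfeld,
          index = c("firm", "year"), model = "random")

# For a balanced panel this is a single number
theta <- as.numeric(re$ercomp$theta)
theta
```

`summary(re)` prints the same value together with the estimated variance components.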

The error term of the demeaned equation is $$\upsilon_{it}=(1-\widehat{\theta})\alpha_i+(\epsilon_{it}-\widehat{\theta}\,\overline{\epsilon}_{i})$$

In a Random Effects model, the plm regression residuals are, in fact, the upsilon as above.
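
One way to check this is to do the demeaning by hand and run plain OLS. The sketch below again assumes plm's bundled Grunfeld data (a balanced panel, so theta is a scalar); the helper `qd` is my own name, not a plm function.

```r
library(plm)

# Assumption: plm's own copy of the Grunfeld data, not the question's Data.dta
data("Grunfeld", package = "plm")

re <- plm(inv ~ value + capital, data = Grunfeld,
          index = c("firm", "year"), model = "random")
theta <- as.numeric(re$ercomp$theta)

# Demean by hand: z_it - theta * (firm mean of z)
qd <- function(z) z - theta * ave(z, Grunfeld$firm)

# OLS on the transformed variables; the intercept estimates (1 - theta) * mu
ols <- lm(qd(inv) ~ qd(value) + qd(capital), data = Grunfeld)

# Should be TRUE if plm returns the residuals of the transformed regression
all.equal(as.numeric(residuals(re)), unname(residuals(ols)))
```

The slope coefficients of `ols` should also match those of `re`, and dividing the `ols` intercept by `1 - theta` should give back the plm intercept.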

However, if we calculate the residuals by hand, u=Y-XB, we obtain what Stata calls the overall error of the model. In a random effects model it is $$u_{it}=\alpha_{i}+\epsilon_{it}$$

Here alpha is the individual-specific effect and epsilon is the idiosyncratic error. Once we obtain the random-effects alpha via the shrinkage factor (as the cited related question does), we can recover the idiosyncratic error.
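
The two kinds of residual are linked by a simple identity. Writing the demeaned equation in terms of $u_{it}=y_{it}-\mu-x_{it}'\beta$ gives

$$\upsilon_{it}=u_{it}-\widehat{\theta}\,\overline{u}_{i}$$

and, because $1-\widehat{\theta}=\sigma_\epsilon/(T\sigma_\alpha^2+\sigma_\epsilon^2)^{1/2}$, the usual shrinkage factor $T\sigma_\alpha^2/(T\sigma_\alpha^2+\sigma_\epsilon^2)$ equals $1-(1-\widehat{\theta})^2$. The following is a minimal sketch for a balanced panel, assuming the shrinkage estimate is $\widehat{\alpha}_i$ = factor times $\overline{u}_{i}$. The function name `decompose_re` and the toy numbers are made up for illustration.

```r
# Split the overall residual u into alpha (shrunken firm mean) and e,
# and also return the transformed residual upsilon = u - theta * u_bar.
# Hypothetical helper; balanced panel and scalar theta assumed.
decompose_re <- function(u, id, theta) {
  u_bar  <- ave(u, id)            # firm mean of the overall residual
  shrink <- 1 - (1 - theta)^2     # = T*s2_alpha / (T*s2_alpha + s2_eps)
  alpha  <- shrink * u_bar
  data.frame(alpha = alpha, e = u - alpha, upsilon = u - theta * u_bar)
}

# Toy input, not real data
decompose_re(u = c(1, 3, 2, 6), id = c(1, 1, 2, 2), theta = 0.5)
```

With real output, `u` would be the hand-calculated residual (calcResid.re in the question) and `upsilon` is the column that should line up with grun.re$residuals.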

In summary, what the plm package returns as the residuals of a random effects model are the residuals of the OLS regression on the demeaned data.

The panel data econometrics books by Wooldridge, Hsiao and Baltagi derive the same result for feasible GLS.