Solved – How exactly does the PLM package in R create lags

Tags: panel data, plm, r, stata

I'm trying to understand the difference between Stata's xtreg and R's plm. First, I have looked at this answered question:

Difference between fixed effects models in R (plm) and Stata (xtreg)

But when I try the code provided by the answerer, I get different answers from R and Stata. The Stata results match those of the answerer, but the R results do not.

I have an inkling as to why. When I run that code in R, R doesn't create lags within the grouping variable; it creates lags over the whole data set. For example, with 50 states and 17 years, including a lag in the regression should cost 50 observations: the first year for each state. In Stata, the sample size shrinks accordingly. In R, the sample size shrinks by only 1, because it is not respecting the "state" grouping variable. Does anyone have an idea of what is going on here?

Best Answer

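The error I was getting was caused by dplyr being loaded. dplyr exports its own lag(), which masks stats::lag(), the generic that plm's panel-aware lag method is attached to; dplyr's version shifts the whole vector and ignores the panel index. Once I detached that library, the plm code from the linked answer gave results matching those of xtreg.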
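To make the symptom concrete, here is a minimal base R sketch (no plm or dplyr needed) of the difference between an overall lag and a within-group lag. The toy panel and the helper shift1() are made up for illustration; they are not part of plm or of the original code.

```r
# Toy balanced panel: 3 states x 4 years (hypothetical data, illustration only)
panel <- data.frame(
  state = rep(c("A", "B", "C"), each = 4),
  year  = rep(2001:2004, times = 3),
  x     = 1:12
)

# Hypothetical helper: shift a vector down by one position
shift1 <- function(v) c(NA, v[-length(v)])

# Overall lag: shifts the whole column and ignores the state grouping
panel$lag_overall <- shift1(panel$x)

# Within-group lag: shift separately inside each state
# (rows are already sorted by state and year)
panel$lag_by_state <- ave(panel$x, panel$state, FUN = shift1)

# Observations lost to the lag under each approach
n_lost_overall  <- sum(is.na(panel$lag_overall))   # one in total
n_lost_by_state <- sum(is.na(panel$lag_by_state))  # one per state
print(panel)
```

With 50 states the same logic gives a loss of 1 observation for the overall lag versus 50 for the within-state lag, which is the discrepancy described in the question.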

Related Solutions

R – How to Use the ‘Within’ Model with plm Package for Panel Data Analysis

The two estimators (the within estimator and least squares with a dummy for each group) are computed differently but are numerically identical, so essentially it doesn't matter. The within estimator is computationally easier since it keeps the size of the design matrix in check, and I would think that is how plm implements it. Here is some R code to demonstrate this:

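library(plm)
data("Produc", package = "plm")

# Within (fixed effects) estimator; "within" is also plm's default model
plmResults <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp, data = Produc,
                  index = c("state", "year"), model = "within")
summary(plmResults)

# Least squares with a dummy for each state; the slope coefficients match plm's
regResults <- lm(log(gsp) ~ as.factor(state) + log(pcap) + log(pc) + log(emp) + unemp,
                 data = Produc)
summary(regResults)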

Or, if you prefer, some Stata code,

webuse nlswork
xtset idcode

xtreg ln_w grade c.age##c.age c.ttl_exp##c.ttl_exp c.tenure##c.tenure ///
 2.race not_smsa south, fe

areg ln_w grade c.age##c.age c.ttl_exp##c.ttl_exp c.tenure##c.tenure ///
 2.race not_smsa south, absorb(idcode)

A proof follows readily from the Frisch-Waugh-Lovell theorem. One crucial caveat: when the number of groups grows large, that is, $n\to \infty$ with the number of periods per group held fixed, the estimates of the coefficients on the group dummies are not consistent.
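The Frisch-Waugh-Lovell argument can also be checked numerically. The following is only a sketch on simulated data (the variable names and the data-generating process are assumptions made for illustration): it demeans y and x by group by hand and compares the resulting slope with the one from the dummy-variable regression.

```r
set.seed(1)
n_groups  <- 5
n_periods <- 6
g <- factor(rep(seq_len(n_groups), each = n_periods))

# Simulated panel: group effect alpha is correlated with x (assumed DGP)
alpha <- rnorm(n_groups)[as.integer(g)]
x <- rnorm(n_groups * n_periods) + alpha
y <- 2 * x + alpha + rnorm(n_groups * n_periods)

# Least squares with one dummy per group
lsdv <- lm(y ~ x + g)

# Within transformation: subtract group means, then regress without intercept
y_dm <- y - ave(y, g)
x_dm <- x - ave(x, g)
within <- lm(y_dm ~ x_dm - 1)

b_lsdv   <- unname(coef(lsdv)["x"])
b_within <- unname(coef(within)["x_dm"])
print(all.equal(b_lsdv, b_within))
```

Only the slope is guaranteed to agree here; the standard errors printed by the hand-demeaned lm() use the wrong degrees of freedom, since lm() does not know that the group means were estimated.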

Solved – AIC/BIC values keeps falling as I add more and more lags. How to select the appropriate lag length

Firstly, 50 lags is too many. What kind of data are you modeling? Secondly, there is a problem with your code: the lags of D.X must start at 0, not 1, and you wrote L(1).X twice.
You can use this:
forval i = 1/50 {
    forval j = 1/50 {
        regress D.Y L(1/`i').D.Y L(0/`j').D.X L(1).Y L(1).X
        estimates store est_`i'_`j'
    }
}
estimates stats est_*, n(251)
