Solved – Running fixed-effects model stata

categorical data, fixed-effects-model, stata

I'm running a fixed effects model. My dependent variable is store presence, and one of my independent variables, i.county, captures county fixed effects.

I use xi: regress store_presence i.county other_var othervar2 where county is the US county code, a string variable.

But the regression output reports that some variables "have been omitted due to collinearity".

What should I do to fix this error and capture county-fixed effects in my model?

Best Answer

The fixed effects model uses the within estimator, which after adjustments yields the same results as LSDV (least squares dummy variables). The within estimator demeans each variable by its group mean (and adds back the global mean in order to "fix" the intercept, so that predictions are centered around the mean of the response variable).

If county is the panel identifier (PID) set in xtset PID time, then the county effects are already accounted for. The estimates of the PID effects are not consistent, so you should not display them anyhow (the within estimator is usually preferred for computational efficiency, but this inconsistency is also worth noting).

If county is a different identifier (the PID is the individual and you want to control for their county), you can include the county dummies in the regression or cluster at the highest dimension: instead of xtreg y x, robust, run xtreg y x, cl(county).

Stata supports encoded variables (categorical variables stored efficiently as numbers, but with value labels). Any string variable that should be treated as categorical should be encoded as such: encode Strvar, gen(var).
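As a minimal sketch of the points above, the following encodes a string county code, runs the within estimator, and compares its slope with the LSDV regression. The toy data, the seed, and the names county_num and county_id are assumptions made for illustration; they are not the asker's data. Note that in the LSDV output Stata always drops one county dummy as the base category, and any regressor that is constant within county would also be dropped as collinear with the county effects, which may be what the message in the question refers to.

```stata
* Hypothetical toy data: 20 counties with 10 observations each
clear
set seed 2024
set obs 200
gen county_num = ceil(_n/10)
gen county = "C" + string(county_num)      // string county code, as in the question
gen other_var = rnormal() + 0.1*county_num
gen othervar2 = rnormal()
gen store_presence = 0.2*county_num + 0.5*other_var - 0.3*othervar2 + rnormal()

* Factor-variable notation (i.) and xtset need a numeric variable, so encode first
encode county, gen(county_id)

* Within estimator: county effects are absorbed and not reported
xtset county_id
xtreg store_presence other_var othervar2, fe vce(cluster county_id)
scalar b_fe = _b[other_var]

* LSDV: same slopes, with one county dummy omitted as the base category
regress store_presence i.county_id other_var othervar2, vce(cluster county_id)
scalar b_lsdv = _b[other_var]

* The two slope estimates should differ only by numerical noise
display b_fe - b_lsdv
```

With a few thousand counties the xtreg, fe form is usually the more practical of the two, since it avoids estimating and printing every county dummy.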

Related Solutions

Solved – Difference between fixed effects models in R (plm) and Stata (xtreg)

Welcome to the site, @gwatson! You are right that effect = "twoways" sets up both "individual" and "year" effects.

I tested with the Produc data from the R package plm and found that the main results are the same (see the code and output below). The only apparent difference I found is in the year effects, which is caused by the contrast coding (xtreg sets the first year as the reference, while plm directly estimates the effect for each year).

## R code
data("Produc", package = "plm")
zz <- plm(gsp ~ unemp + lag(gsp), data = Produc, index = c("state","year"), model = "within", effect = "twoways")
summary(zz)

## plm output
Coefficients :
            Estimate  Std. Error  t-value  Pr(>|t|)    
unemp    -5.4525e+02  6.8611e+01  -7.9469 7.614e-15 ***
lag(gsp)  1.0125e+00  9.1789e-03 110.3029 < 2.2e-16 ***


## Stata code
use Produc, clear
xtset state year, yearly
xtreg gsp unemp l.gsp i.year, fe

## xtreg output
------------------------------------------------------------------------------
         gsp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       unemp |   -545.246   68.61136    -7.95   0.000    -679.9537   -410.5383
      gsp L1.|   1.012464   .0091789   110.30   0.000     .9944422    1.030485
-------------+----------------------------------------------------------------

Solved – Fixed Effects Gravity Model for forecasting..with time constant variables Stata

What they do in the paper is that they estimate their gravity model, say equation 5.2, using the fixed effects estimator and they estimate the fixed effects directly to use them later in equation 6. You can do this with the predict command after xtreg. In Stata this would be:

xtreg IX lYi lYj lNi lNj lD  lIi lIj Pij1 - Pijh, fe
predict IE, u

In the fixed effects regression all the time-invariant variables drop out as the authors stated. The predict command then gives you the individual effects $\text{IE}$ which they use in equation 6.

With regard to your note, I'm not sure whether the same procedure applies to xtpoisson, given that the interpretation of the estimated fixed effects changes. For this, have a look at a similar question on Statalist with the corresponding answer by Maarten Buis. He is also active on CV, so if you're lucky he can provide you with guidance on this. Otherwise I would guess that Martinez-Zarzoso and Nowak-Lehmann had the same problem with the many zeros (I suppose their data is similar to yours given the similarity of the application) and yet they had their reasons to stick to linear models.
I hope this helps.
