Solved – Running a fixed-effects model in Stata


I'm running a fixed effects model. My dependent variable is store presence, and one of my independent variables, i.county, captures county fixed effects.

I use xi: regress store_presence i.county other_var othervar2 where county is the US county code, a string variable.

But the regression output reports that some variables "have been omitted due to collinearity".

What should I do to fix this error and capture county-fixed effects in my model?

Best Answer

The fixed effects model uses the within estimator, which after adjustments yields the same results as LSDV (least squares dummy variables). The within estimator demeans each variable by its group mean (and adds back the global mean in order to "fix" the intercept, so that predictions are centered around the mean of the response variable).

If county is the panel identifier (PID) set in xtset PID time, then the county effects are already accounted for. The estimates of the PID effects are not consistent, so you should not display them anyhow (the within estimator is mainly preferred for computational efficiency, but this argument is also noteworthy).

If county is a different identifier (the PID is the individual and you want to control for their county), you can include the county dummies in the regression or cluster at the highest level: instead of xtreg y x, fe robust, run xtreg y x, fe cl(county).

Stata allows for encoded variables (categorical variables stored efficiently as numbers, but with value labels). Any string variable that should be treated as categorical should be encoded as such: encode Strvar, gen(var).
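As a rough illustration of the points above, here is a minimal sketch on simulated data. The variable names are borrowed from the question, while the data, the ten made-up county codes, and the new variable name county_id are assumptions for demonstration only. It encodes the string county code and then fits the same county fixed-effects model three ways. In the dummy-variable version, Stata always drops one county as the base category, which by itself is expected and not an error.

```stata
* Simulated data, for illustration only (not the asker's data)
clear
set seed 12345
set obs 200

* String county code with 10 hypothetical counties, 20 observations each
generate county = "C" + string(ceil(_n/20), "%02.0f")
generate other_var = rnormal()
generate othervar2 = rnormal()
generate store_presence = 0.5*other_var - 0.3*othervar2 + rnormal()

* Turn the string code into a labelled numeric categorical variable
encode county, generate(county_id)

* 1) LSDV: county dummies via factor-variable notation
*    (one county is omitted as the base level)
regress store_presence i.county_id other_var othervar2

* 2) Same slopes, with the county dummies absorbed instead of displayed
areg store_presence other_var othervar2, absorb(county_id)

* 3) Within estimator, with county declared as the panel identifier
xtset county_id
xtreg store_presence other_var othervar2, fe
```

The coefficients on other_var and othervar2 should agree across the three commands; only the reported intercept and the handling of the county effects differ.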