Solved – A fixed effects regression in Stata that changes when a time invariant variable is included

fixed-effects-model, stata

I am running a fixed effects regression in Stata. If I understand correctly, a variable that is fixed over time will be omitted from the regression, so it should not influence the results. This is not the case, however. I ran a fixed effects regression that included a time-invariant variable, and Stata omitted this variable from the regression. When I manually removed it and ran the regression again, the coefficient on another variable remained significant at the 5% level but changed sign! Is there any plausible explanation for this?

Best Answer

Time-invariant variables are left out of a fixed-effects model because they are collinear with the unit fixed effects. Depending on whether the variable is only roughly time-invariant or perfectly time-invariant, it will not necessarily drop from the regression. (Stata drops variables that are perfectly collinear, but generally not those that are imperfectly collinear.) High multicollinearity generally inflates standard errors, which affects significance rather than the sign of a coefficient...so I am not sure what to make of your changed-sign variable.
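One thing worth checking, sketched below on simulated data: when the estimation sample is identical, listing a perfectly time-invariant variable and then dropping it by hand should give the same coefficients. Stata does, however, exclude observations with missing values in any listed variable before it omits collinear ones, so a time-invariant variable with gaps can quietly change the sample. All names here (id, year, x, z, z_gaps) and all numbers are hypothetical, and this is only a guess at what might be happening in your data.

```stata
* Simulated panel; every name and number here is made up for illustration.
clear
set seed 12345
set obs 200                              // 200 hypothetical units
generate id = _n
generate z = rnormal()                   // time-invariant regressor
generate z_gaps = z
replace z_gaps = . if id > 150           // same variable, missing for 50 units
expand 5                                 // 5 periods per unit
bysort id: generate year = 2000 + _n
generate x = z + rnormal()               // time-varying regressor
generate y = 1 + 0.5*x + 2*z + rnormal()
xtset id year

* Same sample: z is omitted for collinearity, the coefficient on x is unchanged
xtreg y x z, fe
scalar b_with_z = _b[x]
xtreg y x, fe
scalar b_without_z = _b[x]
scalar n_full = e(N)
display b_with_z - b_without_z

* z_gaps is also omitted, but the units where it is missing are dropped first
xtreg y x z_gaps, fe
scalar n_gaps = e(N)
display n_full " vs " n_gaps
```

Comparing e(N) (the "Number of obs" line in the output) between your two runs would show whether the samples differ.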

As for the fixed-effects choice and modeling overall:

I do respectfully disagree with some of the previous answer. While spatial fixed effects--attached to the unit of analysis, such as states in a country, individuals in a treatment program, etc.--are not controlling for time, per se, they are still controlling for factors that do not change much over time, and thus they do need a cautious approach to including time-invariant regressors in the model. Example: say you wanted to test the effect of a country's geography, conceptualized as percentage of the nation that was mountainous, on guerrilla warfare prevalence over time (there are some articles on that very subject). An FE model would not be ideal, because, while FE would help account for the unobserved fixed effects of the countries studied (whatever makes Argentina, Canada, Uzbekistan, etc. unique that isn't included in the model), percent mountainous would most likely be unvarying, save perhaps for the occasional volcanic eruption or reconceptualization of what counts as mountainous. Multicollinearity would still be a problem. The fixed effect of each country and the percent mountainous would perfectly, or very nearly perfectly, covary.

(I believe that what the previous answer refers to as fixed effects over time is more often accounted for by time dummies or trend variables than by Stata's , fe option. For example, suppose you are studying the crop production of Midwestern cities. There was a massive drought in 1975, but you have no drought variable. Time dummies would help account for that by absorbing the unmeasured effects unique to certain years, such as 1975.)
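A minimal sketch of the time-dummy idea on simulated data, assuming a hypothetical panel in which output y is hit by an unmeasured shock in 1975. The variable names and the size of the shock are invented for illustration; i.year is Stata's factor-variable notation for a full set of year dummies.

```stata
* Simulated panel; names and magnitudes are assumptions for illustration.
clear
set seed 2024
set obs 100                              // 100 hypothetical units
generate id = _n
generate u = rnormal()                   // unobserved unit effect
expand 5
bysort id: generate year = 1972 + _n     // years 1973 to 1977
generate x = rnormal() + 0.5*u
* an unmeasured "drought" lowers y for every unit in 1975
generate y = 2 + 0.5*x + u - 3*(year == 1975) + rnormal()
xtset id year

* unit fixed effects via , fe plus year dummies via i.year
xtreg y x i.year, fe

* joint test that the year dummies are all zero
testparm i.year
```

The coefficient on 1975.year is measured relative to the omitted base year (1973 here) and picks up the common shock even though no drought variable is in the model.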

If it is highly varying variables you want to examine, stick with fixed effects. If you want to know the effect of variables with much smaller or slower changes, try random effects. If you're not sure, run a Hausman test to see whether the fixed-effects and random-effects estimates differ systematically.
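A rough sketch of the Hausman comparison on simulated data. The names (id, t, x, y, fixed_est, random_est) are hypothetical, and the data are built so that x is correlated with the unit effect, which is the situation in which random effects is suspect.

```stata
* Simulated panel; all names and numbers are assumptions for illustration.
clear
set seed 42
set obs 300                              // 300 hypothetical units
generate id = _n
generate u = rnormal()                   // unobserved unit effect
expand 6
bysort id: generate t = _n
generate x = rnormal() + u               // regressor correlated with u
generate y = 1 + 0.5*x + u + rnormal()
xtset id t

* fit both models and store the results
xtreg y x, fe
estimates store fixed_est
xtreg y x, re
estimates store random_est

* consistent estimator first, efficient estimator second
hausman fixed_est random_est
display "Hausman p-value: " r(p)
```

A small p-value suggests the two sets of estimates differ systematically, which argues for fixed effects; a large one gives no evidence against random effects.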