Solved – Random effects vs fixed effects for analysis of panel data (econometrics)

econometrics, fixed-effects-model, panel data, random-effects-model, stata

My dataset is the following: 1000 firms, a time period of 10 years, 20 countries, 15 industries.

I declare in Stata:

xtset firmid year

I want to control for the unobserved heterogeneity using fixed effects:

xtreg Y b(set of independent variables) i.year i.industries, fe vce(robust)

Does this specification allow me to capture the firm-level heterogeneity?

Then, I use random effects:

xtreg Y b(set of independent variables) i.year i.industries i.countries, mle

What is the difference between the two specifications (apart from the fact that random effects allows me to include time-invariant variables like countries)?

Best Answer

Yes, this specification captures the time-invariant firm-level heterogeneity. To see the difference between fixed and random effects, consider the model $$y_{it} = \alpha + X'_{it}\beta + c_i + \epsilon_{it}$$ where $y$ is the outcome, $X$ are time-varying controls, $c_i$ are the firms' characteristics that do not change over time, $\epsilon_{it}$ is an error term, and $i$ and $t$ index firms and years, respectively.

Fixed effects estimation eliminates the $c_i$ by using the within transformation or first differencing (for details, see for instance these lecture notes). Random effects, on the other hand, does not remove the $c_i$ but leaves them in the error term. This only works if none of your explanatory variables $X$ is correlated with $c_i$. The random effects estimator then uses a matrix-weighted average of the within and between variation in your data, whereas the fixed effects estimator uses only the within (i.e. the intra-firm) variation. When its assumption holds, this makes random effects more efficient, meaning that the standard errors are smaller. It also lets you include time-invariant variables, which is useful if you are interested in their coefficients.
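To make the within transformation concrete, here is a short sketch in the notation of the model above. It assumes a balanced panel with $T$ years per firm; the bars (firm-level time averages), $\theta$, $\sigma_c^2$ and $\sigma_\epsilon^2$ (the variances of $c_i$ and $\epsilon_{it}$) are notation introduced here for illustration. Averaging the model over time for each firm gives $$\bar{y}_i = \alpha + \bar{X}'_i\beta + c_i + \bar{\epsilon}_i$$ and subtracting this from the original equation removes $c_i$ together with the constant: $$y_{it} - \bar{y}_i = (X_{it} - \bar{X}_i)'\beta + (\epsilon_{it} - \bar{\epsilon}_i)$$ Any regressor that is constant within a firm has $X_{it} - \bar{X}_i = 0$, which is why fixed effects cannot estimate its coefficient. Random effects instead subtracts only a fraction $\theta$ of the firm means: $$y_{it} - \theta\bar{y}_i = (1-\theta)\alpha + (X_{it} - \theta\bar{X}_i)'\beta + (1-\theta)c_i + (\epsilon_{it} - \theta\bar{\epsilon}_i), \qquad \theta = 1 - \sqrt{\frac{\sigma_\epsilon^2}{\sigma_\epsilon^2 + T\sigma_c^2}}$$ With $\theta = 1$ this reduces to the fixed effects equation and with $\theta = 0$ to pooled OLS. For $\theta < 1$ a part of $c_i$ stays in the error term, so the estimator is only consistent if $X$ is uncorrelated with $c_i$.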

In practice, the random effects assumption is often implausible. You can test it directly using the Hausman test. The fixed effects estimator is consistent whether or not the $X_{it}$ are correlated with $c_i$, while random effects is only consistent under the assumption stated above. The Hausman test compares the two sets of estimates: broadly speaking, if they do not differ significantly, you may as well use random effects. If they differ significantly, the assumptions for random effects are likely to be violated and you had better stick with fixed effects.
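A minimal sketch of how this comparison might look in Stata, assuming the panel has already been declared with xtset firmid year. The regressors x1 and x2 and the stored-estimate names fe_est and re_est are hypothetical placeholders, not taken from the question. The models are fitted without vce(robust) because, as far as I know, hausman refuses estimates with robust or clustered standard errors, and the GLS random effects estimator (re) is used here instead of mle.

```stata
* Hypothetical regressors x1 and x2 stand in for the real covariates.
* Fixed effects: consistent whether or not x is correlated with c_i.
xtreg Y x1 x2 i.year, fe
estimates store fe_est

* Random effects (GLS): efficient, but consistent only if x is
* uncorrelated with c_i.
xtreg Y x1 x2 i.year, re
estimates store re_est

* Hausman test: the always-consistent estimator goes first, the
* efficient one second. sigmamore bases both covariance matrices on
* the disturbance variance estimate from the efficient estimator.
hausman fe_est re_est, sigmamore
```

A small p-value rejects the null hypothesis that the two sets of coefficients coincide, which points towards fixed effects. A large p-value gives no evidence against random effects.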
