Solved – Fixed effects or Random effects model


I am trying to understand the difference between fixed and random effects modelling. The panel data I have is in the form of basic longitudinal panel time series.

I know that I can use the Hausman test to determine which model to use. My problem is that in a fixed effects model I have to use fixed slopes, or no coefficient for the whole model is available. In a random effects model I can use random slopes and intercepts and still get a slope coefficient for the whole model.

So how can I determine whether to use a random effects model with varying intercept and slope, or a fixed effects model with varying intercept and fixed slope?

Best Answer

Your post raises a couple of different issues (hopefully you recognize that).

The first relates to deciding random vs fixed effects. In my experience deciding between fixed and random effects has two pieces:

  • Statistical fit. Assessed using things like a Hausman test, standard fare in most packages like Stata, SAS, R, etc. It checks whether the random effects estimates differ systematically from the fixed effects estimates. If they do not, a random intercept is consistent with your data and more efficient than a fixed effect.

  • Theoretical fit. How are you conceptualizing the effect? Is it truly a fixed, unchanging entity, not drawn from a theoretical distribution? For example, I have rarely seen states treated as random effects: there are 50 and only 50 states (or 51 with DC, or more if you add territories), and they are always the same states. The same goes for years when the panel covers only a few of them. Those are often treated as fixed, because you want to capture a common shock to all observations in that year and quantify it. Other cases are not so clear. I'm doing an analysis of counties. Of course you could treat counties as fixed effects, but there are over 3,000, and I don't think anyone would really want to focus on a single county. So I'm treating them as a random effect, drawn from a distribution. With repeated measures, again, the individual is treated as random (hopefully representative of a larger population whose parameters you estimate).
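For reference, the Hausman statistic mentioned in the first bullet can be written as follows. This is a sketch in common textbook notation; the symbols $\hat\beta_{FE}$, $\hat\beta_{RE}$ and $k$ are not from the original post.

$H=(\hat\beta_{FE}-\hat\beta_{RE})^{\top}\left[\widehat{\mathrm{Var}}(\hat\beta_{FE})-\widehat{\mathrm{Var}}(\hat\beta_{RE})\right]^{-1}(\hat\beta_{FE}-\hat\beta_{RE})$

Under the null hypothesis that the group effects are uncorrelated with the regressors, $H$ is asymptotically $\chi^2$ with $k$ degrees of freedom, where $k$ is the number of slope coefficients being compared. A large $H$ is evidence against the random effects specification.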

The second issue you bring up is intercept vs slope. In this regard random effects and fixed effects are not comparable. A fixed effect simply adjusts the intercept for each group: it captures the mean difference in the outcome variable associated with that group. As a result, the slopes you get are within-group effects, because the variance attributable to the fixed effects has already been absorbed. If you think that the slope is different for each effect, the interpretation is that the effects moderate the effect of the variable whose slope you are interested in. Start from the model with fixed effects only:

$y=\alpha+\beta x+\sum_i Z_i\theta_i$

Here $x$ is your covariate of interest, $Z_i$ is the dummy variable for fixed effect $i$, and $\theta_i$ is its coefficient. Now, if you think the slope is going to differ for each effect, you create an interaction term (moderator) for each one, which explodes the number of coefficients and cuts your degrees of freedom:

$y=\alpha+\beta x+\sum_i Z_i\theta_i+\sum_i Z_i x\gamma_i$

$\gamma$ is your vector of interaction coefficients, so to get the slope for any given effect you add $\beta$ and the relevant $\gamma_i$. So what does $\beta$ mean here? It is not the average slope across all the effects, as it is in a random effects model. It is the slope for the omitted (reference) effect.
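To make the two equations above concrete, here is a minimal sketch in plain Python. Everything in it is illustrative and not from the original post: the three groups A, B, C, the noise-free toy data, and the small `ols` helper (least squares via the normal equations) are all assumptions. Group A is the omitted effect.

```python
def ols(rows, y):
    """Least-squares coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical toy panel: each group has its own (intercept, slope), no noise.
xs = [0.0, 1.0, 2.0, 3.0]
truth = {"A": (1.0, 1.0), "B": (5.0, 2.0), "C": (10.0, 3.0)}
data = [(g, x, a0 + b0 * x) for g, (a0, b0) in truth.items() for x in xs]
y = [yi for _, _, yi in data]


def dummies(g):
    """Indicators Z_B and Z_C; group A is the omitted (reference) effect."""
    return [float(g == "B"), float(g == "C")]


# Model 1: y = alpha + beta*x + sum_i Z_i*theta_i  (one common slope)
fe_rows = [[1.0, x] + dummies(g) for g, x, _ in data]
beta_fe = ols(fe_rows, y)[1]

# Model 2: adds the interactions Z_i*x with coefficients gamma_i
int_rows = [[1.0, x] + dummies(g) + [d * x for d in dummies(g)]
            for g, x, _ in data]
coef = ols(int_rows, y)
beta_ref = coef[1]  # slope of the omitted group A
slopes = {"A": beta_ref, "B": beta_ref + coef[4], "C": beta_ref + coef[5]}

print("common within-group slope:", round(beta_fe, 6))
print("beta in interaction model:", round(beta_ref, 6))
print("group slopes:", {g: round(s, 6) for g, s in slopes.items()})
```

In this toy data the common slope from the first model is a pooled within-group slope (a weighted average of the group slopes, with weights given by each group's spread in `x`), while `beta_ref` from the second model only recovers the slope of group A. The other group slopes come from adding the matching interaction coefficient.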

So, to get back to your original question:

  • Start with the theoretical definition of your effects. Are they truly fixed, or do they reflect a random distribution that has population parameters you can estimate? Answer that and then I believe you should be on your way to figuring out the next step.

  • If you believe that the effects should indeed be random, then do your statistical tests. Test if random effects fit better. Then do the appropriate tests to see if a random coefficient is appropriate.
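For comparison with the interaction model, a random coefficient model can be sketched as follows. The subscripts and the symbols $u_i$, $v_i$, $\varepsilon_{ij}$ are standard notation assumed here, not taken from the original post.

$y_{ij}=(\alpha+u_i)+(\beta+v_i)x_{ij}+\varepsilon_{ij}$

Here $u_i$ and $v_i$ are mean-zero random deviations for group $i$, so $\beta$ is the average slope across groups. Checking whether a random coefficient is warranted then amounts to asking whether the variance of $v_i$ is distinguishable from zero, which is commonly done by comparing this model against the random-intercept-only model.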
