Solved – Concepts of mixed effects in statistics and econometrics, how to cope with them

biostatistics, econometrics

When I search for definitions of fixed-effects, random-effects, or mixed-effects models here or elsewhere on the internet, I find a lot of discrepancies. My first exposure to linear mixed-effects models was in longitudinal data analysis in biostatistics. There the definition is clear to me: a fixed effect is a population-averaged effect, and a random effect is a subject-specific effect. A mixed-effects model is then a model that contains both fixed and random effects. What is usually called a random-effects model is in practice a mixed-effects model, because it contains at least one fixed-effect parameter. Take a time slope: there is one mean slope for all individuals in the data, and the random effects are the subject-specific deviations from that mean slope.
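For concreteness, the time-slope example can be written out as follows. The notation is illustrative and assumed here rather than taken from a particular text: $i$ indexes subjects, $j$ indexes measurement occasions, $t_{ij}$ is time, $\beta_0, \beta_1$ are the fixed (population-average) intercept and slope, and $u_{0i}, u_{1i}$ are the subject-specific random deviations.

$$
y_{ij} = (\beta_0 + u_{0i}) + (\beta_1 + u_{1i})\, t_{ij} + \varepsilon_{ij},
\qquad (u_{0i}, u_{1i})^\top \sim N(0, G),
\quad \varepsilon_{ij} \sim N(0, \sigma^2)
$$

Here $\beta_1$ is the mean slope shared by everyone, and $\beta_1 + u_{1i}$ is the slope of subject $i$.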

However, in econometrics, fixed-effects and random-effects models developed with a different distinction: whether or not the unobserved heterogeneity is correlated with the regressors. Statistical tests were developed to decide whether a fixed-effects or a random-effects model should be used. Many social science analyses adopt the econometric approach as well. Therefore, when I read discussions about fixed-effects, random-effects, or mixed-effects models posted by people from different areas, they always confuse me. Even though the mathematical definitions are sometimes similar, the modelling process and the considerations behind it are quite different.

I am hoping for a general discussion of how statistics and econometrics differ in their definitions and concepts, rather than in the methodologies or algorithms used.

Best Answer

Perhaps another way of seeing the difference is to focus on what the "fixed effect" is defined to be. In econometrics, a panel (longitudinal) model is typically specified as $$ y_{it} = X_{it} b + a_{i} + e_{it} $$ where $X$ would be called the "right hand side" variables, the "design matrix", or the "independent variables", etc. The $a_i$ is an "unobserved error component". The terms "Fixed Effect" and "Random Effect" have to do ONLY with the assumptions about the unobserved component ($a_i$).

If one assumes it is a "fixed effect", then the coefficient estimates (the beta-hats, i.e. the estimates of $b$) are robust to correlation between $a_i$ and the regressors $X_{it}$. That is, the estimates are conditional on the "fixed" unobserved component being controlled for. To compute them, one can either include a dummy variable for each individual $i$ as part of the $X$ matrix (a bad idea when there are many individuals) or simply partial $a_i$ out, for example by demeaning within each individual (a better idea).
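As a rough illustration of the partialling-out idea: subtracting each individual's own means gives $y_{it} - \bar y_i = (X_{it} - \bar X_i) b + (e_{it} - \bar e_i)$, so $a_i$ drops out. The sketch below is not from the answer itself. It simulates a panel with a single regressor in which $a_i$ is deliberately correlated with that regressor, then compares a pooled regression with the demeaned ("within") regression. The function names (`simulate_panel`, `ols_slope`, `within_slope`), the sample sizes, the true slope of 2, and the data-generating process are all assumptions made for illustration.

```python
import random


def simulate_panel(n_units=1000, n_periods=5, b=2.0, seed=0):
    """Simulate y_it = b * x_it + a_i + e_it (hypothetical design).

    The regressor is built as x_it = a_i + noise, so the unobserved
    unit effect a_i is correlated with x_it by construction.
    """
    rng = random.Random(seed)
    rows = []  # each row is (unit, x, y)
    for i in range(n_units):
        a_i = rng.gauss(0.0, 1.0)            # unobserved unit effect
        for _ in range(n_periods):
            x = a_i + rng.gauss(0.0, 1.0)    # regressor correlated with a_i
            e = rng.gauss(0.0, 1.0)          # idiosyncratic error
            rows.append((i, x, b * x + a_i + e))
    return rows


def ols_slope(xs, ys):
    """Slope from a simple regression of y on x with an intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def within_slope(rows):
    """Demean x and y within each unit, then regress (no dummies needed)."""
    groups = {}
    for unit, x, y in rows:
        groups.setdefault(unit, []).append((x, y))
    xd, yd = [], []
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            xd.append(x - mx)
            yd.append(y - my)
    # Demeaned data have mean zero, so a regression through the origin suffices.
    return sum(x * y for x, y in zip(xd, yd)) / sum(x * x for x in xd)


rows = simulate_panel()
pooled = ols_slope([r[1] for r in rows], [r[2] for r in rows])
within = within_slope(rows)
print("pooled OLS slope (ignores a_i):", round(pooled, 3))
print("within (fixed effects) slope:  ", round(within, 3))
```

Under this particular design the pooled slope should settle near 2.5 in large samples (the true 2 plus Cov(x, a)/Var(x) = 1/2), while the within slope should stay close to 2. With several regressors the same demeaning applies column by column, followed by an ordinary multiple regression.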

The "random effect" assumption also allows $a_i$ to be a random (unobserved) variable, but it requires an assumption of independence (or at least lack of correlation) between $a_i$ and the regressors $X_{it}$.

Another way to state this is that, under most assumptions, treating $a_i$ as "fixed" yields consistent estimates, whereas the "random" effects estimator is consistent ONLY under the independence assumption. A Hausman-style test can be used to check whether the random effects assumption is valid. In most of the observational data that economists use, the random effects assumption (i.e. no correlation between the unobserved random component and the regressors) is invalid, and this is why economists tend to favor the "fixed effects model" when using longitudinal data.
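For reference, the usual form of the Hausman statistic compares the two sets of estimates directly. The notation is assumed here: $\hat b_{FE}$ and $\hat b_{RE}$ are the fixed-effects and random-effects estimates of the coefficients on the time-varying regressors, and $k$ is the number of coefficients being compared.

$$
H = (\hat b_{FE} - \hat b_{RE})^\top
\left[ \widehat{\mathrm{Var}}(\hat b_{FE}) - \widehat{\mathrm{Var}}(\hat b_{RE}) \right]^{-1}
(\hat b_{FE} - \hat b_{RE})
$$

If the random effects assumption holds, both estimators are consistent and $H$ is asymptotically $\chi^2_k$. A large $H$ means the two sets of estimates differ by more than sampling noise would explain, which is evidence against the random effects assumption.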

I too have seen a lot of confused jargon in the literature, mostly because people from different disciplines are talking past each other. The terms "Fixed" and "Random", when applied to "Effects", are not used to communicate but rather as inertial labels, and so they cause inadvertent confusion. At this stage, most of what goes under the rubric of "Mixed Models" would simply be the "Random Effects" model from the typical econometrician's perspective (econometricians would tend to use the label "random coefficient model" for the equivalent math). That is, all the worry economists have about the inconsistency of the random effects assumption for panel data would (in observational data) still hold for any mixed model.