Fixed effects models and random effects models ask different questions of the data. Specifying a set of group-level dummy variables essentially controls for all group-level unobserved heterogeneity in the average response, leaving your estimates to reflect only variability within units. Random effects models start from the assumption that there is a meta-population of effects, and that the groups in your sample reflect draws from that population. So rather than anchoring your results to heterogeneous intercepts, your data are used to estimate the parameters of the (usually normal) distribution from which those effects were supposedly drawn.
It is often said that fixed effects models are good for conducting inference on the data that you have, and that random effects models are good for trying to conduct inference on some larger population from which your data is a random sample.
When I learned about fixed effects models, they were motivated using error components and panel data. Take multiple observations of a given unit $i$, with a treatment $T_{it}$ randomly assigned over time $t$:
$$y_{it} = \alpha_i + \beta T_{it} + \epsilon_{it}$$
You can break the error term into a component that varies over time and one that doesn't:
$$y_{it} = \alpha_i + \beta T_{it} + e_i + u_{it}$$
Now subtract the groupwise mean from both sides:
$$y_{it} - \bar y_i = \alpha_i - \bar \alpha_i + \beta \left(T_{it}- \bar T_i\right) + e_i - \bar e_i + u_{it} - \bar u_i$$
Things that aren't subscripted by $t$ come out of the equation by basic subtraction: the average over time of a quantity that never changes is just the quantity itself, so $\alpha_i - \bar \alpha_i = 0$ and $e_i - \bar e_i = 0$. This includes the non-time-varying component of your error term. Thus your estimates are unconfounded by time-invariant heterogeneity.
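The payoff of the derivation above can be seen in a small simulation. This is a minimal sketch in pure NumPy (all names and the data-generating process are illustrative, not from the original answer): the unit-level error component $e_i$ is deliberately made correlated with treatment, so pooled OLS is biased, while the within (demeaning) estimator recovers the true $\beta$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 200, 5
beta = 2.0

# Unit-level heterogeneity e_i, deliberately correlated with treatment uptake
e = rng.normal(size=n_units)
T = (rng.normal(size=(n_units, n_periods)) + e[:, None] > 0).astype(float)
y = 1.0 + beta * T + e[:, None] + rng.normal(scale=0.5, size=(n_units, n_periods))

# Naive pooled OLS is biased upward because e_i is correlated with T
T_flat, y_flat = T.ravel(), y.ravel()
X = np.column_stack([np.ones_like(T_flat), T_flat])
beta_pooled = np.linalg.lstsq(X, y_flat, rcond=None)[0][1]

# Within transformation: subtract each unit's time average from both sides
T_w = T - T.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
beta_within = (T_w.ravel() @ y_w.ravel()) / (T_w.ravel() @ T_w.ravel())

print(beta_pooled, beta_within)  # pooled estimate is inflated; within is close to 2
```

The demeaning step is exactly the subtraction in the equation above: $e_i$ cancels within each unit, so the correlation between $e_i$ and $T_{it}$ no longer contaminates the estimate.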
This doesn't quite work for a random effects model -- it doesn't apply the within transformation, so your non-$t$-indexed variables aren't sopped up. As such, you can draw inference on the effects of things that don't vary within group, and in the real world such things have importance. Thus, random effects models are good for "modeling the data", while fixed effects models are good for getting closer to unbiased estimates of particular coefficients. With a random effects model, you can't make the claim to have removed that $e_i$ entirely.
In this example, the grouping is by unit, with time indexing the observations within each group. In your example, the grouping comes from your DID setup. (i.e.: it generalizes)
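The cost of the within transformation mentioned above can be made concrete. In this illustrative sketch (hypothetical variable names, pure NumPy), a unit-level covariate that never changes over time is demeaned to exactly zero, so a fixed effects regression has no variation left with which to estimate its coefficient:

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods = 50, 4

# A unit-level covariate that never changes over time (e.g. group identity)
z = rng.normal(size=n_units)
Z = np.repeat(z[:, None], n_periods, axis=1)  # shape (units, periods)

# The within transformation subtracts each unit's time average
Z_within = Z - Z.mean(axis=1, keepdims=True)

print(np.abs(Z_within).max())  # 0.0: no within-unit variation remains
```

A random effects model, which does not demean the data, keeps this column intact and can therefore estimate its effect.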
Zuur et al., and Faraway (from @janhove's comment above) are right; using likelihood-based methods (including AIC) to compare two models with different fixed effects that are fitted by REML will generally lead to nonsense.
Faraway (2006) Extending the linear model with R (p. 156):
The reason is that REML estimates the random effects by considering linear combinations of the data that remove the fixed effects. If these fixed effects are changed, the likelihoods of the two models will not be directly comparable
These two questions discuss the issue further: Allowed comparisons of mixed effects models (random effects primarily); REML vs ML stepAIC
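To make the practical upshot concrete, here is a hedged sketch using statsmodels' `MixedLM` (assuming statsmodels is installed; the simulated data and variable names are illustrative). The key step is refitting with `reml=False` (i.e. ML) before comparing models that differ in their fixed effects; the AIC is computed by hand from the log-likelihood to make the parameter count explicit.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n_groups, n_per = 30, 10
g = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=n_groups * n_per)
u = rng.normal(scale=1.0, size=n_groups)  # random intercepts
y = 1.0 + 0.8 * x + u[g] + rng.normal(scale=0.5, size=n_groups * n_per)
df = pd.DataFrame({"y": y, "x": x, "g": g})

# Refit with ML (reml=False) before comparing models with different fixed effects;
# REML likelihoods for these two models would not be directly comparable
full = smf.mixedlm("y ~ x", df, groups=df["g"]).fit(reml=False)
null = smf.mixedlm("y ~ 1", df, groups=df["g"]).fit(reml=False)

# AIC by hand: -2*loglik + 2*(number of estimated parameters)
aic_full = -2 * full.llf + 2 * (2 + 2)  # intercept + slope, plus 2 variance params
aic_null = -2 * null.llf + 2 * (1 + 2)  # intercept only, plus 2 variance params
print(aic_full, aic_null)
```

Since both fits use the same (ML) objective, their log-likelihoods and hand-computed AICs are on a common scale and the comparison is meaningful.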
Barr, Levy, Scheepers & Tily (2013) present an argument and simulations for why you should (by default) use the maximal random effects structure justified by your design.
The crux of the argument is that the maximal model will generalize better. The paper also provides an argument for why it is anti-conservative not to use the maximal model (pt. 1). More generally, by allowing random slopes and intercepts you're more likely to get a better fit to the data and thus better detect when variance is attributable to the fixed effects.
(2) When comparing models that share the same random effects structure, only differences in the fixed effects should affect AIC. However, if you're using the maximal model, adding a fixed effect will also necessitate adding the corresponding random slope and intercept terms.
It's not a great idea to use the sample data to determine if random effects are "necessary". Just because the inclusion of random effects doesn't explain variance in your current dataset doesn't mean that it is not important to the population you're making inferences about.
You're right to suspect that the model is more likely to fail to converge with more complicated random effects structures (pt. 4). You'll have to balance the benefits of using the maximal random effects structure against the real-world constraints of your data; obviously if the model doesn't converge the results should be treated with suspicion. One possible solution in this situation is to try permutation-based analysis or bootstrapping.
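The anti-conservativity point above can be illustrated with a simulation. This is a sketch using statsmodels' `MixedLM` (an assumption; Barr et al. use lme4 in R, and all names and simulated quantities here are illustrative): data are generated with genuine by-subject slope variability, and the intercept-only model reports a smaller standard error for the fixed slope than the maximal model, i.e. it overstates the precision of the fixed effect.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_subj, n_obs = 40, 20
s = np.repeat(np.arange(n_subj), n_obs)
x = rng.normal(size=n_subj * n_obs)
u0 = rng.normal(scale=1.0, size=n_subj)  # by-subject random intercepts
u1 = rng.normal(scale=0.5, size=n_subj)  # by-subject random slopes
y = 2.0 + (1.0 + u1[s]) * x + u0[s] + rng.normal(scale=0.5, size=n_subj * n_obs)
df = pd.DataFrame({"y": y, "x": x, "s": s})

# Random intercepts only vs the maximal structure (random slopes for x as well)
m_int = smf.mixedlm("y ~ x", df, groups=df["s"]).fit()
m_max = smf.mixedlm("y ~ x", df, groups=df["s"], re_formula="~x").fit()

# When slope variability is real, the intercept-only model understates the SE
print(m_int.bse["x"], m_max.bse["x"])
```

The intercept-only model effectively treats every subject as sharing one slope, so its standard error for `x` ignores between-subject slope variance; this is the anti-conservativity the paper warns about.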