Reporting random effects in mixed models: what is the correct choice?

Tags: interpretation, mixed model, random-effects-model, reporting

I have a question regarding the relevance of reporting (various) estimates of random effects in mixed models. Having read multiple papers and forum threads on the issue, I don't feel any closer to understanding what should be reported, so I hope this question could be an opportunity to start a discussion as to which values should be reported under what circumstances.

The typical output for a mixed model (lmer in R) provides the variance and standard deviation for each of the random intercepts and/or slopes, as well as the correlations between random intercepts and slopes, if these are specified. Some papers report all of these values, while others report the standard deviation only. Other researchers choose to report the ICC, although this seems to be applicable only to models containing a single random intercept, according to an answer in the thread "Intraclass Correlation Coefficient in mixed model with random slopes".
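To make the ICC point concrete: with a single random intercept, the ICC is the intercept variance divided by the total variance, so it is one number. Once a by-subject random slope is added, the between-subject variance depends on the value of the predictor, so there is no longer a single ICC. The sketch below illustrates this in Python using only the standard library. The function names are hypothetical, the variance components are made-up numbers standing in for what one would read off the lmer output, and the item random effect is ignored for simplicity.

```python
def icc_random_intercept(var_intercept, var_residual):
    """ICC for a model with a single random intercept."""
    return var_intercept / (var_intercept + var_residual)


def icc_with_random_slope(var_intercept, var_slope, corr, var_residual, x):
    """Share of variance due to subjects at predictor value x.

    With a random intercept u0 and a random slope u1, the subject-level
    variance at x is Var(u0 + u1 * x) = var_intercept + 2*x*cov + x**2 * var_slope.
    """
    cov = corr * (var_intercept ** 0.5) * (var_slope ** 0.5)
    between = var_intercept + 2 * x * cov + x ** 2 * var_slope
    return between / (between + var_residual)


# Hypothetical variance components (not taken from any real model).
var_int, var_slp, corr, var_res = 4.0, 1.0, 0.5, 12.0

print(icc_random_intercept(var_int, var_res))  # 0.25

for x in (0.0, 1.0, 2.0):
    # The value changes with x: 0.25 at x=0, about 0.368 at x=1, 0.5 at x=2.
    print(x, round(icc_with_random_slope(var_int, var_slp, corr, var_res, x), 3))
```

Under these assumed numbers the "ICC" ranges from 0.25 to 0.5 depending on where the predictor is evaluated, which is why a single reported ICC can be misleading for random-slope models.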

Still others choose to report R² estimates, although there has been some criticism of applying 'variance explained' to mixed models. Finally, some people calculate p-values to establish whether the influence of random effects is significant, although the discussion about the value of p-values for mixed models is, again, ongoing and seems to favour other solutions.

Importantly, most solutions are discussed in the context of random-intercept models only, so what applies to models that also contain random slopes is very much an open question.

Lastly, most of the discussions and papers seem to relate to situations where the random effects themselves are of interest. What should one do when the random effects are there only to account for repeated measures (random effect of subject) and for the variability between different stimuli (random effect of item)? It seems inappropriate not to report them at all, but also inappropriate to focus on them too much.

The question is of practical interest to me, since I am currently analysing a mixed model containing subject and item as random effects, as well as a by-subject random slope for the effect of predictor X1. However, having spent many hours going through resources available online, I noticed that very little of the information is accessible to people who are new to mixed models or less advanced in linear modelling generally. Since mixed models seem to be gaining in popularity, I hope this thread could be useful for many researchers.

Thank you in advance for all the input!

Best Answer

As in all things statistics, what you choose to report and interpret is dependent on what you are trying to do with the statistic in the first place. If your hypotheses are not related to random effects, then you don't have to report on them. If you are specifically interested in how items function, then you'll probably want to look carefully at their random effects and correlations. If you want to describe a person's random effect (ability level, if this is indeed an IRT problem), then you'd likely spend some time talking about the variability in those random effects.

In this case, it seems like you're interested in certain predictors (or at least one predictor, X1). If so, you might only need to describe the model you ran and then talk about the effects of the predictors. You also need to think about what your reader needs in order to understand the model. Not many people can look at a mixed model table and intuitively understand what all the coefficients mean. It can be helpful to include plots that show the fixed and random effects together, especially when there are random slopes, so that the reported standard deviation can be put in context.
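Alongside a plot, one way to put a random-slope standard deviation in context is to translate it into the range of subject-specific slopes the model implies. The sketch below does this in Python with the standard library only. It assumes, as the mixed model itself does, that subject slopes are normally distributed around the fixed effect. The function name and the numbers are hypothetical, not taken from the question.

```python
from statistics import NormalDist


def slope_summary(fixed_slope, slope_sd, coverage=0.95):
    """Summarise the implied distribution of subject-specific slopes.

    Returns the lower and upper bounds of the central `coverage` range
    and the share of subjects expected to have a negative slope.
    """
    dist = NormalDist(mu=fixed_slope, sigma=slope_sd)
    tail = (1.0 - coverage) / 2.0
    lower = dist.inv_cdf(tail)
    upper = dist.inv_cdf(1.0 - tail)
    share_negative = dist.cdf(0.0)
    return lower, upper, share_negative


# Hypothetical estimates: fixed effect of X1 = 0.5, SD of by-subject slopes = 0.3.
lower, upper, share_negative = slope_summary(0.5, 0.3)
print(f"95% of subject slopes are expected between {lower:.2f} and {upper:.2f}")
print(f"Expected share of subjects with a negative slope: {share_negative:.1%}")
```

A sentence of this kind ("most subjects show a positive effect of X1, but a small share are expected to show the reverse") is often easier for readers to digest than the raw variance table, and it does not require the random effects to be the focus of the paper.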

Now, I think it's important to add the caveat that good modeling requires the researcher to look at all of this information. Even if you have no interest in the specific values, looking at all the random effects and their relationships can be useful for thinking about whether your model is misspecified, identifying potentially interesting patterns, and generally understanding what your model is telling you. Personally, I take a model-based approach where the end goal is to develop a cohesive model that summarizes the data and the data-generating process. Others, as you described, just want to know p-values for certain effects. Again, it comes down to what question you're trying to answer and what information is needed to shed light on that question.