Solved – Allowed comparisons of mixed effects models (random effects primarily)

likelihood-ratio, lme4-nlme, mixed-model, r

I've been looking at mixed effects modelling using the lme4 package in R. I'm primarily using the lmer command, so I'll pose my question through code that uses that syntax. A general, easy version of the question might be: is it OK to compare any two models constructed in lmer using likelihood ratios, provided they are fit to identical datasets? I believe the answer must be "no", but I could be incorrect. I've read conflicting information on whether the random effects have to be the same or not, and on which component of the random effects that refers to. So I'll present a few examples. I'll take them from repeated measures data using word stimuli; something like Baayen (2008) may be useful in interpreting them.

Let's say I have a model with two fixed effects predictors, which we'll call A and B, and some random effects: words, and the subjects that perceived them. I might construct a model like the following.

m <- lmer( y ~ A + B + (1|words) + (1|subjects) )

(note that I've intentionally left out data = and we'll assume I always mean REML = FALSE for clarity's sake)

Now, of the following models, which are OK to compare with a likelihood ratio to the one above and which are not?

m1 <- lmer( y ~ A + B + (A+B|words) + (1|subjects) )
m2 <- lmer( y ~ A + B + (1|subjects) )              
m3 <- lmer( y ~ A + B + (C|words) + (A+B|subjects) )
m4 <- lmer( y ~ A + B + (1|words) )                 
m5 <- lmer( y ~ A * B + (1|subjects) )   

I acknowledge that the interpretation of some of these differences may be difficult, or impossible. But let's put that aside for a second. I just want to know if there's something fundamental in the changes here that precludes the possibility of comparing. I also want to know whether, if LRs are OK, AIC comparisons are OK as well.

Best Answer

Using maximum likelihood, any of these can be compared with AIC. If the fixed effects are the same (m and m1 to m4), either REML or ML fits can be compared, with REML usually preferred. If the fixed effects differ (as in m5), only ML can be used, because the REML criterion depends on the fixed-effects design and so is not comparable across different fixed-effects specifications. However, interpretation is usually difficult when both fixed effects and random effects are changing, so in practice most people recommend changing only one or the other at a time.
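As a rough illustration of the AIC comparison described above, here is a minimal sketch on simulated data. The data frame `d`, the effect sizes, and the design (20 subjects crossed with 30 words) are assumptions made up for this example, not anything from the question; only `lmer` and `AIC` are real functions.

```r
library(lme4)

set.seed(1)

# Hypothetical crossed design: every subject responds to every word.
d <- expand.grid(subjects = factor(1:20), words = factor(1:30))
d$A <- rnorm(nrow(d))
d$B <- rnorm(nrow(d))
subj_eff <- rnorm(20, sd = 1.0)   # assumed subject-level intercept shifts
word_eff <- rnorm(30, sd = 0.8)   # assumed word-level intercept shifts
d$y <- 1 + 0.5 * d$A - 0.3 * d$B +
  subj_eff[as.integer(d$subjects)] +
  word_eff[as.integer(d$words)] +
  rnorm(nrow(d))

# ML fits (REML = FALSE), so models with different fixed effects
# (m5) can sit in the same AIC table as the others.
m  <- lmer(y ~ A + B + (1 | words) + (1 | subjects), data = d, REML = FALSE)
m2 <- lmer(y ~ A + B + (1 | subjects), data = d, REML = FALSE)
m4 <- lmer(y ~ A + B + (1 | words), data = d, REML = FALSE)
m5 <- lmer(y ~ A * B + (1 | subjects), data = d, REML = FALSE)

aic_ml <- AIC(m, m2, m4, m5)
print(aic_ml)

# REML fits: only compare models that share the same fixed effects.
m_reml  <- lmer(y ~ A + B + (1 | words) + (1 | subjects), data = d, REML = TRUE)
m2_reml <- lmer(y ~ A + B + (1 | subjects), data = d, REML = TRUE)

aic_reml <- AIC(m_reml, m2_reml)
print(aic_reml)
```

Note that `anova()` on two `lmer` fits refits them with ML by default (its `refit` argument), which guards against accidentally comparing REML fits that have different fixed effects.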

Using the likelihood ratio test is possible but messy, because the usual chi-squared approximation doesn't hold when testing whether a variance component is zero: under that null hypothesis the parameter sits on the boundary of its allowed range. See Aniko's answer for details. (Kudos to Aniko for both reading the question more carefully than I did and reading my original answer carefully enough to notice that it missed this point. Thanks!)
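One commonly used workaround for the boundary problem is a parametric bootstrap of the likelihood ratio statistic; this is a sketch of that general idea, not necessarily the approach the other answer describes. The simulated data, the small word variance, and the number of bootstrap replicates (`n_sim`) are all assumptions chosen for illustration.

```r
library(lme4)

set.seed(2)

# Hypothetical crossed design with a deliberately small word variance.
d <- expand.grid(subjects = factor(1:20), words = factor(1:30))
d$A <- rnorm(nrow(d))
d$B <- rnorm(nrow(d))
subj_eff <- rnorm(20, sd = 1.0)
word_eff <- rnorm(30, sd = 0.2)
d$y <- 1 + 0.5 * d$A - 0.3 * d$B +
  subj_eff[as.integer(d$subjects)] +
  word_eff[as.integer(d$words)] +
  rnorm(nrow(d))

# Full model and the null model without the word variance component.
m  <- lmer(y ~ A + B + (1 | words) + (1 | subjects), data = d, REML = FALSE)
m2 <- lmer(y ~ A + B + (1 | subjects), data = d, REML = FALSE)

# Observed likelihood ratio statistic and the naive chi-squared p-value,
# which ignores the boundary problem.
obs_lr  <- as.numeric(2 * (logLik(m) - logLik(m2)))
p_naive <- pchisq(obs_lr, df = 1, lower.tail = FALSE)

# Parametric bootstrap: simulate responses under the null model, refit
# both models to each simulated response, and recompute the statistic.
# Many refits of the full model will report a singular fit; that is
# expected when the simulated word variance is truly zero.
n_sim <- 100
sim_y <- simulate(m2, nsim = n_sim, seed = 3)
boot_lr <- vapply(sim_y, function(y_star) {
  as.numeric(2 * (logLik(refit(m, y_star)) - logLik(refit(m2, y_star))))
}, numeric(1))

# Bootstrap p-value: share of simulated statistics at least as large
# as the observed one (with the usual +1 correction).
p_boot <- (1 + sum(boot_lr >= obs_lr)) / (n_sim + 1)

print(c(obs_lr = obs_lr, p_naive = p_naive, p_boot = p_boot))
```

With more replicates the bootstrap p-value becomes more stable, at the cost of run time; 100 is used here only to keep the sketch quick.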

Pinheiro and Bates is the classic reference; it describes the nlme package, but the theory is the same. Well, mostly the same: Doug Bates has changed his recommendations on inference since writing that book, and the new recommendations are reflected in the lme4 package. But that's more than I want to get into here. A more readable reference is Weiss (2005), Modeling Longitudinal Data.