Solved – Comparing mixed-effects and fixed-effects models (testing significance of random effects)

hypothesis-testing, lme4-nlme, mixed-model, r

Given three variables, y and x, which are positive and continuous, and z, which is categorical, I have two candidate models given by:

fit.me <- lmer( y ~ 1 + x + ( 1 + x | factor(z) ) )

and

fit.fe <- lm( y ~ 1 + x )

I hope to compare these models to determine which is more appropriate. It seems to me that, in some sense, fit.fe is nested within fit.me. Typically, when this general scenario holds, a likelihood-ratio (chi-squared) test can be performed. In R, we can perform this test with the following command:

anova(fit.fe, fit.me)

When both models contain random effects (i.e., both are generated by lmer from the lme4 package), the anova() command works fine. Because the null hypothesis places variance parameters on the boundary of the parameter space, it is normally advisable to calibrate the resulting chi-squared statistic by simulation rather than against its nominal asymptotic distribution; the statistic itself is still what the simulation procedure uses.
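For instance, the pbkrtest package provides a ready-made parametric bootstrap for nested merMod fits. A minimal sketch, using hypothetical nested models fit.small and fit.big (not the fits from the question), refit with ML so the likelihood-ratio test is well defined:

library(lme4)
library(pbkrtest)

# Hypothetical nested mixed models (ML fits, so the LRT is well defined)
fit.small <- lmer(y ~ 1 + x + (1 | factor(z)), REML = FALSE)
fit.big   <- lmer(y ~ 1 + x + (1 + x | factor(z)), REML = FALSE)

# Parametric bootstrap of the LRT: simulate from fit.small, refit both models
PBmodcomp(fit.big, fit.small, nsim = 1000)

PBmodcomp() simulates response vectors from the smaller model, refits both models to each simulated response, and reports a bootstrap p-value alongside the naive chi-squared one.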

When both models contain only fixed effects, this approach and the associated anova() command work fine.

However, when one model contains random effects and the reduced model contains only fixed effects, as in the scenario above, the anova() command doesn't work.

More specifically, I get the following error:

 > anova(fit.fe, fit.me)
 Error: $ operator not defined for this S4 class

Is there anything wrong with using the chi-squared approach from above (with simulation)? Or is this simply a matter of anova() not knowing how to compare linear models generated by different functions?

In other words, would it be appropriate to compute the test statistic manually from the two models? If so, what are the appropriate degrees of freedom for comparing them? By my reckoning:

$$ F = \frac{(SSE_{reduced} - SSE_{full})/(p - k)}{SSE_{full}/(n - p - 1)} \sim F_{p-k,\, n-p-1} $$

We are estimating two parameters in the fixed-effects model (slope and intercept) and two more parameters (the variances of the random slope and the random intercept) in the mixed-effects model. Typically, the intercept parameter isn't counted in the degrees-of-freedom computation, so that implies $k=1$ and $p=k+2=3$. Having said that, I'm not sure whether the variance parameters for the random effects should be included in the degrees-of-freedom computation at all. The variance estimates for fixed-effect parameters are not counted, but I believe that is because the fixed-effect parameters are treated as unknown constants, whereas the random effects are treated as unobserved random variables. I would appreciate some assistance on this issue.
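As a sanity check on the parameter counting, one can ask R how many parameters each fit's log-likelihood carries (assuming the fit.fe and fit.me objects from above). Note that, by default, lmer also estimates a covariance between the random intercept and the random slope, which is counted too:

attr(logLik(fit.fe), "df")  # lm: intercept + slope + residual variance = 3
attr(logLik(fit.me), "df")  # lmer: 2 fixed effects + 2 variances + 1 covariance
                            #       + residual variance = 6
                            # (parameters are counted even for a REML fit, though
                            #  REML and ML log-likelihoods are not comparable)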

Finally, does anybody have a more appropriate (R-based) solution for comparing these models?

Best Answer

Technically, you can get it working just by switching the order of the arguments:

> anova(fit.me, fit.fe) 

will work just fine. If you pass the object generated by lmer first, the anova method for merMod objects will be called instead of anova.lm (which does not know how to handle merMod objects). See:

?anova.merMod
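The order matters because method dispatch is driven by the class of the first argument; a quick way to see which method each object selects (assuming the fits from the question):

class(fit.fe)  # "lm"      -> anova.lm, which cannot handle merMod objects
class(fit.me)  # "lmerMod" -> lme4's anova method, which accepts lm fits too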

That said, choosing between a mixed model and a fixed-effects model is a modeling decision that needs to take the experimental design into account, not a model-selection problem. See @BenBolker's GLMM FAQ, https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#testing-significance-of-random-effects, for more details:

Consider not testing the significance of random effects.
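If you nonetheless want a test, the simulation the question alludes to is straightforward to write by hand. A minimal parametric-bootstrap sketch, assuming the fit.fe and fit.me objects from the question, with the mixed model refit under ML so the two log-likelihoods are comparable:

library(lme4)

# Refit the mixed model with ML so its log-likelihood is comparable to lm's
fit.me.ml <- lmer(y ~ 1 + x + (1 + x | factor(z)), REML = FALSE)
obs.lrt <- as.numeric(2 * (logLik(fit.me.ml) - logLik(fit.fe)))

# Simulate under the null (no random effects) and refit both models each time
nsim <- 1000
boot.lrt <- replicate(nsim, {
  y.sim <- simulate(fit.fe)[[1]]
  ll0 <- logLik(lm(y.sim ~ 1 + x))
  ll1 <- logLik(refit(fit.me.ml, y.sim))
  as.numeric(2 * (ll1 - ll0))
})

mean(boot.lrt >= obs.lrt)  # simulated p-value for the random effects

This sidesteps the boundary problem, since the null distribution of the statistic is generated by simulation rather than assumed to be chi-squared. Expect occasional convergence warnings in the refits, as many simulated data sets push the variance estimates to zero.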
