Solved – How can mixed-effect and fixed-effect generalised linear models be compared using BIC

aic, generalized linear model, mixed model

Assuming I have two generalised linear models, one with only fixed effects and
another with both fixed and random effects, how can I compare which model is most
parsimonious using AIC/BIC? I can only fit mixed-effects models with glmer, since
it returns an error when I try to fit a model without random effects. I can only fit
fixed-effects models with glm, since it does not allow random effects. This
leaves me with two sets of models: one set fitted with glmer and another set
fitted with glm.

The accepted answer to the question "Can AIC compare across different types of
model?" states (with my own emphasis):

It depends. AIC is a function of the log likelihood. If both types of model
compute the log likelihood the same way (i.e. include the same constant) then
yes you can, if the models are nested.

I'm reasonably certain that glm() and lmer() don't use comparable log
likelihoods.

The point about nested models is also up for discussion. Some say AIC is only
valid for nested models as that is how the theory is presented/worked
through. Others use it for all sorts of comparisons.

The DRAFT r-sig-mixed-models FAQ states that:

Can I use AIC for mixed models? How do I count the number of degrees of
freedom for a random effect?

Yes, with caution.

The page then lists a number of caveats and suggestions, such as using modified
AIC calculations or fitting with REML=FALSE.
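As far as I understand it, the REML=FALSE point applies to linear mixed models fitted with lmer(): fit by maximum likelihood before comparing information criteria. A minimal sketch of what I take that to mean (made-up data and variable names):

library(lme4)
# Fit by maximum likelihood (REML = FALSE) so that logLik/AIC are
# comparable across models with different fixed effects.
# (made-up data and variable names)
m_ml <- lmer(y ~ x + (1 | group), data = mydata, REML = FALSE)
AIC(m_ml)
BIC(m_ml)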

As a novice in this field, I have found reading about both the use of AIC/BIC for
comparing models and the use of mixed-effects models for group effects very
eye-opening; both feel like intuitive, "common sense" approaches. But putting the
two together has left me feeling like I've fallen through the cracks somewhat:
there doesn't seem to be a clear answer. Surely, if I follow the AIC/BIC parsimony
approach, I should compare the fixed-effect model with the mixed-effect model
using AIC to see whether the random effect deserves to be there?
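Concretely, this is the kind of comparison I have in mind (a rough sketch with made-up data and variable names):

library(lme4)
# fixed effects only vs. fixed effects plus a random intercept
# (made-up data and variable names)
fit_fixed <- glm(y ~ x, data = mydata, family = binomial)
fit_mixed <- glmer(y ~ x + (1 | group), data = mydata, family = binomial)
AIC(fit_fixed, fit_mixed)  # does the random intercept earn its extra parameter?
BIC(fit_fixed, fit_mixed)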

Am I missing something obvious? How are mixed-effect and fixed-effect
generalised linear models usually compared?

Best Answer

As far as I can tell, you can compare the likelihoods of glmer() and glm() models, at least for family=binomial (I haven't tested this for other families). If the variance components are estimated to be zero, then the likelihoods should be identical, and that is indeed what happens. Here is an example to illustrate this:

dat <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 
6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 
9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 
12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 
14L, 14L, 15L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 16L, 17L, 
17L, 17L, 17L, 17L, 18L, 18L, 18L, 18L, 18L, 19L, 19L, 19L, 19L, 
19L, 20L, 20L, 20L, 20L, 20L), xi = c(0, 0, 0, 0, 0, -1, -1, 
-1, -1, -1, -1, -1, -1, -1, -1, 0.8, 0.8, 0.8, 0.8, 0.8, -0.9, 
-0.9, -0.9, -0.9, -0.9, 0.7, 0.7, 0.7, 0.7, 0.7, 0.1, 0.1, 0.1, 
0.1, 0.1, -1.7, -1.7, -1.7, -1.7, -1.7, 0.3, 0.3, 0.3, 0.3, 0.3, 
-2.8, -2.8, -2.8, -2.8, -2.8, 2.7, 2.7, 2.7, 2.7, 2.7, -0.1, 
-0.1, -0.1, -0.1, -0.1, -0.2, -0.2, -0.2, -0.2, -0.2, 2, 2, 2, 
2, 2, -0.6, -0.6, -0.6, -0.6, -0.6, 1.1, 1.1, 1.1, 1.1, 1.1, 
0.2, 0.2, 0.2, 0.2, 0.2, -0.4, -0.4, -0.4, -0.4, -0.4, 2, 2, 
2, 2, 2, -1.1, -1.1, -1.1, -1.1, -1.1), xij = c(1.1, 1.1, 0.2, 
0.9, 0.4, -2.1, -0.4, -0.7, 0, 0.8, -0.4, 0.2, -1, 0, -1.2, 1.1, 
1.9, 0.9, -1.4, -0.8, -0.3, -0.7, 0.7, -1.2, 1.1, -1.5, 0.3, 
-1.7, -2, 0.2, 2, -0.5, -1.2, -0.2, -2.3, -0.6, -0.6, -1.6, -0.4, 
-1.5, -0.5, 0.8, 0.1, -0.3, -0.7, 0.7, 0.3, -0.4, 0.4, 0.5, -0.8, 
0.6, 0.3, 0.6, 0.2, -0.8, 0, -2.3, 0.5, 0, 0.9, 0.6, 2.2, 0.6, 
-0.3, 0.3, 0.5, -2.2, 2, -0.6, -0.7, -0.3, -0.7, 1.7, -0.7, -0.3, 
0.6, -0.9, -1.9, -0.5, 1.6, -0.5, 0.4, 1.1, 0.5, -1.8, 1.2, 1.7, 
-1.1, 0.2, -0.6, -1.1, 2.1, 0.4, 0.9, 0.5, -2, 1.6, 0.1, 0.7), 
    yi = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 
    0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L)), .Names = c("id", 
"xi", "xij", "yi"), row.names = c(NA, -100L), class = "data.frame")

library(lme4)

res0 <- glm(yi ~ xi + xij, data=dat, family=binomial)
summary(res0)

res1 <- glmer(yi ~ xi + xij + (1 | id), data=dat, family=binomial)
summary(res1)

logLik(res0)
logLik(res1)
anova(res1, res0)

The last three lines yield:

> logLik(res0)
'log Lik.' -29.96427 (df=3)
> logLik(res1)
'log Lik.' -29.96427 (df=4)
> 
> anova(res1, res0)
Data: dat
Models:
res0: yi ~ xi + xij
res1: yi ~ xi + xij + (1 | id)
     Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
res0  3 65.929 73.744 -29.964   59.929                        
res1  4 67.929 78.349 -29.964   59.929     0      1          1

So the (log-)likelihoods are identical, since the id-level variance component is estimated to be zero. The AIC of the mixed-effects model is therefore 2 points larger, as expected, since that model has one more parameter.
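For completeness, those AIC values are just -2*logLik + 2*df evaluated at the shared log-likelihood:

# AIC = -2 * logLik + 2 * df, using the common log-likelihood of -29.96427
-2 * (-29.96427) + 2 * 3  # glm:   3 parameters -> 65.929
-2 * (-29.96427) + 2 * 4  # glmer: 4 parameters -> 67.929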

One thing to note, though: the default for glmer() is nAGQ=1, which means the Laplace approximation is used. Let's use "proper" adaptive Gauss-Hermite quadrature instead:

res1 <- glmer(yi ~ xi + xij + (1 | id), data=dat, family=binomial, nAGQ=7)
logLik(res0)
logLik(res1)
anova(res1, res0)

This yields:

>     logLik(res0)
'log Lik.' -29.96427 (df=3)
>     logLik(res1)
'log Lik.' -29.96427 (df=4)
>     anova(res1, res0)
Error in anova.merMod(res1, res0) : 
  GLMMs with nAGQ>1 have log-likelihoods incommensurate with glm() objects

The variance component is still estimated to be zero and the (log-)likelihoods are identical. But anova() now throws an error indicating that these models should not be compared against each other.
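If you do want anova() to carry out the side-by-side comparison, you have to stay with the default Laplace approximation (nAGQ=1), as in the first fit above; whether that comparison is then meaningful is of course still subject to the caveats mentioned in the question:

# Refit with the default Laplace approximation (nAGQ = 1), whose
# log-likelihood anova() is willing to compare with the glm() fit.
res1 <- glmer(yi ~ xi + xij + (1 | id), data=dat, family=binomial, nAGQ=1)
anova(res1, res0)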