Solved – Different p-values for fixed effects in summary() of glmer() and likelihood ratio test comparison in R

hypothesis testing, lme4-nlme, mixed model, p-value, r

I'm using glmer() with a binomial response variable. My optimal model has two fixed effects (flow and DNA) which in summary() show a non-significant p value but when I remove each fixed effect in turn from the model the likelihood ratio test comparing the two models shows a significant p value. I'm struggling to understand (1) if this is normal, and (2) how to report the results if the explanatory variables "flow" and "DNA" are important but their p values in the model are well above 0.05?

Optimal model:

a25 <- glmer(Status_qpcr~(1|Root)+Flow+DNA,
             family=binomial, data=spore)
summary(a25)

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']  
Family: binomial  ( logit ) 
Formula: Status_qpcr ~ (1 | Root) + Flow + DNA   
Data: spore
      AIC      BIC   logLik deviance df.resid 
     72.9     81.0    -32.4     64.9       52 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.9318 -0.8163  0.4435  0.6848  1.6133 

Random effects:  
  Groups Name        Variance Std.Dev.  
  Root   (Intercept) 0.3842   0.6199   
  Number of obs: 56, groups:  Root, 9

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)   
(Intercept) -0.97752    0.79252  -1.233    0.217   
Flow         3.82779    2.27165   1.685    0.092 . 
DNA          0.01616    0.01039   1.556    0.120  
--- 
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
     (Intr) Flow  
Flow -0.775       
DNA  -0.576  0.227

Likelihood ratio test:

a26 <- update(a25,~.-DNA)
anova(a25,a26)

Data: spore 
Models: 
    a26: Status_qpcr ~ (1 | Root) + Flow 
    a25: Status_qpcr ~ (1 | Root) + Flow + DNA
    Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)   
a26  3 74.802 80.878 -34.401   68.802                            
a25  4 72.897 80.998 -32.448   64.897 3.9049      1    0.04815 *

a27 <- update(a25,~.-Flow)
anova(a25,a27)

Data: spore 
Models: 
    a27: Status_qpcr ~ (1 | Root) + DNA 
    a25: Status_qpcr ~ (1 | Root) + Flow + DNA
    Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
a27  3 78.440 84.723 -36.220   72.440                             
a25  4 72.897 80.998 -32.448   64.897 7.5427      1   0.006025 **

Best Answer

It looks like you are seeing the difference between Wald p-values (based on the curvature of the log-likelihood surface at the maximum likelihood estimate) and likelihood ratio test p-values (based on comparisons between the full and reduced models).

Take a look at tpr <- profile(a25,which="beta_"); lattice::xyplot(tpr). You should see that the lines are far from straight (straight lines would indicate a log-quadratic likelihood surface, which is what Wald p-values assume).
Compare the results of confint(a25,which="beta_") (profile likelihood intervals) and confint(a25,which="beta_",method="Wald"); they should be quite different.

LRT CI/p-values are essentially always better than the Wald equivalents (but much slower to compute, which is why Wald p-values are the default in summary()).
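The gap between the two tests can be checked directly from the numbers printed in the question. The following is a minimal sketch in base R that only re-uses the z values from summary(a25) and the Chisq values from the two anova() calls; the variable names are made up for illustration. If the log-likelihood were exactly quadratic, the squared Wald z statistic would be roughly equal to the LRT chi-square statistic.

```r
# z values copied from the "Fixed effects" table of summary(a25)
z_wald <- c(Flow = 1.685, DNA = 1.556)
# Chisq values copied from anova(a25, a27) and anova(a25, a26)
chisq_lrt <- c(Flow = 7.5427, DNA = 3.9049)

# Two-sided Wald p-values from the standard normal distribution
p_wald <- 2 * pnorm(-abs(z_wald))
# LRT p-values from a chi-square distribution with 1 df
p_lrt <- pchisq(chisq_lrt, df = 1, lower.tail = FALSE)

# Under a quadratic log-likelihood, z^2 and Chisq would roughly agree
print(round(cbind(z_squared = z_wald^2, chisq_lrt, p_wald, p_lrt), 4))
```

Here the squared z values (about 2.84 for Flow and 2.42 for DNA) are well below the LRT statistics (7.54 and 3.90), which is the numerical footprint of a likelihood surface that is not quadratic.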

Related Solutions

Model Selection in Longitudinal Data – Testing the Need for Random-Effects Terms in Longitudinal Data Analysis

The likelihood ratio test is slightly incorrect (in general, conservative) for testing the significance of a random effect, because the null value ($\sigma^2=0$) is on the boundary of the feasible space, but in this case there is overwhelmingly strong evidence against the null hypothesis. The model with random effects of individual is 15713-6772=8941 log-likelihood units better; twice the difference in log-likelihood is approximately $\chi^2$ distributed under the null hypothesis, so the direct p-value calculation would give you ...

pchisq(2*8941,df=1,lower.tail=FALSE,log.p=TRUE)/log(10)
## -3885.251

... a p-value of approximately $10^{-3885}$.
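Because the null value sits on the boundary, a commonly used approximation for a single variance component treats the null distribution of the LRT statistic as an equal mixture of $\chi^2_0$ and $\chi^2_1$, which amounts to halving the naive p-value. The sketch below applies that adjustment on the log scale to the same log-likelihood difference. The halving rule is an assumption added here, not part of the original calculation, and the variable names are illustrative.

```r
# Difference in log-likelihood between the two models (from the text above)
ll_diff <- 15713 - 6772
# Naive log p-value, treating 2 * ll_diff as chi-square with 1 df
log_p_naive <- pchisq(2 * ll_diff, df = 1, lower.tail = FALSE, log.p = TRUE)
# Boundary-adjusted log p-value: halve p, i.e. add log(0.5) on the log scale
log_p_adj <- log(0.5) + log_p_naive
# Report both as base-10 exponents
print(c(naive = log_p_naive, adjusted = log_p_adj) / log(10))
```

The adjustment shifts the exponent by only log10(2), about 0.3, so the conclusion here does not change.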

You should really consider a random-slope model (random = ~time|id) as well.

Update: relative to the random-intercept model, the random-slopes model is again much better. The improvement is now 935 log-likelihood units; the equivalent calculation to the one above rejects the null hypothesis (that among-individual variation in slope is zero) with a p-value of "only" $10^{-408}$.
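The exponent quoted for the random-slopes comparison can be reproduced with the same pchisq() call. This sketch assumes 1 degree of freedom, as in the earlier calculation. A random-slope term such as ~time|id typically adds a slope variance and an intercept-slope covariance, so a 2-df version is shown alongside for comparison; both df choices are assumptions, since the answer does not state which was used.

```r
# Improvement in log-likelihood of the random-slopes model (from the text)
ll_diff_slope <- 935
# log10 p-value assuming 1 df, as in the earlier calculation
log10_p_df1 <- pchisq(2 * ll_diff_slope, df = 1,
                      lower.tail = FALSE, log.p = TRUE) / log(10)
# Same calculation assuming 2 df (slope variance plus covariance)
log10_p_df2 <- pchisq(2 * ll_diff_slope, df = 2,
                      lower.tail = FALSE, log.p = TRUE) / log(10)
print(c(df1 = log10_p_df1, df2 = log10_p_df2))
```

Either way the exponent is in the range of several hundred below zero, so the choice of df does not affect the conclusion.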

Mixed Model – How to Compare Mixed-Effect and Fixed-Effect Generalised Linear Models Using BIC

As far as I can tell, you can compare the likelihoods of glmer() and glm() models, at least for family=binomial (haven't tested this for other families). If the variance components are estimated to be zero, then the likelihood should be identical and that is clearly the case. Here is an example to illustrate this:

dat <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 
6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 
9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 
12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 
14L, 14L, 15L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 16L, 17L, 
17L, 17L, 17L, 17L, 18L, 18L, 18L, 18L, 18L, 19L, 19L, 19L, 19L, 
19L, 20L, 20L, 20L, 20L, 20L), xi = c(0, 0, 0, 0, 0, -1, -1, 
-1, -1, -1, -1, -1, -1, -1, -1, 0.8, 0.8, 0.8, 0.8, 0.8, -0.9, 
-0.9, -0.9, -0.9, -0.9, 0.7, 0.7, 0.7, 0.7, 0.7, 0.1, 0.1, 0.1, 
0.1, 0.1, -1.7, -1.7, -1.7, -1.7, -1.7, 0.3, 0.3, 0.3, 0.3, 0.3, 
-2.8, -2.8, -2.8, -2.8, -2.8, 2.7, 2.7, 2.7, 2.7, 2.7, -0.1, 
-0.1, -0.1, -0.1, -0.1, -0.2, -0.2, -0.2, -0.2, -0.2, 2, 2, 2, 
2, 2, -0.6, -0.6, -0.6, -0.6, -0.6, 1.1, 1.1, 1.1, 1.1, 1.1, 
0.2, 0.2, 0.2, 0.2, 0.2, -0.4, -0.4, -0.4, -0.4, -0.4, 2, 2, 
2, 2, 2, -1.1, -1.1, -1.1, -1.1, -1.1), xij = c(1.1, 1.1, 0.2, 
0.9, 0.4, -2.1, -0.4, -0.7, 0, 0.8, -0.4, 0.2, -1, 0, -1.2, 1.1, 
1.9, 0.9, -1.4, -0.8, -0.3, -0.7, 0.7, -1.2, 1.1, -1.5, 0.3, 
-1.7, -2, 0.2, 2, -0.5, -1.2, -0.2, -2.3, -0.6, -0.6, -1.6, -0.4, 
-1.5, -0.5, 0.8, 0.1, -0.3, -0.7, 0.7, 0.3, -0.4, 0.4, 0.5, -0.8, 
0.6, 0.3, 0.6, 0.2, -0.8, 0, -2.3, 0.5, 0, 0.9, 0.6, 2.2, 0.6, 
-0.3, 0.3, 0.5, -2.2, 2, -0.6, -0.7, -0.3, -0.7, 1.7, -0.7, -0.3, 
0.6, -0.9, -1.9, -0.5, 1.6, -0.5, 0.4, 1.1, 0.5, -1.8, 1.2, 1.7, 
-1.1, 0.2, -0.6, -1.1, 2.1, 0.4, 0.9, 0.5, -2, 1.6, 0.1, 0.7), 
    yi = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 
    0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L)), .Names = c("id", 
"xi", "xij", "yi"), row.names = c(NA, -100L), class = "data.frame")

library(lme4)

res0 <- glm(yi ~ xi + xij, data=dat, family=binomial)
summary(res0)

res1 <- glmer(yi ~ xi + xij + (1 | id), data=dat, family=binomial)
summary(res1)

logLik(res0)
logLik(res1)
anova(res1, res0)

The last three lines yield:

> logLik(res0)
'log Lik.' -29.96427 (df=3)
> logLik(res1)
'log Lik.' -29.96427 (df=4)
> 
> anova(res1, res0)
Data: dat
Models:
res0: yi ~ xi + xij
res1: yi ~ xi + xij + (1 | id)
     Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
res0  3 65.929 73.744 -29.964   59.929                        
res1  4 67.929 78.349 -29.964   59.929     0      1          1

So, the (log)-likelihoods are identical, since the id level variance component is estimated to be zero. The AIC value of the mixed-effects model is therefore 2 points larger, as expected (since the model has one more parameter).
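The AIC and BIC columns can be recomputed from the log-likelihoods, which makes the 2-point gap explicit. This is a minimal sketch in base R using the values printed above and n = 100 observations (the number of rows in dat); the variable names are illustrative.

```r
# Values taken from the output above
ll <- -29.96427              # shared log-likelihood of res0 and res1
k  <- c(res0 = 3, res1 = 4)  # number of estimated parameters (Df column)
n  <- 100                    # number of observations in dat

# AIC = -2 * logLik + 2 * k;  BIC = -2 * logLik + log(n) * k
aic <- -2 * ll + 2 * k
bic <- -2 * ll + log(n) * k

print(round(rbind(AIC = aic, BIC = bic), 3))
# Differences between the two models: 2 for AIC, log(100) for BIC
print(c(AIC = unname(diff(aic)), BIC = unname(diff(bic))))
```

With identical log-likelihoods, BIC penalizes the extra variance parameter by log(100), about 4.6, which is more than the AIC penalty of 2.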

One thing to note though: The default for glmer() is nAGQ=1, which means that the Laplace approximation is used. Let's use "proper" adaptive quadrature:

res1 <- glmer(yi ~ xi + xij + (1 | id), data=dat, family=binomial, nAGQ=7)
logLik(res0)
logLik(res1)
anova(res1, res0)

This yields:

>     logLik(res0)
'log Lik.' -29.96427 (df=3)
>     logLik(res1)
'log Lik.' -29.96427 (df=4)
>     anova(res1, res0)
Error in anova.merMod(res1, res0) : 
  GLMMs with nAGQ>1 have log-likelihoods incommensurate with glm() objects

The variance component is still estimated to be zero and the (log)-likelihoods are identical. But anova() throws an error indicating that these models should not be compared against each other.
