Solved – Differences between summary and anova function for multilevel (lmer) model

Tags: hypothesis testing, lme4-nlme, mixed model, r

I've been fitting some multilevel models with the lmer function in R and trying different ways of testing the significance of the fixed effects. I have found that the summary() and anova() functions from lmerTest yield different results. My understanding is that the anova function should test whether any of my groups differs from the intercept, whereas the summary function displays the significance of the deviation of each individual group from the intercept. However, the anova function does not return a significant interaction effect for Origin:Fert, whereas the summary function reports that OriginCO:FertUnfertilized is significant.

What gives? Am I missing something here?

> mod_rs_Origin_lmer_nelder=lmer(rs_feedback ~ 
+                                    Date_of_Emergence  + Origin*Fert +  (1 | Soil_ID), data=data,
+                                  control = lmerControl(optimizer ="Nelder_Mead"))
> anova(mod_rs_Origin_lmer_nelder, type=2)
Analysis of Variance Table of type II  with  Satterthwaite 
approximation for degrees of freedom
                  Sum Sq Mean Sq NumDF DenDF F.value   Pr(>F)   
Date_of_Emergence 1.3155 1.31552     1   148  4.6081 0.033450 * 
Origin            2.6584 0.66461     4   148  2.3281 0.058853 . 
Fert              2.9384 2.93838     1   148 10.2928 0.001637 **
Origin:Fert       2.1927 0.54817     4   148  1.9202 0.110035   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> summary(mod_rs_Origin_lmer_nelder)
Linear mixed model fit by REML t-tests use Satterthwaite approximations to degrees of  freedom
[lmerMod]
Formula: rs_feedback ~ Date_of_Emergence + Origin * Fert + (1 | Soil_ID)
   Data: data
Control: lmerControl(optimizer = "Nelder_Mead")

REML criterion at convergence: 272

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.6043 -0.6106 -0.2517  0.4541  4.7311 

Random effects:
 Groups   Name        Variance Std.Dev.
 Soil_ID  (Intercept) 0.0000   0.0000  
 Residual             0.2855   0.5343  
Number of obs: 159, groups:  Soil_ID, 4

Fixed effects:
                            Estimate Std. Error         df t value Pr(>|t|)   
(Intercept)                 0.225550   0.134766 148.000000   1.674  0.09631 . 
Date_of_Emergence          -0.007822   0.003644 148.000000  -2.147  0.03345 * 
OriginCO                   -0.114934   0.180923 148.000000  -0.635  0.52624   
OriginF                    -0.197089   0.190659 148.000000  -1.034  0.30295   
OriginQM                   -0.027523   0.187279 148.000000  -0.147  0.88336   
OriginQR                   -0.030363   0.178115 148.000000  -0.170  0.86487   
FertUnfertilized            0.524999   0.186802 148.000000   2.810  0.00562 **
OriginCO:FertUnfertilized  -0.577240   0.261952 148.000000  -2.204  0.02910 * 
OriginF:FertUnfertilized    0.043589   0.281231 148.000000   0.155  0.87704   
OriginQM:FertUnfertilized  -0.421518   0.270105 148.000000  -1.561  0.12076   
OriginQR:FertUnfertilized  -0.248637   0.258104 148.000000  -0.963  0.33696   

Best Answer

It looks like there are 2 levels of Fert and 5 levels of Origin, correct?

I believe the anova() output shows the omnibus test, while the summary() function shows regression coefficients that represent specific contrasts, which are defined by the reference group (i.e., whatever level is first).
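To make the distinction concrete, the two null hypotheses can be written out as below. The β symbols are my own notation for the interaction coefficients listed in the summary() output, not something either function prints.

```latex
% Omnibus test reported by anova() for Origin:Fert (NumDF = 4):
H_0^{\text{omnibus}}:\;
  \beta_{\text{CO:Unfert}} = \beta_{\text{F:Unfert}}
  = \beta_{\text{QM:Unfert}} = \beta_{\text{QR:Unfert}} = 0

% Single-coefficient t-test reported by summary() for OriginCO:FertUnfertilized:
H_0^{\text{single}}:\; \beta_{\text{CO:Unfert}} = 0
```

For a term with a single coefficient the two tests coincide: Date_of_Emergence has t = -2.147 in summary() and F = 4.6081 in anova(), which is t squared up to rounding, and both report p = 0.03345.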

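Origin:Fert is the omnibus test of the interaction: it tests all four interaction coefficients jointly, on 4 numerator degrees of freedom. OriginCO:FertUnfertilized tests only whether the difference between the reference Origin and Origin CO differs between the Fertilized and Unfertilized levels of Fert. One contrast can therefore be significant on its own (p = 0.029) while the joint test, which also includes the three non-significant contrasts, is not (p = 0.110).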
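Here is a small sketch of why the coefficient tests depend on the reference level while the omnibus test does not. It is not your model: it uses simulated data, a hypothetical reference level "A", and plain lm() instead of lmer() so that it runs without extra packages. The same logic applies to the fixed effects of a mixed model.

```r
# Simulated data; all names and values here are hypothetical.
set.seed(1)
d <- expand.grid(Origin = c("A", "CO", "F", "QM", "QR"),
                 Fert   = c("Fertilized", "Unfertilized"),
                 rep    = 1:16)
d$y <- rnorm(nrow(d))

# Omnibus interaction test: compares the models with and without
# Origin:Fert, i.e. all 4 interaction coefficients at once.
omnibus_F <- function(dat) {
  full    <- lm(y ~ Origin * Fert, data = dat)
  reduced <- lm(y ~ Origin + Fert, data = dat)
  anova(reduced, full)$F[2]
}

# Same data, but with CO instead of A as the reference level of Origin.
d2 <- d
d2$Origin <- relevel(d2$Origin, ref = "CO")

F_ref_A  <- omnibus_F(d)
F_ref_CO <- omnibus_F(d2)

# Coefficient tables: each interaction row is a contrast against the
# reference level, so the rows change when the reference level changes.
coef_ref_A  <- coef(summary(lm(y ~ Origin * Fert, data = d)))
coef_ref_CO <- coef(summary(lm(y ~ Origin * Fert, data = d2)))

print(c(F_ref_A, F_ref_CO))        # omnibus F under each coding
print(rownames(coef_ref_A)[7:10])  # interaction contrasts vs. A
print(rownames(coef_ref_CO)[7:10]) # interaction contrasts vs. CO
```

The two omnibus F values agree up to rounding, because releveling only reparameterizes the same model. The individual interaction rows, and hence their t-tests, answer different questions under each coding.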

The authors of the lme4 package chose not to include degrees of freedom or p-values in their output, because estimating the denominator degrees of freedom is tricky for multilevel models. The values you see come from lmerTest, which adds them using the Satterthwaite approximation. The lme4 authors instead suggested comparing nested models. If you are looking for the omnibus effect of the interaction, for example, I would compare a model with it and a model without it:

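library(lme4)

# Reduced model: main effects only, no Origin:Fert interaction
mod0 <- lmer(rs_feedback ~ Date_of_Emergence + Origin + Fert + (1 | Soil_ID),
             data = data, control = lmerControl(optimizer = "Nelder_Mead"))

# Full model: adds the Origin:Fert interaction
mod1 <- lmer(rs_feedback ~ Date_of_Emergence + Origin * Fert + (1 | Soil_ID),
             data = data, control = lmerControl(optimizer = "Nelder_Mead"))

# The models differ in fixed effects, so keep the default refit = TRUE:
# REML likelihoods are not comparable here, and anova() refits both with ML.
anova(mod0, mod1)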