Solved – Output of fixed effects summary in lmerTest in R and post-hoc tests

Tags: lme4-nlme, post-hoc, r

I'm doing a two-factor ANOVA using the lmerTest package. Each factor has multiple levels. When one (or more) of the effects is significant, I would like to run a post-hoc test to determine which levels differ from each other. I set up the model as:

library(lmerTest)
model <- lmer(measure ~ factor * experiment + (1 | subject_id), data = data)
print(anova(model))

The output appears as follows:

Analysis of Variance Table of type 3  with  Satterthwaite 
approximation for degrees of freedom
                  Df  Sum Sq Mean Sq F value  Denom    Pr(>F)    
factor             3 2388.82  796.27 16.3140  9.999 0.0003527 ***
experiment         3  254.11   84.70  2.7689 30.000 0.0588323 .  
factor:experiment  9 1301.40  144.60  2.9626 30.000 0.0121071 *  

At this point, I can inspect the table and see which effects are significant. However, when I look at the summary of the model:

summary(model)

I get a much more detailed output (truncated to the relevant portion for clarity):

                                t value Pr(>|t|)    
(Intercept)                      -5.600 8.99e-06 ***
factorLevel1                      0.289  0.77522    
factorLevel2                      2.855  0.00871 ** 
factorLevel3                     -6.535 9.00e-07 ***
experimentSession1               -0.747  0.46086    
experimentSession2               -0.825  0.41596    
experimentSession3                0.317  0.75354    
factorLevel1:experimentSession1  -1.297  0.20454    
factorLevel2:experimentSession1  -0.903  0.37376    
factorLevel3:experimentSession1   3.025  0.00506 ** 
factorLevel1:experimentSession2   0.591  0.55917    
factorLevel2:experimentSession2  -0.777  0.44341    
factorLevel3:experimentSession2   3.027  0.00504 ** 
factorLevel1:experimentSession3  -0.123  0.90269    
factorLevel2:experimentSession3  -1.060  0.29770    

How do I interpret these values? Is this telling me that the coefficient for factorLevel2 is significantly different from 0? If I then do a multiple comparison of means:

library(multcomp)
print(summary(glht(model, linfct=mcp(experiment="Tukey", factor="Tukey"))))

I get the following output:

experiment: Session1 - Session0 == 0   -0.747   0.9829    
experiment: Session2 - Session0 == 0   -0.825   0.9718    
experiment: Session3 - Session0 == 0    0.317   0.9999    
experiment: Session2 - Session1 == 0   -0.078   1.0000    
experiment: Session3 - Session1 == 0    1.064   0.9079    
experiment: Session3 - Session2 == 0    1.142   0.8759    
factor: Level2 - Level1 == 0            0.289   0.9999    
factor: Level3 - Level1 == 0            2.855   0.0425 *  
factor: Level4 - Level1 == 0           -6.535   <0.001 ***
factor: Level3 - Level2 == 0            1.517   0.6542    
factor: Level4 - Level2 == 0           -5.395   <0.001 ***
factor: Level4 - Level3 == 0           -8.341   <0.001 ***

This is more understandable, as it tells me which pairs of levels differ significantly. But I'm unsure how to interpret the table produced by summary() and how its numbers relate to the table produced by glht().

Best Answer

The difference between the ANOVA table and the lmer summary output is that the ANOVA table reports whether at least one level of an independent variable (IV) differs significantly from the others. In your example this was the case for the IV factor as well as for the interaction factor:experiment, but not for experiment. This type of table is what is usually reported when performing an ANOVA, which is nothing other than a regression with categorical predictor variables and a continuous outcome.

The lmer summary, on the other hand (and the same holds for any other linear model output), gives you the difference of each factor level (and each factor-level combination) relative to a baseline, the intercept, and tests whether that difference is significant (the test on the intercept itself simply checks whether its estimate differs from zero). By default R sorts factor levels alphabetically and takes the first level of each IV as the baseline, so the estimate for the intercept is the mean of that reference group. All of this can be quickly illustrated using the PlantGrowth dataset (?PlantGrowth) as a simple example:
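You can inspect this default treatment (dummy) coding directly before fitting anything; a minimal sketch using the PlantGrowth data:

```r
# Show the default treatment coding R uses for the PlantGrowth 'group' factor.
data(PlantGrowth)
levels(PlantGrowth$group)     # alphabetical order: "ctrl" "trt1" "trt2"
contrasts(PlantGrowth$group)  # first level (ctrl) is the all-zero baseline
# The design matrix has an intercept column plus one dummy per non-baseline level:
head(model.matrix(weight ~ group, data = PlantGrowth))
```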

Fit a linear model (regression):

fit <- lm(weight ~ group, data = PlantGrowth)
summary(fit)

Call:
lm(formula = weight ~ group, data = PlantGrowth)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.0710 -0.4180 -0.0060  0.2627  1.3690 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   5.0320     0.1971  25.527   <2e-16 ***
grouptrt1    -0.3710     0.2788  -1.331   0.1944    
grouptrt2     0.4940     0.2788   1.772   0.0877 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6234 on 27 degrees of freedom
Multiple R-squared:  0.2641,    Adjusted R-squared:  0.2096 
F-statistic: 4.846 on 2 and 27 DF,  p-value: 0.01591

Now calculate the means for each group:

require(plyr)
ddply(PlantGrowth,.(group), summarise, MEAN=mean(weight))

  group  MEAN
1  ctrl 5.032
2  trt1 4.661
3  trt2 5.526

Finally, using the intercept (the mean of ctrl) and the estimates, you can calculate the mean values of the other groups:

5.032-0.3710
[1] 4.661

5.032+0.4940
[1] 5.526
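The choice of baseline is just a convention: you can change the reference level with relevel() and refit, and the intercept then becomes the mean of the new reference group (a quick sketch, again using PlantGrowth):

```r
# Make trt1 the reference level instead of ctrl and refit the model.
data(PlantGrowth)
PlantGrowth$group <- relevel(PlantGrowth$group, ref = "trt1")
fit_releveled <- lm(weight ~ group, data = PlantGrowth)
coef(fit_releveled)
# The intercept is now the mean of trt1 (4.661), and the remaining
# coefficients are differences from trt1 rather than from ctrl.
```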

Since regression and ANOVA are the same underlying model (except that regression uses continuous predictor variables and compares slopes, whereas ANOVA uses categorical predictor variables and compares means), we can simply display the ANOVA table using the summary.aov() function:

summary.aov(fit)

        Df Sum Sq Mean Sq F value Pr(>F)  
group        2  3.766  1.8832   4.846 0.0159 *
Residuals   27 10.492  0.3886                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Which is the same as:

fit2 <- aov(weight ~ group, data = PlantGrowth)
summary(fit2)

            Df Sum Sq Mean Sq F value Pr(>F)  
group        2  3.766  1.8832   4.846 0.0159 *
Residuals   27 10.492  0.3886                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

From there you can go on and do multiple mean comparisons (or post-hoc tests) to see which levels of your factor(s) are significantly different, e.g.:

TukeyHSD(fit2)

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = weight ~ group, data = PlantGrowth)

$group
            diff        lwr       upr     p adj
trt1-ctrl -0.371 -1.0622161 0.3202161 0.3908711
trt2-ctrl  0.494 -0.1972161 1.1852161 0.1979960
trt2-trt1  0.865  0.1737839 1.5562161 0.0120064
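The glht() call from your question can be run on this aov fit as well. Assuming the multcomp package is installed, the following sketch produces the same pairwise comparisons; the estimates match the diff column of TukeyHSD(), and the adjusted p-values agree closely (glht() uses a multivariate t distribution rather than the studentized range, so they can differ slightly):

```r
# The same pairwise comparisons via multcomp::glht(), mirroring the
# glht() call from the question (assumes multcomp is installed).
library(multcomp)
fit2 <- aov(weight ~ group, data = PlantGrowth)
summary(glht(fit2, linfct = mcp(group = "Tukey")))
```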