I'm doing a two-factor ANOVA using the lmerTest
package. Each factor has multiple levels. When one (or more) of the effects are significant, I would like to do a post-hoc test to determine which of the levels differ from each other. Here, I set up the model as:
library(lmerTest)
model <- lmer(measure ~ factor * experiment + (1 | subject_id), data = data)
print(anova(model))
The output appears as follows:
Analysis of Variance Table of type 3 with Satterthwaite
approximation for degrees of freedom
Df Sum Sq Mean Sq F value Denom Pr(>F)
factor 3 2388.82 796.27 16.3140 9.999 0.0003527 ***
experiment 3 254.11 84.70 2.7689 30.000 0.0588323 .
factor:experiment 9 1301.40 144.60 2.9626 30.000 0.0121071 *
At this point, I can inspect the table and see which effects are significant. However, when I look at the summary of the model:
summary(model)
I get a much more detailed output (truncated to the relevant portion for clarity):
t value Pr(>|t|)
(Intercept) -5.600 8.99e-06 ***
factorLevel1 0.289 0.77522
factorLevel2 2.855 0.00871 **
factorLevel3 -6.535 9.00e-07 ***
experimentSession1 -0.747 0.46086
experimentSession2 -0.825 0.41596
experimentSession3 0.317 0.75354
factorLevel1:experimentSession1 -1.297 0.20454
factorLevel2:experimentSession1 -0.903 0.37376
factorLevel3:experimentSession1 3.025 0.00506 **
factorLevel1:experimentSession2 0.591 0.55917
factorLevel2:experimentSession2 -0.777 0.44341
factorLevel3:experimentSession2 3.027 0.00504 **
factorLevel1:experimentSession3 -0.123 0.90269
factorLevel2:experimentSession3 -1.060 0.29770
How do I interpret these values? Is this telling me that the coefficient for factorLevel2 is significantly different from 0? If I then do a multiple comparison of means with glht() from the multcomp package:
library(multcomp)
print(summary(glht(model, linfct = mcp(experiment = "Tukey", factor = "Tukey"))))
I get the following output:
experiment: Session1 - Session0 == 0 -0.747 0.9829
experiment: Session2 - Session0 == 0 -0.825 0.9718
experiment: Session3 - Session0 == 0 0.317 0.9999
experiment: Session2 - Session1 == 0 -0.078 1.0000
experiment: Session3 - Session1 == 0 1.064 0.9079
experiment: Session3 - Session2 == 0 1.142 0.8759
factor: Level2 - Level1 == 0 0.289 0.9999
factor: Level3 - Level1 == 0 2.855 0.0425 *
factor: Level4 - Level1 == 0 -6.535 <0.001 ***
factor: Level3 - Level2 == 0 1.517 0.6542
factor: Level4 - Level2 == 0 -5.395 <0.001 ***
factor: Level4 - Level3 == 0 -8.341 <0.001 ***
This is more understandable, as it tells me which pairwise levels are significantly different. But I'm unsure how to interpret the table produced by summary() and how those numbers relate to the table produced by glht().
Best Answer
The difference between the ANOVA table and the lmer summary output is that the ANOVA table reports whether at least one of the levels within your independent variable (IV) is significantly different. In your example this was the case for the IV factor as well as for the interaction factor:experiment, but not for experiment. This type of table is usually reported when performing an ANOVA, which is nothing other than a regression with categorical predictor variables and a continuous outcome.

The lmer summary output, on the other hand (as for any other linear model output), gives you the difference of each factor level (and factor-level combination) compared to a baseline, the intercept, and tests whether this difference is significant (the test on the intercept itself simply checks whether the estimate differs from zero). By default R sorts factor levels alphabetically and takes the first level of the first IV as the intercept; the estimate for the intercept is the mean of that first level. All this can be quickly illustrated using the PlantGrowth dataset (?PlantGrowth) as a simple example.

Fit a linear model (regression):
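For example, a minimal sketch using base R's built-in PlantGrowth data (the object name lm.fit is illustrative):

```r
# Regress plant weight on treatment group (ctrl, trt1, trt2)
lm.fit <- lm(weight ~ group, data = PlantGrowth)
summary(lm.fit)
# The intercept is the mean of the alphabetically first level ("ctrl");
# grouptrt1 and grouptrt2 are the differences of trt1 and trt2 from ctrl.
```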
Now calculate the means for each group:
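For instance, with aggregate():

```r
# Mean weight per treatment group
aggregate(weight ~ group, data = PlantGrowth, FUN = mean)
#   group weight
# 1  ctrl  5.032
# 2  trt1  4.661
# 3  trt2  5.526
```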
Finally, by using the intercept (the mean value of ctrl) and the estimates, you can calculate the mean values of the groups.

Since regression and ANOVA are the same thing (except that regression uses continuous predictor variables and compares slopes, whereas ANOVA uses categorical predictor variables and compares means), we can display the ANOVA table by using the summary.aov() function, which gives the same result as calling anova() on the fitted model:
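A sketch, continuing with the same illustrative lm.fit:

```r
lm.fit <- lm(weight ~ group, data = PlantGrowth)

# Recover each group mean from the intercept (ctrl mean) plus its estimate
coef(lm.fit)[["(Intercept)"]]                                # mean of ctrl
coef(lm.fit)[["(Intercept)"]] + coef(lm.fit)[["grouptrt1"]]  # mean of trt1
coef(lm.fit)[["(Intercept)"]] + coef(lm.fit)[["grouptrt2"]]  # mean of trt2

# ANOVA table for the same fit:
summary.aov(lm.fit)
# ...which is the same as:
anova(lm.fit)
```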
From there you can go on and do multiple mean comparisons (or post-hoc tests) to see which levels of your factor(s) are significantly different, e.g.:
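For example, Tukey's HSD with base R (the multcomp::glht() version, as used in the question, is shown commented out since it needs the multcomp package installed):

```r
# All pairwise comparisons of group means with familywise error control
TukeyHSD(aov(weight ~ group, data = PlantGrowth))

# Equivalent via multcomp, which also accepts lmer models:
# library(multcomp)
# plant.mod <- lm(weight ~ group, data = PlantGrowth)
# summary(glht(plant.mod, linfct = mcp(group = "Tukey")))
```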