Solved – Output of fixed effects summary in lmerTest in R and post-hoc tests

Tags: lme4-nlme, post-hoc, r

I'm doing a two-factor ANOVA using the lmerTest package. Each factor has multiple levels. When one (or more) of the effects is significant, I would like to run a post-hoc test to determine which levels differ from each other. I set up the model as:

library(lmerTest)
model <- lmer(measure ~ factor * experiment + (1 | subject_id), data = data)
print(anova(model))

The output appears as follows:

Analysis of Variance Table of type 3  with  Satterthwaite 
approximation for degrees of freedom
                  Df  Sum Sq Mean Sq F value  Denom    Pr(>F)    
factor             3 2388.82  796.27 16.3140  9.999 0.0003527 ***
experiment         3  254.11   84.70  2.7689 30.000 0.0588323 .  
factor:experiment  9 1301.40  144.60  2.9626 30.000 0.0121071 *  

At this point, I can inspect the table and see which effects are significant. However, when I look at the summary of the model:

summary(model)

I get a much more detailed output (truncated to the relevant portion for clarity):

                                t value Pr(>|t|)    
(Intercept)                      -5.600 8.99e-06 ***
factorLevel1                      0.289  0.77522    
factorLevel2                      2.855  0.00871 ** 
factorLevel3                     -6.535 9.00e-07 ***
experimentSession1               -0.747  0.46086    
experimentSession2               -0.825  0.41596    
experimentSession3                0.317  0.75354    
factorLevel1:experimentSession1  -1.297  0.20454    
factorLevel2:experimentSession1  -0.903  0.37376    
factorLevel3:experimentSession1   3.025  0.00506 ** 
factorLevel1:experimentSession2   0.591  0.55917    
factorLevel2:experimentSession2  -0.777  0.44341    
factorLevel3:experimentSession2   3.027  0.00504 ** 
factorLevel1:experimentSession3  -0.123  0.90269    
factorLevel2:experimentSession3  -1.060  0.29770    

How do I interpret these values? Is this telling me that the coefficient for factorLevel2 is significantly different from 0? If I then do a multiple comparison of means:

library(multcomp)
print(summary(glht(model, linfct=mcp(experiment="Tukey", factor="Tukey"))))

I get the following output:

experiment: Session1 - Session0 == 0   -0.747   0.9829    
experiment: Session2 - Session0 == 0   -0.825   0.9718    
experiment: Session3 - Session0 == 0    0.317   0.9999    
experiment: Session2 - Session1 == 0   -0.078   1.0000    
experiment: Session3 - Session1 == 0    1.064   0.9079    
experiment: Session3 - Session2 == 0    1.142   0.8759    
factor: Level2 - Level1 == 0            0.289   0.9999    
factor: Level3 - Level1 == 0            2.855   0.0425 *  
factor: Level4 - Level1 == 0           -6.535   <0.001 ***
factor: Level3 - Level2 == 0            1.517   0.6542    
factor: Level4 - Level2 == 0           -5.395   <0.001 ***
factor: Level4 - Level3 == 0           -8.341   <0.001 ***

This is more understandable, as it tells me which pairs of levels differ significantly. But I'm unsure how to interpret the table produced by summary() and how its numbers relate to the table produced by glht().

Best Answer

The difference between the ANOVA table and the lmer summary output is that the ANOVA table reports whether at least one level of an independent variable (IV) differs significantly from the others. In your example this was the case for the IV factor as well as for the interaction factor:experiment, but not for experiment. This type of table is what is usually reported when performing an ANOVA, which is nothing other than a regression with categorical predictor variables and a continuous outcome.

The lmer summary, on the other hand (and the same holds for any other linear model output), gives you the difference of each factor level (and each factor-level combination) relative to a baseline, the intercept, and tests whether that difference is significant (the test on the intercept itself simply checks whether its estimate differs from zero). By default R sorts factor levels alphabetically and takes the first level of each IV as the baseline, so the estimate for the intercept is the mean of that reference group. All of this can be quickly illustrated using the PlantGrowth dataset (?PlantGrowth) as a simple example:
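You can inspect this default treatment (dummy) coding directly before fitting anything; a minimal sketch using the PlantGrowth data:

```r
# Show the default treatment coding R uses for the PlantGrowth 'group' factor.
data(PlantGrowth)
levels(PlantGrowth$group)     # alphabetical order: "ctrl" "trt1" "trt2"
contrasts(PlantGrowth$group)  # first level (ctrl) is the all-zero baseline
# The design matrix has an intercept column plus one dummy per non-baseline level:
head(model.matrix(weight ~ group, data = PlantGrowth))
```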

Fit a linear model (regression):

fit <- lm(weight ~ group, data = PlantGrowth)
summary(fit)

Call:
lm(formula = weight ~ group, data = PlantGrowth)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.0710 -0.4180 -0.0060  0.2627  1.3690 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   5.0320     0.1971  25.527   <2e-16 ***
grouptrt1    -0.3710     0.2788  -1.331   0.1944    
grouptrt2     0.4940     0.2788   1.772   0.0877 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6234 on 27 degrees of freedom
Multiple R-squared:  0.2641,    Adjusted R-squared:  0.2096 
F-statistic: 4.846 on 2 and 27 DF,  p-value: 0.01591

Now calculate the means for each group:

require(plyr)
ddply(PlantGrowth,.(group), summarise, MEAN=mean(weight))

  group  MEAN
1  ctrl 5.032
2  trt1 4.661
3  trt2 5.526

Finally, using the intercept (the mean of ctrl) and the estimates, you can calculate the mean values of the other groups:

5.032-0.3710
[1] 4.661

5.032+0.4940
[1] 5.526
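The choice of baseline is just a convention: you can change the reference level with relevel() and refit, and the intercept then becomes the mean of the new reference group (a quick sketch, again using PlantGrowth):

```r
# Make trt1 the reference level instead of ctrl and refit the model.
data(PlantGrowth)
PlantGrowth$group <- relevel(PlantGrowth$group, ref = "trt1")
fit_releveled <- lm(weight ~ group, data = PlantGrowth)
coef(fit_releveled)
# The intercept is now the mean of trt1 (4.661), and the remaining
# coefficients are differences from trt1 rather than from ctrl.
```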

Since regression and ANOVA are the same underlying model (except that regression uses continuous predictor variables and compares slopes, whereas ANOVA uses categorical predictor variables and compares means), we can simply display the ANOVA table using the summary.aov() function:

summary.aov(fit)

        Df Sum Sq Mean Sq F value Pr(>F)  
group        2  3.766  1.8832   4.846 0.0159 *
Residuals   27 10.492  0.3886                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Which is the same as:

fit2 <- aov(weight ~ group, data = PlantGrowth)
summary(fit2)

            Df Sum Sq Mean Sq F value Pr(>F)  
group        2  3.766  1.8832   4.846 0.0159 *
Residuals   27 10.492  0.3886                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

From there you can go on and do multiple mean comparisons (or post-hoc tests) to see which levels of your factor(s) are significantly different, e.g.:

TukeyHSD(fit2)

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = weight ~ group, data = PlantGrowth)

$group
            diff        lwr       upr     p adj
trt1-ctrl -0.371 -1.0622161 0.3202161 0.3908711
trt2-ctrl  0.494 -0.1972161 1.1852161 0.1979960
trt2-trt1  0.865  0.1737839 1.5562161 0.0120064
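The glht() call from your question can be run on this aov fit as well. Assuming the multcomp package is installed, the following sketch produces the same pairwise comparisons; the estimates match the diff column of TukeyHSD(), and the adjusted p-values agree closely (glht() uses a multivariate t distribution rather than the studentized range, so they can differ slightly):

```r
# The same pairwise comparisons via multcomp::glht(), mirroring the
# glht() call from the question (assumes multcomp is installed).
library(multcomp)
fit2 <- aov(weight ~ group, data = PlantGrowth)
summary(glht(fit2, linfct = mcp(group = "Tukey")))
```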