Solved – Post-hoc tests on linear mixed model give mixed results.

mixed-model, post-hoc, r, tukey-hsd-test

I am quite new to R so apologies if I fail to ask properly. I have done a test comparing bat species richness in five habitats as assessed by three methods. I used a linear mixed model in lme4 and got habitat, method and the interaction between the two as significant, with the random effects explaining little variation.
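For reference, the kind of lme4 fit described might look like the sketch below. The random-effect grouping factor, here called `Site`, is an assumption on my part, since the question doesn't say what the random effects are:

```r
library(lme4)

# Linear mixed model: fixed effects for Habitat, Method and their
# interaction; a random intercept for Site (hypothetical grouping factor)
LMM.richness <- lmer(Richness ~ Habitat * Method + (1 | Site),
                     data = diversity)
anova(LMM.richness)  # F-tests for the fixed effects
```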

I then ran Tukey's post hoc tests as pairwise comparisons in three ways:

Firstly in lsmeans:

lsmeans(LMM.richness, pairwise~Habitat*Method, adjust="tukey")

Then in ‘agricolae’:

tx <- with(diversity, interaction(Method, Habitat))
amod <- aov(Richness ~ tx, data=diversity)
library(agricolae)
interaction <- HSD.test(amod, "tx", group=TRUE)
interaction

Then with glht in 'multcomp':

summary(glht(LMM.richness, linfct=mcp(Habitat="Tukey")))
summary(glht(LMM.richness, linfct=mcp(Method="Tukey")))

tuk <- glht(amod, linfct = mcp(tx = "Tukey"))
summary(tuk)          # standard display
tuk.cld <- cld(tuk)   # letter-based display
opar <- par(mai=c(1,1,1.5,1))
par(mfrow=c(1,1))
plot(tuk.cld)
par(opar)

I got somewhat different levels of significance from each method, with glht giving me the greatest number of significant results and lsmeans the fewest. When I ran this on a generalized linear model of abundance data from the same study, glht was instead the most conservative. All the results from all packages make sense based on the graphs of the data.

Can anyone tell me whether there are underlying reasons why these tests might be more or less conservative, whether I have failed to specify anything correctly, or whether any of these tests are unsuitable for linear mixed models? It seems to me from the code that the 'agricolae' approach, and the glht call on amod, fit their own aov model rather than the mixed model I specified, but I found this code suggested as post-hoc tests on forums.

Best Answer

I just saw this, hence the very late answer. I can't speak for agricolae, but in lsmeans you did comparisons of all 15 means for the combinations of the two factors - that's 15*14/2 = 105 comparisons altogether - whereas in multcomp you requested two much smaller sets of comparisons: the 10 pairwise comparisons of the 5 habitat means and the 3 pairwise comparisons of the 3 method means. Since the lsmeans tests cover a MUCH larger family of comparisons, and you use an adjustment for multiplicity (Tukey), they are bound to be much more conservative. If you had done separate lsmeans calls with pairwise~Habitat and pairwise~Method, you would have gotten comparable results, though probably a bit different if the data are unbalanced.
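The separate calls suggested here would look something like the sketch below, using the lsmeans interface from the question (in current R the lsmeans package has been superseded by emmeans, which accepts the same syntax):

```r
library(lsmeans)

# Pairwise comparisons within each factor separately:
# 10 comparisons among the 5 habitat means, averaged over Method ...
lsmeans(LMM.richness, pairwise ~ Habitat, adjust = "tukey")

# ... and 3 comparisons among the 3 method means, averaged over Habitat,
# so each Tukey adjustment covers the same small family as the
# corresponding glht()/mcp() call
lsmeans(LMM.richness, pairwise ~ Method, adjust = "tukey")
```

This keeps each Tukey adjustment scoped to a family of 10 or 3 tests rather than 105, which is why the results become comparable to the multcomp output.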