Solved – Which multiple comparison method to use for a lmer model: lsmeans or glht

Tags: lsmeans, multiple-comparisons, post-hoc, r, repeated-measures

I'm analyzing a data set using a mixed-effects model with one fixed effect (condition) and two random effects (participant, due to the within-subject design, and pair). The model was fit with the lme4 package: exp.model <- lmer(outcome ~ condition + (1|participant) + (1|pair), data = exp).

Next, I performed a likelihood-ratio test of this model against the model without the fixed effect (condition) and found a significant difference. There are 3 conditions in my data set, so I want to run multiple comparisons, but I am not sure which method to use. I found a number of similar questions on Cross Validated and other forums, but I am still quite confused.
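In R, that model comparison can be sketched as follows (a hedged sketch: it assumes lme4 is installed and that exp is the data frame from the question; note that anova() refits both models with ML before comparing them):

```r
library(lme4)

# Full model with the fixed effect of interest
exp.model <- lmer(outcome ~ condition + (1 | participant) + (1 | pair),
                  data = exp, REML = FALSE)

# Reduced model without the fixed effect
exp.null <- lmer(outcome ~ 1 + (1 | participant) + (1 | pair),
                 data = exp, REML = FALSE)

# Likelihood-ratio (chi-square) test of the condition effect
anova(exp.null, exp.model)
```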

From what I've seen, people have suggested using

1. The lsmeans package – lsmeans(exp.model, pairwise ~ condition), which gives me the following output:

$lsmeans
 condition     lsmean         SE    df  lower.CL  upper.CL
 Condition1 0.6538060 0.03272705 47.98 0.5880030 0.7196089
 Condition2 0.7027413 0.03272705 47.98 0.6369384 0.7685443
 Condition3 0.7580522 0.03272705 47.98 0.6922493 0.8238552

Confidence level used: 0.95 

$contrasts
 contrast                   estimate         SE    df t.ratio p.value
 Condition1 - Condition2 -0.04893538 0.03813262 62.07  -1.283  0.4099
 Condition1 - Condition3 -0.10424628 0.03813262 62.07  -2.734  0.0219
 Condition2 - Condition3 -0.05531090 0.03813262 62.07  -1.450  0.3217

P value adjustment: tukey method for comparing a family of 3 estimates 

2. The multcomp package, in two different ways – using mcp: glht(exp.model, mcp(condition = "Tukey")), resulting in

     Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: lmer(formula = outcome ~ condition + (1 | participant) + (1 | pair), 
    data = exp, REML = FALSE)

Linear Hypotheses:
                             Estimate Std. Error z value Pr(>|z|)  
Condition2 - Condition1 == 0  0.04894    0.03749   1.305    0.392  
Condition3 - Condition1 == 0  0.10425    0.03749   2.781    0.015 *
Condition3 - Condition2 == 0  0.05531    0.03749   1.475    0.303  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

and using lsm: glht(exp.model, lsm(pairwise ~ condition)), resulting in

Note: df set to 62

     Simultaneous Tests for General Linear Hypotheses

Fit: lmer(formula = outcome ~ condition + (1 | participant) + (1 | pair), 
    data = exp, REML = FALSE)

Linear Hypotheses:
                             Estimate Std. Error t value Pr(>|t|)  
Condition1 - Condition2 == 0 -0.04894    0.03749  -1.305   0.3977  
Condition1 - Condition3 == 0 -0.10425    0.03749  -2.781   0.0195 *
Condition2 - Condition3 == 0 -0.05531    0.03749  -1.475   0.3098  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

As you can see, the methods give different results. This is my first time working with R and statistics, so something might be going wrong, but I wouldn't know where. My questions are:

1. What are the differences between the presented methods? I read in an answer to a related question that it's about the degrees of freedom (lsmeans vs. glht).
2. Are there rules or recommendations for when to use which one, i.e., method 1 is good for this type of data set/model, etc.? Which result should I report? Without knowing better, I'd probably just report the highest p-value I got to play it safe, but it would be nice to have a better reason.

Thanks

Best Answer

Not a complete answer...

The difference between glht(myfit, mcp(myfactor="Tukey")) and the two other methods is that this one uses a "z" statistic (normal distribution), whereas the others use a "t" statistic (Student distribution). A "z" statistic is the same as a "t" statistic with infinite degrees of freedom. This is an asymptotic method: it yields smaller p-values and shorter confidence intervals than the others. When the dataset is small, those p-values can be too small and the confidence intervals too short.
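The size of this asymptotic effect is easy to see in base R: take the Condition3 - Condition1 statistic of about 2.78 from the output above and compute two-sided p-values under the normal distribution and under a t distribution with 62 degrees of freedom (these are unadjusted, single-comparison p-values, so they illustrate the z-vs-t gap but are not the Tukey-adjusted values printed above):

```r
stat <- 2.781

# "z" statistic: normal distribution (asymptotic)
2 * pnorm(-abs(stat))

# "t" statistic with 62 degrees of freedom: a somewhat larger p-value,
# because the Student distribution has heavier tails
2 * pt(-abs(stat), df = 62)
```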

When I run lsmeans(myfit, pairwise~myfactor) the following message appears:

Loading required namespace: pbkrtest

This means that lsmeans (for an lmer model) uses the pbkrtest package, which implements the Kenward-Roger method for the degrees of freedom of the "t" statistic. This method aims to provide better p-values and confidence intervals than the asymptotic one (there is no difference when the degrees of freedom are large).
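If you want to confirm this on your own model, lsmeans can (as far as I recall; treat the option name as an assumption) be told to skip pbkrtest, after which it falls back to asymptotic inference and should line up closely with the mcp-based glht results:

```r
library(lsmeans)

# Assumed option: disable the Kenward-Roger computation so that
# lsmeans uses asymptotic ("z"-like) inference instead
lsm.options(disable.pbkrtest = TRUE)
lsmeans(exp.model, pairwise ~ condition)
```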

Now, about the difference between lsmeans(myfit, pairwise~myfactor)$contrasts and glht(myfit, lsm(pairwise~myfactor)): I have just run some tests, and my observations are the following:

  • lsm is an interface between the lsmeans package and the multcomp package (see ?lsm)

  • for a balanced design there's no difference between the results

  • for an unbalanced design, I observed small differences between the results (in the standard errors and the t ratios)

Unfortunately, I do not know the cause of these differences. It looks like lsm calls lsmeans only to get the linear-hypotheses matrix and the degrees of freedom, but lsmeans uses a different way to calculate the standard errors.