Solved – How to calculate Tukey-adjusted p-values for emmeans pairwise comparisons

lsmeans, p-value, tukey-hsd-test

I would like to calculate Tukey-adjusted p-values for emmeans pairwise comparisons. I know that these can be obtained directly with functions like pairs() and CLD(). However, when there are three leading zeroes in the p-value, only one digit is displayed. I recognize that in this case the significance of the test statistic is not in question, but I like to consistently report p-values with two digits in papers.

In attempting to calculate a more precise p-value based on the output of pairs(), I have not been able to figure out how to do this when there are multiple comparisons. It's straightforward when there is just one comparison:

> pairs(emmeans(model1, "harvest"), details = T)  
 contrast              estimate        SE df t.ratio p.value  
 Spring - Spring/Fall 0.4521333 0.1006861 15   4.491  0.0004  

> 2*pt(4.491, 15, lower=FALSE)  
[1] 0.0004309609

However, when there are multiple comparisons, I can't figure out how to calculate the appropriate Tukey-adjusted p-value. An unadjusted p-value is too low and an adjusted p-value is too high (using the contrast between factor levels 15 and 61 as an example).

> pairs(emmeans(model2, "row.space"))  
 contrast  estimate        SE df t.ratio p.value  
 15 - 30  0.1979111 0.1034653 62   1.913  0.1436  
 15 - 61  0.4199143 0.1034653 62   4.059  0.0004  
 30 - 61  0.2220032 0.1034653 62   2.146  0.0890  

P value adjustment: tukey method for comparing a family of 3 estimates  


> 2*pt(4.059, 62, lower=FALSE)  # too low  
[1] 0.0001405038  

> 2*ptukey(4.059, nmeans = 3, df = 62, lower=FALSE)  # too high  
[1] 0.03053126

How should I calculate the Tukey-adjusted p-value so that I can obtain more digits?

Best Answer

Tukey-adjusted P values are computed using the ptukey() function in R (Studentized range distribution). This is one of the toughest distributions to compute, among those in common use. The help page for ptukey states:

Note

A Legendre 16-point formula is used for the integral of ptukey. The computations are relatively expensive, especially for qtukey which uses a simple secant method for finding the inverse of ptukey. qtukey will be accurate to the 4th decimal place.

This means that the probabilities are computed using numerical integration. I really doubt that a P value way out in the tail can be computed to many digits' accuracy. I suggest that if you have a habit of reporting P values to 2 digits accuracy, it is a habit worth breaking. Nobody needs to distinguish between P = .00022 and P = .00024, even if 2 digits accuracy is believable.

However, as is suggested in a comment, if you do summary(pairs(...)) or as.data.frame(pairs(...)), you get something that inherits from data.frame, and every P value can be extracted to 9 or so digits. But it is false precision.
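For reference, a minimal sketch of that extraction. It assumes the emmeans package is installed and uses the built-in `warpbreaks` data with a one-factor model as a stand-in for the asker's `model2`; the object names `fit` and `pw` are illustrative only.

```r
library(emmeans)  # assumption: the emmeans package is installed

# Stand-in model on built-in data (not the asker's model2)
fit <- lm(breaks ~ tension, data = warpbreaks)

# as.data.frame() drops the print formatting; p.value is a plain numeric column
pw <- as.data.frame(pairs(emmeans(fit, "tension")))

# Show the adjusted p-values to two significant digits
print(data.frame(contrast = pw$contrast, p.value = signif(pw$p.value, 2)))
```

The rounding seen at the console comes from the print method, not from the stored values, so any formatting can be applied to `pw$p.value` afterwards.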

Also, here is the correct calculation of that P value:

> 1 - ptukey(4.059 * sqrt(2), 3, 62)
[1] 0.0004082944

Strictly speaking, the true value lies somewhere in a range, because the $t$ ratio shown is itself rounded to three decimals:

> 1 - ptukey(4.0595 * sqrt(2), 3, 62)
[1] 0.0004076112

> 1 - ptukey(4.0585 * sqrt(2), 3, 62)
[1] 0.0004089787

The $\sqrt{2}$ factor is needed because the Studentized range is standardized by the SE of a single mean rather than the SE of a difference of two means. No doubling of the tail probability is needed, because the Studentized range test is a one-sided test based on the maximum minus the minimum.
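To make the relationship concrete: in the simplest balanced case, with common standard deviation $s$ and $n$ observations per mean, $t = (\bar y_i - \bar y_j)/(s\sqrt{2/n})$ while the Studentized range uses $q = (\bar y_{\max} - \bar y_{\min})/(s/\sqrt{n})$, so $q = \sqrt{2}\,|t|$. The sketch below applies this to the three t ratios printed in the question. The helper name `tukey_p` is made up for illustration and is not part of emmeans.

```r
# Hypothetical helper (not an emmeans function):
# Tukey-adjusted p-value from a t ratio for a family of n_means estimates.
tukey_p <- function(t_ratio, n_means, df) {
  # sqrt(2) rescales the t ratio to the Studentized range scale;
  # the upper tail alone is the p-value, so there is no factor of 2.
  ptukey(abs(t_ratio) * sqrt(2), nmeans = n_means, df = df, lower.tail = FALSE)
}

# t ratios for 15 - 30, 15 - 61, 30 - 61 as printed in the question
t_ratios <- c(1.913, 4.059, 2.146)
p_adj <- tukey_p(t_ratios, n_means = 3, df = 62)
print(signif(p_adj, 3))
```

The results should agree with the printed p.value column (0.1436, 0.0004, 0.0890) up to the rounding of the t ratios, which is also the limit on how many digits are meaningful.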

Related Solutions

Solved – emmeans pairwise contrasts result in same output values for all

You have fitted an additive model - the fixed-effects part is condition + location. Therefore you have in fact specified that the differences for one factor are exactly the same at each level of the other factor. Since emmeans() summarizes a model, then, lo and behold, the results reflect what is specified.

If instead you include the interaction between condition and location in the model, then the emmeans() results will reflect the possibility that factor levels compare differently at levels of the other factor.
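A small illustration of the point, using the built-in `warpbreaks` data rather than the asker's data. The model formulas and the helper `wool_diff` are assumptions made for this example. It calls `predict()` directly to show the property of the fitted model that `emmeans()` would then summarize.

```r
# Built-in data with two factors: wool (A, B) and tension (L, M, H)
fit_add <- lm(breaks ~ wool + tension, data = warpbreaks)  # additive
fit_int <- lm(breaks ~ wool * tension, data = warpbreaks)  # with interaction

# One row per factor combination; wool varies fastest (A, B within each tension)
grid <- expand.grid(wool = levels(warpbreaks$wool),
                    tension = levels(warpbreaks$tension))

# Hypothetical helper: predicted A - B difference at each tension level
wool_diff <- function(fit) {
  pred <- predict(fit, newdata = grid)
  tapply(pred, grid$tension, function(p) p[1] - p[2])
}

d_add <- wool_diff(fit_add)  # identical at every tension level
d_int <- wool_diff(fit_int)  # free to differ across tension levels
print(d_add)
print(d_int)
```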

I recommend that people think more carefully about the models they are fitting. I think there is a tendency to rush forward without realizing what a crucial thing it is to get the model right.

Solved – Pairwise comparisons via emmeans

Your reviewer has a good point, because you have both between- and within-subject comparisons in your collection, and those have different standard errors.

But the other thing I notice is that there are a total of $2\times2\times3=12$ cell means, and the emmeans() call shown produces all ${12\choose2}=66$ possible comparisons. I truly wonder if all of those are really of interest. If they are, I suggest trying the "mvt" adjustment, which is the same idea as Tukey but is based on the multivariate $t$ distribution with the actual correlation structure in your model (which in this case is not the same as the correlation structure assumed by the Tukey method).

However, often when there is an interaction, people opt for "simple" comparisons -- in this case, comparing the levels of one factor while holding the other two fixed. Those are easily done via

emm <- emmeans(model, ~ A * B * C)
simp <- pairs(emm, simple = "each")
simp

This will yield 6 comparisons of the levels of A, 6 comparisons of the two levels of B, and 4 sets of 3 comparisons among the levels of C, for a total of 24 comparisons instead of 66. Moreover, the issues of Tukey being inappropriate go away, because each set of simple comparisons is homogeneous.
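The counts above can be checked with a few lines of arithmetic. This sketch assumes only the 2 x 2 x 3 layout described in the answer; the variable names are illustrative.

```r
# Number of levels of each factor, per the 2 x 2 x 3 design in the answer
n_levels <- c(A = 2, B = 2, C = 3)
n_cells <- prod(n_levels)  # 12 cell means

# All pairwise comparisons among the cell means
all_pairs <- choose(n_cells, 2)

# Simple comparisons: pairs of levels of one factor, repeated at every
# combination of the other two factors
simple_pairs <- sapply(names(n_levels), function(f) {
  choose(n_levels[[f]], 2) * n_cells / n_levels[[f]]
})

print(all_pairs)          # 66
print(simple_pairs)       # A: 6, B: 6, C: 12
print(sum(simple_pairs))  # 24
```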

Some additional comments:

While REML = FALSE is often a good idea for testing your model against other models, I recommend refitting your model with REML = TRUE before proceeding to post hoc comparisons, because the REML method reduces bias in the estimates.
Consider using meaningful names in place of A, B, and C. We're not doing a generic math problem here; we are trying to tell a useful story about an actual experiment. Meaningful names help everybody understand what you're talking about.
I suggest doing things in steps, as shown above, over trying to get every result you want in one R call. That makes it more natural to focus on particular results or go in different directions; e.g., summary(emm) shows the 12 cell means and pairs(emm, by = "C") could be used to compare the four A:B combinations at each level of C.
You might want to do a stronger multiplicity correction rather than a separate one for each of the comparisons. For example, test(simp[[1]], by = NULL, adjust = "mvt") puts all 6 of the A comparisons in one family and applies the multivariate $t$ adjustment. (The Tukey adjustment is completely inappropriate for that because it is not a set of pairwise comparisons; in fact, the software doesn't even allow that adjustment.)
