Multiple Comparisons – Performing Multiple Comparisons After Adjusting for Covariates

multiple regressionmultiple-comparisonsp-valuerregression

To describe this issue, we consider the following 2 models:

m1 <- lm(disp ~ drat, mtcars)
m2 <- lm(disp ~ drat + factor(cyl), mtcars)

drat is a continuous variable. cyl is a categorical variable with levels '4', '6', '8', and I treat '4' as the baseline, so there will be 2 dummy variables $D_1$ for cyl=6 & $D_2$ for cyl=8 respectively. The regression model is:

$$disp = \beta_0 + \beta_1 drat + \beta_{cyl=6} D_1 + \beta_{cyl=8} D_2 + \epsilon$$

I want to know if cyl is correlated to disp after controlling drat. I perform an F-test where the null hypothesis is

$$H_0: \beta_{cyl=6} = \beta_{cyl=8} = 0.$$

anova(m1, m2)

# Model 1: disp ~ drat
# Model 2: disp ~ drat + factor(cyl)
#   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
# 1     30 235995                                  
# 2     28  70246  2    165750 33.034 4.286e-08 ***

The p-value ($\approx$ 0) implies that we can reject the $H_0$ statement. Now I wonder which pairs of cyl have significant difference in disp. The following is the model summary table:

summary(m2)

# Coefficients:
#              Estimate Std. Error t value Pr(>|t|)    
# (Intercept)    266.28      97.32   2.736   0.0107 *  
# drat           -39.58      23.62  -1.676   0.1048    
# factor(cyl)6    58.97      26.79   2.201   0.0361 *  
# factor(cyl)8   214.65      28.33   7.578 2.96e-08 ***

I have 2 main questions:

  1. From the table above, the p-value of $\beta_\rm{cyl=6}$ is $0.036 < 0.05$. Could I interpret this value directly and conclude:

    The mean disp with cyl=6 is different from that with cyl=4 (baseline) after controlling drat.

    Or otherwise, should I adjust the p-value to avoid inflation of the type I error? E.g. multiply it by 3 (using Bonferroni correction) to get p-value $= 0.036 \times 3 = 0.108 > 0.05$, and hence conclude:

    The mean disp with cyl=6 is the same as that with cyl=4 (baseline) after controlling drat.

  2. How to perform the multiple comparisons on each pair of cyl after controlling drat? In R, pairwise.t.test() doesn't seem to handle this.


Best Answer

Regarding both pairwise comparisons and adjustment for multiple comparisons, doesn't emmeans solution work?

library(emmeans)
em1<-emmeans(m2, ~cyl) #this gives you estimated marginal means of disp at different levels of cyl from the model m2
em1
contrast(em1, "pairwise", simple="each", combine=T, adjust="sidak") 

#this gives you all pairwise comparisons with Sidak's multiple comparison adjustment.
#you can also choose other adjustments such as bonferroni but sidak is probably most commonly used
#You can also choose adjust="none" in which case p-values are not adjusted (and you see that the comparison values in the emmeans contrasts between cyl=4 and other groups are identical to the regression summary table).

Re:

From the table above, the p-value of $βcyl=6$ is 0.036<0.05 Could I interpret this value directly and conclude: "The mean disp with cyl=6 is different from that with cyl=4 (baseline) after controlling drat"

Yes, but for those values, no p-value adjustment for multiple comparisons has been applied. This comparison is actually no longer significant after applying sidak correction.

Related Question