Solved – Design of matrix of contrasts in R

contrasts, linear model, matrix, mixed model, r

I am doing some post-hoc comparisons (in lme4, but here I'll just present a simple linear model), and I am having a hard time making sure that I am building the right matrix of contrasts to test differences between some combinations of factors.

Here is my model:

dataset <- data.frame(response=rnorm(1200), factorX=rep(c("1", "2"), 600), factor2=rep(c("A", "B", "C"), 400))

model <- lm(response ~ factorX*factor2,data = dataset)

In my dataset, I measured a response variable depending on two factors: factorX (which has 2 levels: 1 and 2, 1 being the reference level) and factor2 (which has 3 levels: A, B and C, A being the reference level).

I get this model output:

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)        0.02918    0.07183   0.406    0.685
factorX2          -0.03780    0.10158  -0.372    0.710
factor2B           0.02027    0.10158   0.200    0.842
factor2C          -0.10972    0.10158  -1.080    0.280
factorX2:factor2B  0.01409    0.14366   0.098    0.922
factorX2:factor2C  0.08925    0.14366   0.621    0.535

Question:

I want to know whether there is a significant difference between certain combinations of factor levels, so I built the following contrasts. Do they correspond to the intended comparisons?

c(0, 0, 0, 0, 0, 1) => factorX1_factor2A vs. factorX2_factor2C

c(0, 0, 1, 0, -1, 0) => factorX1_factor2B vs. factorX2_factor2B

c(0, -1, -1, 0, 0, 1) => factorX2_factor2B vs. factorX2_factor2C

c(0, 0, 0, 1, 0, 0) => factorX1_factor2A vs. factorX1_factor2C

c(0, 1, 0, 0, 0, 0) => factorX1_factor2A vs. factorX2_factor2A

Are these correct?

Thanks!

Best Answer

Your last 2 contrasts are right, but the first 3 are wrong.

We can verify this by figuring out the linear combinations of coefficients that give each group mean, and then constructing the desired contrasts by adding and subtracting these linear combinations to form new ones that test the hypotheses you're interested in.

Here are the linear combinations that give each group mean (note that I shortened "factor" to "fac" for both of the factor names):

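# Label each observation with its cell, e.g. "1A", "2C"
group <- paste0(dataset$facX, dataset$fac2)
# Average the design-matrix rows within each cell (rows in a cell are identical)
group <- aggregate(model.matrix(model) ~ group, FUN=mean)
rownames(group) <- group$group
# Drop the label column and print the result
(group <- group[,-1])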
#    (Intercept) facX2 fac2B fac2C facX2:fac2B facX2:fac2C
# 1A           1     0     0     0           0           0
# 1B           1     0     1     0           0           0
# 1C           1     0     0     1           0           0
# 2A           1     1     0     0           0           0
# 2B           1     1     1     0           1           0
# 2C           1     1     0     1           0           1
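As a sanity check on the table above, each row multiplied by the fitted coefficients should reproduce the observed mean of that cell, because the model with the interaction is saturated. The following is a minimal sketch, not part of the original answer: the seed, the renamed columns `facX` and `fac2`, and the helper names `cell`, `from_coefs` and `observed` are assumptions made for illustration.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

# Same design as in the question, with the factor names shortened
dataset <- data.frame(response = rnorm(1200),
                      facX = rep(c("1", "2"), 600),
                      fac2 = rep(c("A", "B", "C"), 400))
model <- lm(response ~ facX * fac2, data = dataset)

# Cell label for every observation, e.g. "1A", "2C"
cell <- paste0(dataset$facX, dataset$fac2)

# Keep one design-matrix row per cell and order the rows by cell label
X <- model.matrix(model)
group <- X[!duplicated(cell), ]
rownames(group) <- cell[!duplicated(cell)]
group <- group[order(rownames(group)), ]

# Cell means implied by the coefficients vs. cell means computed directly
from_coefs <- drop(group %*% coef(model))
observed <- sapply(split(dataset$response, cell), mean)
all.equal(from_coefs, observed[names(from_coefs)])
```

If the rows are the right linear combinations, the final comparison should report that the two sets of means agree up to numerical tolerance.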

So now we can construct the matrix of contrasts by adding and subtracting the rows of the matrix above:

rbind(group["1A",] - group["2C",],
      group["1B",] - group["2B",],
      group["2B",] - group["2C",],
      group["1A",] - group["1C",],
      group["1A",] - group["2A",])
#     (Intercept) facX2 fac2B fac2C facX2:fac2B facX2:fac2C
# 1A            0    -1     0    -1           0          -1
# 1B            0    -1     0     0          -1           0
# 2B            0     0     1    -1           1          -1
# 1A1           0     0     0    -1           0           0
# 1A2           0    -1     0     0           0           0

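As you can see, the last 2 rows match what you wrote in your question up to sign (flipping the sign only reverses the direction of the comparison), but the first 3 rows do not match. For example, c(0, 0, 0, 0, 0, 1) tests only the facX2:fac2C interaction coefficient, not the difference between the 1A and 2C group means.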
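Once the contrast matrix is built, each row can be tested against zero. The sketch below does this by hand with base R only, using the estimate K %*% coef(model) and its standard error from vcov(model). It is an illustration under stated assumptions rather than the answerer's own procedure: the seed, the matrix name `K`, its row labels and the result name `contrast_tests` are hypothetical, and the p-values are not adjusted for multiple comparisons. Packages such as multcomp provide this kind of test through glht(), which is not shown here.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

dataset <- data.frame(response = rnorm(1200),
                      facX = rep(c("1", "2"), 600),
                      fac2 = rep(c("A", "B", "C"), 400))
model <- lm(response ~ facX * fac2, data = dataset)

# Contrast matrix copied from the rows derived above
K <- rbind("1A - 2C" = c(0, -1, 0, -1,  0, -1),
           "1B - 2B" = c(0, -1, 0,  0, -1,  0),
           "2B - 2C" = c(0,  0, 1, -1,  1, -1),
           "1A - 1C" = c(0,  0, 0, -1,  0,  0),
           "1A - 2A" = c(0, -1, 0,  0,  0,  0))
colnames(K) <- names(coef(model))

# Estimated difference, its standard error, and an unadjusted t-test per row
estimate  <- drop(K %*% coef(model))
std_error <- sqrt(diag(K %*% vcov(model) %*% t(K)))
t_value   <- estimate / std_error
p_value   <- 2 * pt(-abs(t_value), df = df.residual(model))

contrast_tests <- cbind(estimate, std_error, t_value, p_value)
round(contrast_tests, 4)
```

Each estimate is the difference between the two cell means named in its row label, so it can be cross-checked against the raw group means.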
