Solved – Design matrix contrast coding for model selection and ‘main effects’ vs. ‘simple main effects’ interpretation in linear mixed effects model (R/Matlab)

categorical-encodingcontrastsMATLABmixed modelr

My question is about contrast coding and planned contrasts in three-way interactions for a linear mixed effects model. Sample code is provided for R and Matlab as I can work in either one, but prefer Matlab.

I have an experiment with three categorical variables:

  1. Group (2 levels, between subjects)
  2. Condition A (2 levels, within subjects)
  3. Condition B (3 levels, within subjects)

The design is fully crossed (i.e. each subject is exposed to each level of B within each level of A) and the groups are balanced.

+---------+------+-------+----+----+
| Subject | Item | Group | A  | B  |
+---------+------+-------+----+----+
|       1 |    1 |     1 | A1 | B1 |
|       1 |    2 |     1 | A1 | B2 |
|       1 |    3 |     1 | A1 | B3 |
|       1 |    4 |     1 | A2 | B1 |
|       1 |    5 |     1 | A2 | B2 |
|       1 |    6 |     1 | A2 | B3 |
|       2 |    1 |     2 | A1 | B1 |
|       2 |    2 |     2 | A1 | B2 |
|       2 |    3 |     2 | A1 | B3 |
|       2 |    4 |     2 | A2 | B1 |
|       2 |    5 |     2 | A2 | B2 |
|       2 |    6 |     2 | A2 | B3 |
+---------+------+-------+----+----+

All predictor variables are coded as factors/categorical variables and ordered according to a priori hypotheses. I want to test the three-way interaction between Group, A, and B, and would like to compare B1, B2, and B3 at each level of A for each group. I fit the following model in Matlab:

lme = fitlme(data, 'respVar ~ 1 + Group*A*B + (1|Subject) + (1|Item)', 'FitMethod', 'REML', 'DummyVarCoding', 'effects', 'CheckHessian', true);

R equivalent:

library(lme4)    
contrasts(data$Group) = c(-0.5, 0.5)
    contrasts(data$A) = c(-0.5, 0.5)
    contrasts(data$B) <- cbind(c(1/2,0,-1/2), c(1/2, -1/2,0)) 

    lme = lmer(respVar ~ 1 + Group*A*B + (1|Subject) + (1|Item), control=lmerControl(optCtrl=list(maxfun=100000)), data=data)

This gives me the main effects of each parameter. However, I also want to see the simple main effects (i.e. the effect of each level of B within a fixed level of A for each group). Does it make sense to re-fit the model with treatment/dummy coding (the default in R and Matlab)? Do I then need to apply a Bonferroni correction for multiple comparisons?

Also, I am specifying a random effects structure using model selection with AIC, and the model selected differs (by one term) depending on whether I use effects coding or treatment coding. (The difference in AIC between both models with either coding method is ~2). If I want to report the results of both models, which type of coding should I use for model selection?

Best Answer

I thought I would explain what I ended up doing here in case it's helpful to anyone else.

  • Step 1: Fit the lme with effects coding

    library(MASS)
    library(lme4)
    library(psycholing)
    library(lmerTest)
    contrasts(data$Group) = contr.sum(2)
        contrasts(data$A) = contr.sum(2)
        contrasts(data$B) = contr.sum(3)
    
    lme = lmer(respVar ~ 1 + Group*A*B + (1|Subject) + (1|Item), control=lmerControl(optCtrl=list(maxfun=100000)), data=data)
    

    I performed model selection using sum coding and then tested the overall significance of each coefficient using anova from the lmerTest package:

    lmerTest::anova(lme)
    

    This gave me a significant Group x A x B three-way interaction.

  • Step 2: Switch to dummy coding and fit three models, with each level of B as the intercept.

    contrasts(data$Group) = contr.treatment(2)
        contrasts(data$A) = contr.treatment(2)
    contrasts(data$B) = contr.treatment(3)
    
    # N.b. these are the default contrasts in R. contrasts(data$B)
    #       B2    B3
    # B1     0     0
    # B2     1     0
    # B3     0     1
    
    lmeB1 = lmer(respVar ~ 1 + Group*A*B + (1|Subject) + (1|Item), control=lmerControl(optCtrl=list(maxfun=100000)), data=data) 
    b1sum = lmerTest::summary(lmeB1)
    
    relevel(data$B, "B2") 
    lmeB2 = lmer(respVar ~ 1 + Group*A*B + (1|Subject) + (1|Item), control=lmerControl(optCtrl=list(maxfun=100000)), data=data) 
    b2sum = lmerTest::summary(lmeB2)
    
    relevel(data$B, "B3") 
    lmeB3 = lmer(respVar ~ 1 + Group*A*B + (1|Subject) + (1|Item), control=lmerControl(optCtrl=list(maxfun=100000)), data=data) 
    b3sum = lmerTest::summary(lmeB3)
    
  • Step 3: Extract the contrasts of interest and apply a Bonferroni-Holm correction for multiple comparisons.

    # Test the contrasts: 
    #  1) Group1 A1 B1 vs. Group1 A1 B2/B3
    #  2) Group1 A1 B1/B2/B3 vs. Group2 A1 B1/B2/B3
    #  3) Group1 A1 B1/B2/B3 vs. Group1 A2 B1/B2/B3
    
    pvals = cbind("B1"=p.adjust(b1sum$coefficients[c(12, 11, 8, 3), 5], "holm"),  
                  "B2"=c(9,9,p.adjust(b2sum$coefficients[c(8, 3), 5],"holm")),  
              "B3"=c(9,9,p.adjust(b3sum$coefficients[c(8, 3), 5],"holm")))
    
    # Numbers correspond to the rows with the coefficients of interest in model$coefficients, column 5 contains the p-values.
    
    # Reference Level=Group1
    #                        B1           B2        B3
    # 1a) B2:A1     0.001707473 
    # 1b) B3:A1     0.027679733 
    # 2)  Group2:A2 0.016903682 0.0328017681 0.9451504
    # 3)  A2        0.127490731 0.0008424514 0.1002219
    

Note that I did this in R because I also included a fixed effect for participant gender, which I coded as c(0.5, -0.5) to centre the estimates on the mean of both (effectively "controlling for" gender). This is easier to do in R with the contrasts function: in MATLAB, it seems you have to specify the entire design matrix manually if you want to use something other than effects or dummy coding.

If you don't need custom contrasts, this whole process can be done much more easily in MATLAB by fitting the model with the default (dummy) variable coding:

lme = fitlme(data, 'respVar ~ 1 + Group*A*B + (1|Subject) + (1|Item)', 'FitMethod', 'REML', 'CheckHessian', true);

Then use coefTest to specific contrast matrices for your coefficients. The following gives me an F test for the contrast between my second and third coefficients---B2 and B3 in this case---with a Satterthwaite approximation for degrees of freedom. (See this reference for a discussion of significance testing for LMEs: https://doi.org/10.3758/s13428-016-0809-y)

[pval,F,DF1,DF2]=coefTest(lme, [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0; 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'DFMethod', 'Satterthwaite')
Related Question