I have a rookie question about emmeans in R.
I fit a complex model using lmer() with the following variables:
- A: a binary categorical predictor, within-subject
- B: a binary categorical predictor, within-subject
- C: a categorical predictor with 4 levels, between-subject
- X & Y: control variables of no interest, one categorical, one continuous.
The model is as follows:
fit1 <- lmer(rt ~ 1 + A*B*C + X + Y + (1+A*B|Subject))
Now I'm mostly interested in how the A*B interaction differs across different levels of C (i.e., is the interaction different across the four groups I have). I was trying to use emmeans to get to the bottom of this, and I have found some very useful threads here on CrossValidated, but I cannot seem to find one that I can generalize easily to my case.
Here's what I did: I created a new model with an interaction term (AB = A*B).
fit1b <- lmer(rt ~ 1 + A*C + B*C + AB*C + X + Y + (1+A*B|Subject))
Then used emmeans like this:
emms <- emmeans(fit1b, ~ AB*C)
contrast(emms, interaction = "pairwise")
This results in an output that seems to make sense, however, I'm really uncertain about whether this setup makes any sense to begin with. Essentially my goal is to be able to determine whether the A*B interaction in greater in group x compared to group y, while controlling for all the other stuff in the model.
Is this a good way of doing this? Is there an easier/nicer way to do this?
EDIT: I created a simulated data set – here it is: https://osf.io/4cr8x
A is actually congruency in a conflict task, and B is previous trial congruency that's why the first trials have no value in that column. C is the group variable, just like in the example above. X and Y are the controls, with X being trial number and Y being sex.
EDIT 2: And here's the exact code I ran on the simulated data:
library(lme4)
library(lmerTest)
library(emmeans)
Data <- read.csv("simdat.csv",header=TRUE, sep=",", na.strings="-999", dec=".", strip.white=TRUE)
Data$A <- as.factor(Data$A)
Data$B <- as.factor(Data$B)
Data$C <- as.factor(Data$C)
Data$Y <- as.factor(Data$Y)
Data$Subject <- as.factor(Data$subject)
fit1 <- lmer(rt ~ 1 + A*B*C + X + Y + (1|Subject), data = Data, verbose = 0, REML = F) #I simplified the random structure as the original wouldn't converge with the simulated data
interaction_term <- (as.numeric(levels(Data$A))[Data$A])*(as.numeric(levels(Data$B))[Data$B])
Data$AB <- as.factor(interaction_term)
fit2 <- lmer(rt ~ 1 + A*C + B*C + AB*C + X + Y + (1|Subject), data = Data, verbose = 0, REML = F)
emms <- emmeans(fit2, ~ AB*C)
contrast(emms, interaction = "pairwise")
Best Answer
It shouldn't be necessary to fit a separate model just to do the post-hoc comparisons you want. You had tried:
but you can get the same results from the original model using
by
variables judiciously:The
con1
results are the desired 1-d.f. interaction effects for each level ofC
(theby
factor is remembered). Then we compare them pairwise, no longer using theby
grouping. By default, a Tukey adjustment is made to the family of comparisons, but you may use a different method viaadjust
.