I have a dataset with repeated measures data but I'm unsure of the specification of my model and subsequent contrasts:
In the dataset, participants responded to a measure (`Score`; continuous) at three different timepoints (`Time`; T1/T2/T3) that were not equally spaced (not sure if this is relevant). I'm informed that `Sex` and `Age` (older/younger) may have an impact on the outcome, so they should be included in the model. However, due to a relatively small sample size, and some subgroups/cells potentially being empty, I don't think it would make sense to look at complex interactions between these variables (e.g., `Time*Sex*Age`), but I would like to look at `Time*Sex` and `Time*Age`. So I think the following are fixed effects: `Time`, `Sex`, and `Age`, and `ID` would be a random effect and random slope.
So for issue 1: Could someone help me with the correct `lme4` syntax? My initial guess is: `mod <- lmer(Score ~ Time + Sex + Age + (1|ID), data=data)`. But I suspect this may be wrong because it doesn't include interactions and I'm not sure about the random aspects.
Then for issue 2: I'm hoping to use planned comparisons to look at differences in scores between T1 and T2. Then to answer a different question, I want to look at scores between T2 and T3, and T1 and T3.
I think the `emmeans` package could help with this, but I'm wondering about how to involve `Sex` and `Age` in the comparisons, if at all. I don't have specific hypotheses that involve these factors, but if they are presumed to have an impact on scores then it makes sense to include them somehow. Would this mean intending to run planned comparisons without them, but if either or both are found to be important in the `lmer` model, running some kind of post-hoc tests that feature these instead?
Any help would be really appreciated.
Best Answer
First, I would make sure that you have `Time` as a factor (not just a quantitative variable) in your model.

Second, I would suggest nailing down the issue of "potentially empty cells". Am I correct that you do at least have every subject measured at all 3 times? Or are there holes there? I'm hoping not, but if there are, that may not be a disaster.
I suggest doing something like `with(data, table(Sex, Age))` to see if there are missing combinations of `Sex` and `Age`. If so, you really can't do a sensible analysis of either factor in its own right, and you might as well combine them into one factor, say `data <- transform(data, Group = interaction(Sex, Age))`.
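The two checks described above (empty `Sex` by `Age` cells, and subjects missing a time point) can be sketched as follows. This is only an illustration on a small made-up data frame: the toy values are hypothetical, the data are assumed to be in long format with one row per subject and time point, and `drop = TRUE` is an addition here so that combinations which never occur do not linger as unused factor levels.

```r
# Hypothetical toy data in long format: no younger males, and subject 4 misses T3
data <- data.frame(
  ID   = factor(rep(1:4, each = 3)),
  Sex  = factor(rep(c("F", "F", "M", "M"), each = 3)),
  Age  = factor(rep(c("older", "younger", "older", "older"), each = 3),
                levels = c("older", "younger")),
  Time = factor(rep(c("T1", "T2", "T3"), times = 4))
)
data <- data[-12, ]   # drop subject 4's T3 row

# Count subjects (not rows) in each Sex x Age cell
cells <- with(unique(data[c("ID", "Sex", "Age")]), table(Sex, Age))
cells
any(cells == 0)       # is at least one combination empty?

# Number of distinct time points observed per subject; fewer than 3 is a hole
n_times <- with(data, tapply(Time, ID, function(x) length(unique(x))))
n_times[n_times < 3]

# Combine the two factors, dropping combinations that never occur
data <- transform(data, Group = interaction(Sex, Age, drop = TRUE))
levels(data$Group)
```

Counting unique subjects rather than rows keeps the cell counts from being inflated by the repeated measurements.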
Then I'd suggest fitting the model including all interactions, because you don't know which to reasonably omit. So that'd be `Sex*Age*Time + (1|ID)` or `Group*Time + (1|ID)` depending on the previous paragraph. It does seem that if each subject is measured at the three times, then `(1|ID)` is a reasonable thing to put in for the error term.

For post hoc analysis, I don't see much validity to trying to pursue a small number of planned comparisons, because it all seems pretty exploratory at this point. Maybe save that for the next study that is more carefully designed and powered, and includes subjects in all age and sex combinations. By the time you finish mentioning what times you want to compare, you have covered all pairwise comparisons. So I suggest something like:
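A minimal sketch of such a pair of statements, under stated assumptions: the toy data are simulated and purely hypothetical, the full model is stored as `mod`, and the exact `emmeans()` formula is a guess chosen to match the description (pairwise `Time` comparisons within each combination of the other factors). The two statements of interest are the last two lines; everything before them is only setup so the example runs on its own.

```r
library(lme4)     # lmer()
library(emmeans)  # emmeans(), pairs()

# Hypothetical balanced toy data: 24 subjects, each measured at T1, T2, T3
set.seed(1)
subj <- data.frame(
  ID  = factor(1:24),
  Sex = factor(rep(c("F", "M"), each = 12)),
  Age = factor(rep(c("older", "younger"), times = 12)),
  u   = rnorm(24, sd = 2)   # subject-specific intercept shift
)
# No shared columns, so merge() returns every subject crossed with every time
data <- merge(subj, data.frame(Time = factor(c("T1", "T2", "T3"))))
data$Score <- 50 + 2 * (as.integer(data$Time) - 1) + data$u + rnorm(nrow(data))

# Full model: all interactions, random intercept per subject
mod <- lmer(Score ~ Sex * Age * Time + (1 | ID), data = data)

EMM <- emmeans(mod, ~ Time | Sex * Age)  # estimated marginal means of Time per cell
pairs(EMM)                               # T1-T2, T1-T3, T2-T3 within each Sex/Age cell
```

If `Sex` and `Age` had been merged into a single `Group` factor, the analogous specification would presumably be `~ Time | Group`.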
The second statement will do all three pairwise comparisons of `Time` for each combination of the other factors.

It may be possible to do simpler comparisons if an anova (e.g., `car::Anova(mod)`) suggests that `Age` or `Sex` does not interact with `Time`. If so, you can fit a simpler model that excludes those interactions, and exclude that factor from the `EMM` specification; the results are then averaged over the levels of that factor.
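As an illustration of that last step, here is a hedged sketch assuming the anova table had suggested that `Age` does not interact with `Time`. The simulated data, the names `mod2` and `EMM2`, and the particular reduced formula are all hypothetical; other reduced models (for example one keeping a `Sex:Age` term) would be equally defensible.

```r
library(lme4)     # lmer()
library(emmeans)  # emmeans(), pairs()

# Hypothetical balanced toy data: 24 subjects, each measured at T1, T2, T3
set.seed(2)
subj <- data.frame(
  ID  = factor(1:24),
  Sex = factor(rep(c("F", "M"), each = 12)),
  Age = factor(rep(c("older", "younger"), times = 12)),
  u   = rnorm(24, sd = 2)   # subject-specific intercept shift
)
data <- merge(subj, data.frame(Time = factor(c("T1", "T2", "T3"))))
data$Score <- 50 + 2 * (as.integer(data$Time) - 1) + data$u + rnorm(nrow(data))

# Reduced model: Age stays as a main effect, but has no interaction with Time
mod2 <- lmer(Score ~ Sex * Time + Age + (1 | ID), data = data)

# Age is left out of the specification, so results are averaged over its levels
EMM2 <- emmeans(mod2, ~ Time | Sex)
pairs(EMM2)   # three Time comparisons within each Sex
```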