I don't see too much to say here. I think you have done a good job.
There are several ways that people have discussed to test effects and get p-values for complicated mixed effects models. There is a good overview here. The best is to use computationally intensive methods (bootstrapping or Bayesian methods), but these are more advanced for most people. The second best (and most convenient) method is a likelihood ratio test, which is what `anova()` (technically `anova.merMod()`) is doing. It is important to use a likelihood ratio test only on models that were fit with full maximum likelihood, rather than restricted maximum likelihood (REML). On the other hand, for your final model, and for interpretation, you want to use REML. This is confusing for many people. In your output, we see that you fit your models with REML (because that option is `TRUE` by default in `lmer()`). That would make your test invalid; however, because this is such a common mistake, `anova.merMod()` has a `refit` argument which is `TRUE` by default, and you didn't change it. So the foresight of the package developers has saved you there.
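To make the refitting behavior concrete, here is a minimal sketch with simulated data (the data and variable names are hypothetical, not from the question):

```r
library(lme4)

set.seed(1)
# Hypothetical balanced two-factor within-subject design
d <- expand.grid(subject = factor(1:20), a = factor(1:2), b = factor(1:2))
d$y <- rnorm(nrow(d), mean = 10)

m.full    <- lmer(y ~ a*b   + (1 | subject), data = d)  # REML = TRUE by default
m.reduced <- lmer(y ~ a + b + (1 | subject), data = d)

# anova.merMod() refits both models with full maximum likelihood first
# (refit = TRUE is the default), so the likelihood ratio test below is
# valid even though the models above were fit with REML:
anova(m.reduced, m.full)
```

You could pass `refit = FALSE` to skip the refit, but for a likelihood ratio test of fixed effects you almost never want to.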
Regarding your strategy for unpacking the interaction, what you did is fine. Just bear in mind that the interaction uses all the data for its test. It is possible to have a significant interaction but have neither of the stratified tests be significant, which confuses some people. (It doesn't seem to have happened to you, though.)
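For instance, stratified follow-up tests might look like the sketch below. This is illustrative only: the contrast coding of `gng_cond_n` is an assumption, and `d_aggr` is the aggregated data set discussed later in this thread.

```r
library(lme4)

# Sketch only -- the coding of gng_cond_n is assumed, not known.
# After a significant timepoint_n:gng_cond_n interaction, test the
# timepoint effect separately within each go/no-go condition:
m.go   <- lmer(rating ~ timepoint_n + (1 | subjectID),
               data = subset(d_aggr, gng_cond_n > 0))
m.nogo <- lmer(rating ~ timepoint_n + (1 | subjectID),
               data = subset(d_aggr, gng_cond_n < 0))

# Each stratified model uses only part of the data, so both tests can
# be non-significant even when the full-data interaction test is not.
```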
Using Jake Westfall's EMS
function documented HERE, I estimated the variance components from the mixed model. The random variances above that were estimated to be 0 are indeed negative as derived by EMS.
cbind(rev(ans[c(grep("e", names(ans)), grep("s", names(ans)))]) /
      c(1, 2, 2, 2, 4, 4, 4, 5, 1))
s 69.838600
a:s 7.711213
b:s -1.423696
c:s -1.098723
a:b:s -1.498620
a:c:s -1.212590
b:c:s -1.244919
a:b:c:s -2.650683
e 922.447967
When running `lmer.4` on the aggregated dataset, `lmer` gives essentially identical results (the 3-way interaction differs slightly, but it is still very similar).
> lmer.4 <- lmer(rating ~ timepoint_n*att_cond_n*gng_cond_n + (1|subjectID) + (0 + timepoint_n | subjectID) +
    (0 + att_cond_n | subjectID) + (0 + gng_cond_n | subjectID) + (0 + timepoint_n:att_cond_n | subjectID) +
    (0 + timepoint_n:gng_cond_n | subjectID) + (0 + att_cond_n:gng_cond_n | subjectID),
    data = d_aggr, control = lmerControl(optCtrl = list(maxfun = 1e9)))
> summary(lmer.4)
Linear mixed model fit by REML
t-tests use Satterthwaite approximations to degrees of freedom ['lmerMod']
Formula: rating ~ timepoint_n * att_cond_n * gng_cond_n + (1 | subjectID) +
(0 + timepoint_n | subjectID) + (0 + att_cond_n | subjectID) +
(0 + gng_cond_n | subjectID) + (0 + timepoint_n:att_cond_n |
subjectID) + (0 + timepoint_n:gng_cond_n | subjectID) + (0 + att_cond_n:gng_cond_n | subjectID)
Data: d_aggr
Control: lmerControl(optCtrl = list(maxfun = 1e+09))
REML criterion at convergence: 2438.7
Scaled residuals:
Min 1Q Median 3Q Max
-4.0726 -0.1788 -0.0138 0.2457 4.3407
Random effects:
Groups Name Variance Std.Dev.
subjectID (Intercept) 2.850e+02 1.688e+01
subjectID.1 timepoint_n 3.651e+01 6.042e+00
subjectID.2 att_cond_n 2.257e-11 4.750e-06
subjectID.3 gng_cond_n 1.269e+00 1.126e+00
subjectID.4 timepoint_n:att_cond_n 0.000e+00 0.000e+00
subjectID.5 timepoint_n:gng_cond_n 8.132e-01 9.018e-01
subjectID.6 att_cond_n:gng_cond_n 6.839e-01 8.270e-01
Residual 4.694e+01 6.851e+00
Number of obs: 328, groups: subjectID, 41
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 140.88293 2.66360 40.00000 52.892 < 2e-16 ***
timepoint_n -6.92561 1.01664 40.00000 -6.812 3.43e-08 ***
att_cond_n -0.13354 0.37828 120.00000 -0.353 0.72470
gng_cond_n 1.35854 0.41718 40.00000 3.256 0.00230 **
timepoint_n:att_cond_n -0.02744 0.37828 120.00000 -0.073 0.94230
timepoint_n:gng_cond_n 1.36220 0.40365 40.00000 3.375 0.00165 **
att_cond_n:gng_cond_n 1.00549 0.39972 40.00000 2.515 0.01601 *
timepoint_n:att_cond_n:gng_cond_n 0.98963 0.37828 120.00000 2.616 0.01004 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) tmpnt_ att_c_ gng_c_ tmpnt_n:t__ tmpnt_n:g__ a__:__
timepoint_n 0.000
att_cond_n 0.000 0.000
gng_cond_n 0.000 0.000 0.000
tmpnt_n:t__ 0.000 0.000 0.000 0.000
tmpnt_n:g__ 0.000 0.000 0.000 0.000 0.000
att_cnd_:__ 0.000 0.000 0.000 0.000 0.000 0.000
tmpn_:__:__ 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Therefore, it seems that because `lmer` is not able/allowed to estimate negative variance components, it cannot achieve the fit of `aov`, and the results differ. The difference disappears when the `lmer` representation of the model is run on the aggregated data, which eliminates the negatively estimated variance components.
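For completeness, an aggregated data set like `d_aggr` can be built from the trial-level data with `aggregate()`; a sketch, assuming (hypothetically) that the raw data frame is called `d_raw`:

```r
# Average the ratings within each subject x condition cell, so each
# subject contributes one observation per design cell. This removes the
# replicate-level strata in which the negative variance estimates arose.
d_aggr <- aggregate(rating ~ subjectID + timepoint_n + att_cond_n + gng_cond_n,
                    data = d_raw, FUN = mean)
```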
Best Answer
The colon `:` indicates a multiplicative term (an interaction). The asterisk `*` automatically includes both the interaction and the main effects: `a * b` is interpreted the same way as `a + b + a:b` (see https://stackoverflow.com/questions/40567421/vs-in-r-for-modelling). You typically should start with the asterisk `a * b` or the full expression `a + b + a:b`, but there may be occasional exceptions -- see Including the interaction but not the main effects in a model.
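A quick check of this equivalence on a small simulated example (the data here are made up purely for illustration):

```r
set.seed(42)
d <- data.frame(a = factor(rep(1:2, each = 10)),
                b = factor(rep(1:2, times = 10)),
                y = rnorm(20))

m1 <- lm(y ~ a*b,         data = d)  # asterisk: main effects + interaction
m2 <- lm(y ~ a + b + a:b, data = d)  # explicit expansion

# The two model matrices span the same terms, so the fits are identical:
all.equal(coef(m1), coef(m2))
```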