One major benefit of mixed-effects models is that they don't assume independence among observations: observations within a unit or cluster can be correlated.
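For instance, under a simple random-intercept model the within-cluster (intraclass) correlation works out to

$$\operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}, \qquad j \neq k,$$

where $\sigma_b^2$ is the between-cluster (random-intercept) variance and $\sigma^2$ is the residual variance.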
This is covered concisely in "Modern Applied Statistics with S" (MASS) in the first section of Chapter 10, "Random and Mixed Effects". V&R walk through an example with gasoline data comparing ANOVA and `lme` in that section, so it's a good overview. The R function to use is `lme` in the `nlme` package.
The model formulation is based on Laird and Ware (1982), so you can refer to that as a primary source, although it's certainly not ideal as an introduction.
- Laird, N.M. and Ware, J.H. (1982) "Random-Effects Models for Longitudinal Data", Biometrics, 38, 963–974.
- Venables, W.N. and Ripley, B.D. (2002) "Modern Applied Statistics with S", 4th Edition, Springer-Verlag.
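The Laird–Ware formulation is usually written (this is the standard statement, not a quotation from either reference) as:

$$y_i = X_i \beta + Z_i b_i + \varepsilon_i, \qquad b_i \sim N(0, D), \qquad \varepsilon_i \sim N(0, \sigma^2 I),$$

where $y_i$ is the response vector for unit $i$, $X_i$ and $Z_i$ are the fixed- and random-effects design matrices, $\beta$ is the vector of fixed effects, and $b_i$ is the vector of unit-specific random effects (more general within-unit error covariances than $\sigma^2 I$ are also allowed).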
You can also have a look at the "Linear Mixed Models" (PDF) appendix to John Fox's "An R and S-PLUS Companion to Applied Regression". And this lecture by Roger Levy (PDF) discusses mixed effects models w.r.t. a multivariate normal distribution.
I think that `lmerTest` is getting it right and `ezANOVA` is getting it wrong in this case:
- the results from `lmerTest` agree with my intuition/understanding
- two different computations in `lmerTest` (Satterthwaite and Kenward-Roger) agree
- they also agree with `nlme::lme`
- when I run it, `ezANOVA` gives a warning, which I don't entirely understand, but which should not be disregarded ...
Re-running the example:
library(ez); library(lmerTest); library(nlme)
data(ANT)
ANT.2 <- subset(ANT, !error)
set.seed(101) ## for reproducibility
## add a random per-subject baseline shift to the reaction times
baseline.shift <- rnorm(length(unique(ANT.2$subnum)), 0, 50)
ANT.2$rt <- ANT.2$rt + baseline.shift[as.numeric(ANT.2$subnum)]
Figure out the experimental design:
with(ANT.2,table(subnum,group,direction))
So it looks like individuals (`subnum`) are placed in either the control or the treatment group, and each is tested for both directions -- i.e., direction can be tested within individuals (denominator df is large), but group and group:direction can only be tested among individuals:
(anova.ez <- ezANOVA(data = ANT.2, dv = .(rt), wid = .(subnum),
within = .(direction), between = .(group)))
## $ANOVA
## Effect DFn DFd F p p<.05 ges
## 2 group 1 18 2.4290721 0.13651174 0.1183150147
## 3 direction 1 18 0.9160571 0.35119193 0.0002852171
## 4 group:direction 1 18 4.9169156 0.03970473 * 0.0015289914
Here I get a warning: "collapsing data to cell means. *IF* the requested effects are a subset of the full design, you must use the `within_full` argument, else results may be inaccurate."
The denominator df look a little funky (all equal to 18): I think they should be larger for direction and group:direction, which can be tested independently (but would be smaller if you added `(direction|subnum)` to the model)?
# similarly with lmer and lmerTest::anova
model <- lmer(rt ~ group * direction + (1 | subnum), data = ANT.2)
lmerTest::anova(model)
## Df Sum Sq Mean Sq F value Denom Pr(>F)
## group 1 12065.7 12065.7 2.4310 18 0.1364
## direction 1 1952.2 1952.2 0.3948 5169 0.5298
## group:direction 1 11552.2 11552.2 2.3299 5169 0.1270
The `Df` column here refers to the numerator df; `Denom` (second-to-last) gives the estimated denominator df; they agree with the classical intuition. More importantly, we also get different answers for the F values ...
We can also double-check with Kenward-Roger (very slow because it involves refitting the model several times):
lmerTest::anova(model,ddf="Kenward-Roger")
The results are identical.
For this example `lme` (from the `nlme` package) actually does a perfectly good job of guessing the appropriate denominator df (the F and p-values are very slightly different):
model3 <- lme(rt ~ group * direction, random=~1|subnum, data = ANT.2)
anova(model3)[-1,]
## numDF denDF F-value p-value
## group 1 18 2.4334314 0.1362
## direction 1 5169 0.3937316 0.5304
## group:direction 1 5169 2.3298847 0.1270
If I fit an interaction between `direction` and `subnum`, the df for `direction` and `group:direction` are much smaller (I would have thought they would be 18, but maybe I'm getting something wrong):
model2 <- lmer(rt ~ group * direction + (direction | subnum), data = ANT.2)
lmerTest::anova(model2)
## Df Sum Sq Mean Sq F value Denom Pr(>F)
## group 1 20334.7 20334.7 2.4302 17.995 0.1364
## direction 1 1804.3 1804.3 0.3649 124.784 0.5469
## group:direction 1 10616.6 10616.6 2.1418 124.784 0.1459
Best Answer
This is an interesting question... The approach for calculating predictive r-squared for linear models is given at this rpubs page. It won't work directly for mixed models. There is an influence function in the `car` package for mixed models, but I couldn't figure out how to adapt it to this purpose... If I understand predictive r-squared correctly, probably the most fruitful approach would be to write a function that removes the data observation by observation and sees how well the model predicts the dropped observation. I didn't see anything that addresses how this would be done specifically.
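That drop-one-observation idea could be sketched roughly like this (a sketch only, assuming `lme4` is available; `pred_r2` is a hypothetical helper name, not an existing function, and refitting the model n times will be slow):

```r
## Sketch: leave-one-out predictive R^2, i.e. 1 - PRESS/TSS, for a mixed model.
## Assumes lme4 is installed; pred_r2 is a hypothetical helper, not part of lme4.
library(lme4)

pred_r2 <- function(model, data, response) {
  press <- 0
  for (i in seq_len(nrow(data))) {
    fit_i <- update(model, data = data[-i, ])           # refit without row i
    yhat  <- predict(fit_i, newdata = data[i, , drop = FALSE],
                     allow.new.levels = TRUE)           # predict the dropped row
    press <- press + (data[[response]][i] - yhat)^2
  }
  tss <- sum((data[[response]] - mean(data[[response]]))^2)
  1 - press / tss
}

## hypothetical usage with the model from above:
## pred_r2(model, ANT.2, "rt")
```

`allow.new.levels = TRUE` is only a safeguard here; since every subject contributes many rows, the held-out row's group is still present in the refit, so the prediction uses that subject's estimated random effect.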