Update 3 (May, 2013): Another really good paper on mixed models in Psychology was published in the Journal of Memory and Language (although I do not agree with the authors' conclusions on how to obtain p-values; see the afex package instead). It very nicely discusses how to specify the random effects structure. Go read it!
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. doi:10.1016/j.jml.2012.11.001
Update 2 (July, 2012): A paper advocating the use of mixed models in (Social) Psychology when there are crossed (e.g., participants and items) random effects. The big thing is that it shows how to obtain p-values using the pbkrtest package:
Judd, C. M., Westfall, J., & Kenny, D. A. (2012). Treating stimuli as a random factor in social psychology: A new and comprehensive solution to a pervasive but largely ignored problem. Journal of Personality and Social Psychology, 103(1), 54–69. doi:10.1037/a0028347
(only available as a Word .doc)
Jake Westfall told me (via e-mail) that an alternative to the advocated Kenward-Roger approximation (used in pbkrtest) for obtaining p-values is the (less optimal) Satterthwaite approximation, which can be found in the MixMod package via the anovaTab function.
Small update to the last update: My R package afex contains the function mixed() to conveniently obtain p-values for all effects in a mixed model. Alternatively, the car package now also obtains p-values for mixed models in Anova() using test.statistic = "F".
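As a minimal sketch of how these two routes look in practice (using the ANT data from the ez package purely for illustration; method = "KR" requests the Kenward-Roger approximation via pbkrtest):

```r
library(afex)
library(ez)
library(lme4)
library(car)
data(ANT)

## afex::mixed fits the model and returns an ANOVA-like table
## with F values and p-values for all effects
m1 <- mixed(rt ~ group * direction + (1 | subnum), data = ANT, method = "KR")
m1

## alternatively, car::Anova on a plain lmer fit;
## test.statistic = "F" gives Kenward-Roger F tests
m2 <- lmer(rt ~ group * direction + (1 | subnum), data = ANT)
Anova(m2, test.statistic = "F")
```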
Update 1: Another paper describing lme4:
Kliegl, R., Wei, P., Dambacher, M., Yan, M., & Zhou, X. (2011). Experimental effects and individual differences in linear mixed models: estimating the relationship between spatial, object, and attraction effects in visual attention. Frontiers in Quantitative Psychology and Measurement, 1, 238. doi:10.3389/fpsyg.2010.00238
Original Response:
I do not have a number of examples, only one (see below), but I know some papers you should cite from Psychology/Cognitive Science. The most important one is definitely:
Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412. doi:10.1016/j.jml.2007.12.005
Another one from Baayen is:
Baayen, R. H., & Milin, P. (2010). Analyzing Reaction Times. International Journal of Psychological Research, 3(2), 12–28.
I really liked his book, too, which also has a nice introductory chapter on mixed models (and is pretty cheap for a stats book):
Baayen, R. H. (2008). Analyzing Linguistic Data: A Practical Introduction to Statistics Using R. Cambridge, UK; New York: Cambridge University Press.
I would guess he also has a lot of papers using lme4, but as my main interest is not psycholinguistics, you might want to check his homepage.
From my field (reasoning), I know of this one paper that uses lme4:
Fugard, A. J. B., Pfeifer, N., Mayerhofer, B., & Kleiter, G. D. (2011). How people interpret conditionals: Shifts toward the conditional event. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37(3), 635–648. doi:10.1037/a0022329
(although I have the feeling they use a likelihood ratio test to compare models that differ only in the fixed effects, which I have heard is not the correct way. I think you should use AIC instead.)
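For what it's worth, the two kinds of comparison would look like this in lme4 (a hypothetical sketch using the sleepstudy data shipped with lme4; for a likelihood ratio test on fixed effects, the models should be refitted with ML, i.e. REML = FALSE):

```r
library(lme4)
data(sleepstudy)

## two models differing only in the fixed effects, fitted with ML
m0 <- lmer(Reaction ~ 1    + (Days | Subject), sleepstudy, REML = FALSE)
m1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)

anova(m0, m1)  # likelihood ratio test (can be anti-conservative for fixed effects)
AIC(m0, m1)    # information-criterion comparison instead
```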
I think that lmerTest is getting it right and ezANOVA is getting it wrong in this case:
- the results from lmerTest agree with my intuition/understanding
- two different computations in lmerTest (Satterthwaite and Kenward-Roger) agree
- they also agree with nlme::lme
- when I run it, ezANOVA gives a warning, which I don't entirely understand, but which should not be disregarded ...
Re-running the example:
library(ez); library(lmerTest); library(nlme)
data(ANT)
ANT.2 <- subset(ANT, !error)
set.seed(101) ## for reproducibility
baseline.shift <- rnorm(length(unique(ANT.2$subnum)), 0, 50)
ANT.2$rt <- ANT.2$rt + baseline.shift[as.numeric(ANT.2$subnum)]
Figure out the experimental design:
with(ANT.2,table(subnum,group,direction))
So it looks like individuals (subnum) are placed in either control or treatment groups, and each is tested for both directions -- i.e., direction can be tested within individuals (denominator df is large), but group and group:direction can only be tested among individuals.
(anova.ez <- ezANOVA(data = ANT.2, dv = .(rt), wid = .(subnum),
within = .(direction), between = .(group)))
## $ANOVA
## Effect DFn DFd F p p<.05 ges
## 2 group 1 18 2.4290721 0.13651174 0.1183150147
## 3 direction 1 18 0.9160571 0.35119193 0.0002852171
## 4 group:direction 1 18 4.9169156 0.03970473 * 0.0015289914
Here I get Warning: collapsing data to cell means. *IF* the requested effects are a subset of the full design, you must use the "within_full" argument, else results may be inaccurate.
The denominator DF look a little funky (all equal to 18): I think they should be larger for direction and group:direction, which can be tested independently (but would be smaller if you added (direction | subnum) to the model)?
# similarly with lmer and lmerTest::anova
model <- lmer(rt ~ group * direction + (1 | subnum), data = ANT.2)
lmerTest::anova(model)
## Df Sum Sq Mean Sq F value Denom Pr(>F)
## group 1 12065.7 12065.7 2.4310 18 0.1364
## direction 1 1952.2 1952.2 0.3948 5169 0.5298
## group:direction 1 11552.2 11552.2 2.3299 5169 0.1270
The Df column here refers to the numerator df; Denom (second-to-last) gives the estimated denominator df; they agree with the classical intuition. More importantly, we also get different answers for the F values ...
We can also double-check with Kenward-Roger (very slow because it involves refitting the model several times)
lmerTest::anova(model,ddf="Kenward-Roger")
The results are identical.
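The same Kenward-Roger test can also be obtained directly from pbkrtest by comparing nested models (a sketch reusing the ANT.2 data and model from above; KRmodcomp tests the terms dropped from the larger model):

```r
library(pbkrtest)

## full model and a reduced model without the interaction
full    <- lmer(rt ~ group * direction + (1 | subnum), data = ANT.2)
reduced <- update(full, . ~ . - group:direction)

## F test for group:direction with Kenward-Roger denominator df
KRmodcomp(full, reduced)
```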
For this example, lme (from the nlme package) actually does a perfectly good job of guessing the appropriate denominator df (the F and p-values are very slightly different):
model3 <- lme(rt ~ group * direction, random=~1|subnum, data = ANT.2)
anova(model3)[-1,]
## numDF denDF F-value p-value
## group 1 18 2.4334314 0.1362
## direction 1 5169 0.3937316 0.5304
## group:direction 1 5169 2.3298847 0.1270
If I fit an interaction between direction and subnum, the df for direction and group:direction are much smaller (I would have thought they would be 18, but maybe I'm getting something wrong):
model2 <- lmer(rt ~ group * direction + (direction | subnum), data = ANT.2)
lmerTest::anova(model2)
## Df Sum Sq Mean Sq F value Denom Pr(>F)
## group 1 20334.7 20334.7 2.4302 17.995 0.1364
## direction 1 1804.3 1804.3 0.3649 124.784 0.5469
## group:direction 1 10616.6 10616.6 2.1418 124.784 0.1459
Best Answer
Unless I'm reading you incorrectly, you would have approximately this many degrees of freedom in a standard repeated measures ANOVA. The ANOVA error term is the interaction between subjects and the effect, and it takes its degrees of freedom from both. However, if you are taking multiple measurements at each time interval, then yes, the degrees of freedom are much higher for the mixed effects model.
It's not the standard pseudoreplication problem you'd have if you were doing a repeated measures ANOVA. With the ANOVA you are only modelling the effects in question and should aggregate your data to get better estimates of each effect for each subject. With mixed effects modelling you are potentially modelling each data point, even the pseudoreplicated measurements you take for accuracy's sake. You can do that because you're explicitly saying that these individual measures are grouped within subjects, within this factor, etc. Therefore, you can often have more degrees of freedom. Although, as I stated before, it doesn't sound like that's the case here anyway.
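To illustrate the aggregation point: in the classical approach you would first average the pseudoreplicated measurements within each subject-by-condition cell and then run the repeated measures ANOVA on the cell means (a sketch reusing the ANT.2 data from above):

```r
## collapse to one mean RT per subject x group x direction cell
cell.means <- aggregate(rt ~ subnum + group + direction, data = ANT.2, FUN = mean)

## classical repeated measures ANOVA on the aggregated data:
## direction is within-subject, group is between-subject
summary(aov(rt ~ group * direction + Error(subnum / direction),
            data = cell.means))
```

This is essentially what ezANOVA does internally when it warns about "collapsing data to cell means".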