Solved – ezANOVA vs. lme for a factorial repeated-measures design: why do the results differ?

anova, lme4-nlme, r, repeated measures

I have used lme and ezANOVA to analyse data from a 2$\times$3 repeated-measures experiment. In theory, these are two different ways to perform the same analysis. However, the resulting $F$-statistics and DF differ, and I cannot work out why.

Here is the exact data and output:
I have a data set with 2 independent variables (marker_lang and congruency) and the dependent variable RT. Both IVs are repeated (within-subject) and completely crossed, giving 6 conditions overall. The data are not collapsed to cell means, meaning that I have several data points per condition and subject.

Here is what I did with ezANOVA:

ezANOVA(subset(data.mark.afc, !is.na(afc.RT)), dv=afc.RT, wid=subjectID, 
within=.(marker_lang,congruency), within_full=.(marker_lang,congruency), detailed=1, type=3)

And the output:

$Anova
              Effect DFn DFd          SSn        SSd            F            p   p<.05          ges
1            (Intercept)   1  24 84879098.819 1892110.06 1076.6278430 1.881814e-21     * 0.9762134497
2            marker_lang   1  24    36392.804   80595.30   10.8371986 3.071336e-03     * 0.0172922873
3             congruency   2  48    25426.393   47319.45   12.8960382 3.292730e-05     * 0.0121448066
4 marker_lang:congruency   2  48     1160.152   48150.91    0.5782581 5.647333e-01       0.0005606399

Here is what I did with lme:

basemodel <- lme(data=subset(cdata, !is.na(afc.RT)), afc.RT~1,
random=~1|subjectID/congruency/marker_lang, method="ML")

langmodel <- update(basemodel, .~. + marker_lang)

langcongmodel <- update(langmodel, .~. + congruency)

fulmodel <- update(langcongmodel, .~. +marker_lang:congruency)

And the anova-tables for the lme analysis:

anova(fulmodel)
                   numDF denDF   F-value p-value
(Intercept)                1  4715 1112.3468  <.0001
marker_lang                1    72   24.8917  <.0001
congruency                 2    48    8.3902  0.0008
marker_lang:congruency     2    72    0.4475  0.6410

anova(basemodel, langmodel, langcongmodel, fulmodel)
          Model df      AIC      BIC    logLik   Test   L.Ratio p-value
basemodel         1  5 64203.43 64235.88 -32096.72                         
langmodel         2  6 64185.06 64224.00 -32086.53 1 vs 2 20.366313  <.0001
langcongmodel     3  8 64173.27 64225.19 -32078.64 2 vs 3 15.790082  0.0004
fulmodel          4 10 64176.38 64241.28 -32078.19 3 vs 4  0.892535  0.6400

I would expect the $F$, DF, and $p$-values for corresponding effects to be the same, which is not the case. This does not seem to be an issue of different ANOVA types, as I tried several types with ezANOVA and none yields the same result as anova(fulmodel).

Any help/ideas will be greatly appreciated!

Best Answer

It looks like with ezANOVA you just did a classical Type III SS ANOVA (which uses the Anova command from the car package). With lme you're doing an ANOVA analysis of a linear mixed effects model. They're not supposed to come out the same. The ez package can also do the latter (its ezMixed function uses lmer), but that isn't what you've asked for.

You should aggregate your data set to cell means (one value per subject per condition) for the ordinary ANOVA, and NOT do so for lme. It's hard to tell whether you've done that. You don't get to use multiple measures per condition per subject in ordinary ANOVA. This is a main reason for the difference in degrees of freedom between your first two analyses (ezANOVA and anova(fulmodel)). The two methods calculate their statistics very differently, but you should get very similar results from them if the data are aggregated. However, fitting lme to aggregated data is not the right way to use it, so matching results is not a goal you should be striving toward.
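As a rough illustration of the aggregation step, here is a minimal base-R sketch. The data are simulated and purely hypothetical (the factor level names, 10 trials per cell, and the effect sizes are all assumptions); only the column names and the 25 subjects (implied by DFd = 24 for the intercept in the ezANOVA output) come from the question. It collapses trial-level data to one mean per subject per condition and then fits a classical repeated-measures ANOVA with aov.

```r
set.seed(1)

# Hypothetical trial-level data: 25 subjects x 2 x 3 conditions x 10 trials
trials <- expand.grid(
  subjectID   = factor(1:25),
  marker_lang = factor(c("L1", "L2")),                            # assumed level names
  congruency  = factor(c("congruent", "neutral", "incongruent")), # assumed level names
  trial       = 1:10
)
subj_effect   <- rnorm(25, sd = 50)  # simulated random subject intercepts
trials$afc.RT <- 600 + subj_effect[as.integer(trials$subjectID)] +
  rnorm(nrow(trials), sd = 80)

# Collapse to cell means: one value per subject per condition
cell_means <- aggregate(afc.RT ~ subjectID + marker_lang + congruency,
                        data = trials, FUN = mean)

# Classical repeated-measures ANOVA on the aggregated data
fit <- aov(afc.RT ~ marker_lang * congruency +
             Error(subjectID / (marker_lang * congruency)),
           data = cell_means)
summary(fit)

# Denominator df of the classical analysis: (levels - 1) * (subjects - 1)
n_subj <- nlevels(cell_means$subjectID)
dfd <- c(marker_lang = (2 - 1) * (n_subj - 1),
         congruency  = (3 - 1) * (n_subj - 1),
         interaction = (2 - 1) * (3 - 1) * (n_subj - 1))
dfd
```

With 25 subjects the denominator df work out to 24, 48 and 48, the same DFd values that appear in the ezANOVA table above. The trial-level data frame (trials here) is what you would pass to lme instead.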

Your last analysis doesn't even have F values in it (L.Ratio is a likelihood-ratio statistic, which is evaluated against a chi-square distribution). How were you imagining the Fs could be the same? It compares the likelihoods of the nested models and is different yet again.
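To make the chi-square point concrete, the following sketch recomputes the p-values of the model-comparison table by hand. The L.Ratio values and the df column are copied from the anova(basemodel, langmodel, langcongmodel, fulmodel) output in the question; the variable names are just illustrative.

```r
# Likelihood-ratio statistics copied from the model-comparison table
l_ratio <- c(marker_lang = 20.366313,
             congruency  = 15.790082,
             interaction = 0.892535)

# Degrees of freedom = difference in the table's df column between models
df_diff <- c(6 - 5, 8 - 6, 10 - 8)

# Upper-tail chi-square probabilities; no F distribution is involved
p_chisq <- pchisq(l_ratio, df = df_diff, lower.tail = FALSE)
round(p_chisq, 4)
```

The second and third values round to 0.0004 and 0.6400, matching the p-value column of that table, which confirms that these tests are chi-square tests on 1, 2 and 2 degrees of freedom rather than F tests.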

(Note: there are also two ezANOVA programs. One is written by Chris Rorden and runs as a standalone data analysis program; the other is from the ez package in R, written by Mike Lawrence. For anyone doing a web search, this question is clearly about the latter, given the R command listed here.)