Solved – Split-Plot ANOVA: model comparison tests in R

anovamultivariate analysisrrepeated measuressplit-plot

How can I test effects in a Split-Plot ANOVA using suitable model comparisons for use with the X and M arguments of anova.mlm() in R? I'm familiar with ?anova.mlm and Dalgaard (2007)[1]. Unfortunately it only brushes Split-Plot Designs. Doing this in a fully randomized design with two within-subjects factors:

N  <- 20  # 20 subjects total
P  <- 3   # levels within-factor 1
Q  <- 3   # levels within-factor 2
DV <- matrix(rnorm(N* P*Q), ncol=P*Q)           # random data in wide format
id <- expand.grid(IVw1=gl(P, 1), IVw2=gl(Q, 1)) # intra-subjects layout of data matrix

library(car)        # for Anova()
fitA <- lm(DV ~ 1)  # between-subjects design: here no between factor
resA <- Anova(fitA, idata=id, idesign=~IVw1*IVw2)
summary(resA, multivariate=FALSE, univariate=TRUE)  # all tests ...

The following model comparisons lead to the same results. The restricted model doesn't include the effect in question but all other effects of the same order or lower, the full model adds the effect in question.

anova(fitA, idata=id, M=~IVw1 + IVw2, X=~IVw2, test="Spherical") # IVw1
anova(fitA, idata=id, M=~IVw1 + IVw2, X=~IVw1, test="Spherical") # IVw2
anova(fitA, idata=id, M=~IVw1 + IVw2 + IVw1:IVw2,
                      X=~IVw1 + IVw2, test="Spherical")          # IVw1:IVw2

A Split-Splot design with one within and one between-subjects factor:

idB  <- subset(id, IVw2==1, select="IVw1")          # use only first within factor
IVb  <- gl(2, 10, labels=c("A", "B"))               # between-subjects factor
fitB <- lm(DV[ , 1:P] ~ IVb)                        # between-subjects design
resB <- Anova(fitB, idata=idB, idesign=~IVw1)
summary(resB, multivariate=FALSE, univariate=TRUE)  # all tests ...

These are the anova() commands to replicate the tests, but I don't know why they work. Why do the tests of the following model comparisons lead to the same results?

anova(fitB, idata=idB, X=~1, test="Spherical") # IVw1, IVw1:IVb
anova(fitB, idata=idB, M=~1, test="Spherical") # IVb

Two within-subjects factors and one between-subjects factor:

fitC <- lm(DV ~ IVb)  # between-subjects design
resC <- Anova(fitC, idata=id, idesign=~IVw1*IVw2)
summary(resC, multivariate=FALSE, univariate=TRUE)  # all tests ...

How do I replicate the results given above with the corresponding model comparisons for use with the X and M arguments of anova.mlm()? What is the logic behind these model comparisons?

EDIT: suncoolsu pointed out that for all practical purposes, data from these designs should be analyzed using mixed models. However, I'd still like to understand how to replicate the results of summary(Anova()) with anova.mlm(..., X=?, M=?).

[1]: Dalgaard, P. 2007. New Functions for Multivariate Analysis. R News, 7(2), 2-7.

Best Answer

The X and M basically specify the two models you want to compare, but only in terms of the within-subject effects; it then shows results for the interaction of all between-subject effects (including the intercept) with the within-subject effects that changed between X and M.

Your examples on fitB are easier to understand if we add the defaults for X and M:

anova(fitB, idata=idB, M=~1, X=~0, test="Spherical") # IVb
anova(fitB, idata=idB, M=diag(3), X=~1, test="Spherical") # IVw1, IVw1:IVb

The first model is the change from no within subject effects (all have the same mean) to a different mean for each, so we've added the id random effect, which is the right thing to test the overall intercept and the overall between subject effect on.

The second model ads the id:IVw1 interaction, which is the right thing to test IVw1 and the IVw1:IVb terms against. Since there is only one within-subject effect (with three levels) the default of diag(3) in the second model will account for it; it would be equivalent to run

anova(fitB, idata=idB, M=~IVw1, X=~1, test="Spherical") # IVw1, IVw1:IVb

For your fitC, I believe these commands will recreate the Anova summary.

anova(fitC, idata=id, M=~1, X=~0, test="Spherical") #IVb
anova(fitC, idata=id, M=~IVw1 + IVw2, X=~IVw2, test="Spherical") # IVw1
anova(fitC, idata=id, M=~IVw1 + IVw2, X=~IVw1, test="Spherical") # IVw2
anova(fitC, idata=id, M=~IVw1 + IVw2 + IVw1:IVw2,
                  X=~IVw1 + IVw2, test="Spherical")          # IVw1:IVw2

Now, as you discovered, these commands are really tricky. Thankfully, there's not much reason to use them anymore. If you're willing to assume sphericity, you should just use aov, or for even easier syntax, just use lm and compute the right F-tests yourself. If you're not willing to assume sphericity, using lme is really the way to go as you get a lot more flexibility than you do with the GG and HF corrections.

For example, here's the aov and lm code for your fitA. You do need to have the data in long format first; here's one way to do that:

library(reshape)
d0 <- data.frame(id=1:nrow(DV), DV)
d0$IVb <- IVb
d0 <- melt(d0, id.vars=c(1,11), measure.vars=2:10)
id0 <- id
id0$variable <- factor(levels(d0$variable), levels=levels(d0$variable))
d <- merge(d0, id0)
d$id <- factor(d$id)

And here's lm andaov` code:

anova(lm(value ~ IVw1*IVw2*id, data=d))
summary(aov(value ~ IVw1*IVw2 + Error(id/(IVw1*IVw2)), data=d))