Solved – Plotting to check Homoskedasticity assumption for repeated-measures ANOVA in R

anova, assumptions, data visualization, r, repeated measures

I've run a fully within-subjects repeated-measures ANOVA using the aov() function. My dependent variable is not normally distributed, so I'm very interested in running assumption checks on my analysis. It seems that just calling plot() on the output doesn't work for repeated-measures models, so I've manually extracted the residuals and fitted values for a model of interest and plotted them against each other. I'm assuming this is how I would plot to test the assumption of homoskedasticity.

The plot comes out with 2 vertical bands (please see the image below). It turns out the fitted values are all centred around 2 values (although, according to ==, they are not exactly equal), one of which is the negative of the other.

I have 2 questions:

1) Is this the correct way to manually test the assumption of homoskedasticity? If not, how would I go about it for repeated-measures designs (since just calling plot() doesn't work)?

2) If it is correct, what is this plot telling me? Why are the fitted values so clustered? What can I conclude from it?

Thanks heaps for any input here. Also, if you know of better ways to check (preferably via plots) the assumptions of rm-ANOVAs, that would be useful information as well.

I've included some mock data here to replicate the scenario:

#Create mock data (there's probably a more efficient way to do this.. would also be nice to know! :) )
p <- sort(rep(1:20,8))
y <- rep(rep(1:2,4),20)
z <- rep(rep(c(1,1,2,2),2),20)
w <- rep(c(1,1,1,1,2,2,2,2),20)
x <- rnorm(160,10,2)

d <- data.frame(x,p=factor(p),y=factor(y),z=factor(z),w=factor(w))
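
On the aside in the comment above about a more efficient way to generate such data: one common alternative (not what the original post used, just a sketch; d2 is an illustrative name) is expand.grid(), which crosses all factor levels for you:

# A more compact way to build the same balanced design
d2 <- expand.grid(y = factor(1:2), z = factor(1:2), w = factor(1:2), p = factor(1:20))
d2$x <- rnorm(nrow(d2), 10, 2)
str(d2)  # 160 rows: every subject crossed with all 2 x 2 x 2 factor combinations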

#Run repeated-measures ANOVA
ex.aov <- aov(x ~ y*z*w + Error(p/(y*z*w)), d)

#Try to plot full object (doesn't work)
plot(ex.aov)

#Try to plot section of object (doesn't work)
plot(ex.aov[["p:y:z"]])

#Plot residuals against fitted (custom "skedasticity" plot - works)
plot(residuals(ex.aov[["p:y:z"]])~fitted(ex.aov[["p:y:z"]]))
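
To see how tightly the fitted values cluster around two points (as described above), one can tabulate them after rounding; this is just a quick check added for illustration:

# Tabulate fitted values from the chosen stratum, rounded to 6 decimals,
# to see how many near-identical clusters there are
table(round(fitted(ex.aov[["p:y:z"]]), 6))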

[Image: residuals plotted against fitted values, showing two vertical bands]

Begin Edit

In light of the information provided by @Stefan, I've added some additional details below, using the improved data structure he proposed:

# Set seed to make it reproducible
set.seed(12)

#New variable names and generation
subj <- sort(factor(rep(1:20,8)))
x1 <- rep(c('A','B'),80)
x2 <- rep(c('A','B'),40,each=2)
x3 <- rep(c('A','B'),20,each=4)
outcome <- rnorm(160,10,2)

d3 <- data.frame(outcome,subj,x1,x2,x3)

#Repeated measures ANOVA
ex.aov <- aov(outcome ~ x1*x2*x3 + Error(subj/(x1*x2*x3)), d3)

#proj function
ex.aov.proj <- proj(ex.aov)

# Check for normality by using last error stratum
qqnorm(ex.aov.proj[[9]][, "Residuals"])
# Check for heteroscedasticity by using last error stratum
plot(ex.aov.proj[[9]][, "Residuals"])

The resulting plots are below:

[Image: repeated-measures normality check? (Q-Q plot of last-stratum residuals)]

[Image: repeated-measures homoskedasticity check? (last-stratum residuals plotted against observation index)]

Can anyone interpret the images above (especially the last one)? There appears to be clustering and some patterned structure. Can this be used to infer the presence of heteroskedasticity?

Best Answer

I'm assuming that a model fitted with the Error() function inside aov() won't work with plot() because there is more than one error stratum to choose from. According to this information here, one should instead use the proj() function, which gives the residuals for each error stratum; these can then be used for diagnostic plots.

Edit 1 start

More information regarding multistratum models and the proj() function is given in Venables and Ripley, page 284 (but start from page 281), in the section "Residuals in multistratum analyses: Projections". There they write (the key point, originally highlighted in bold, is in the second sentence):

Thus fitted(oats.aov[[4]]) and resid(oats.aov[[4]]) are vectors of length 54 representing fitted values and residuals from the last stratum, based on 54 orthonormal linear functions of the original data vector. It is not possible to associate them uniquely with the plots of the original experiment. The function proj takes a fitted model object and finds the projections of the original data vector onto the subspaces defined by each line in the analysis of variance tables (including, for multistratum objects, the suppressed table with the grand mean only). The result is a list of matrices, one for each stratum, where the column names for each are the component names from the analysis of variance tables.

For your example that means:

ex.aov.proj <- proj(ex.aov)

# Check number of strata 
summary(ex.aov.proj)

# Check for normality by using last error stratum
qqnorm(ex.aov.proj[[9]][, "Residuals"])
# Check for heteroscedasticity by using last error stratum
plot(ex.aov.proj[[9]][, "Residuals"])

However, this also leads to plots which I cannot fully interpret (especially the second one).

In their case, the last stratum was the Within stratum. Since your model cannot estimate this (presumably because of how the error term is specified), I am not sure whether simply using your last stratum is valid.

Hopefully someone else can clarify.
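
One thing that can be done in the meantime is to look at the residuals from every stratum that actually carries a "Residuals" column, rather than only the last one. The following is only a sketch and does not settle the validity question above:

# Which strata have a "Residuals" column in the projections?
has_res <- sapply(ex.aov.proj, function(m) "Residuals" %in% colnames(m))
names(ex.aov.proj)[has_res]

# Q-Q plot of the residuals from each such stratum
op <- par(mfrow = c(2, ceiling(sum(has_res) / 2)))
for (s in names(ex.aov.proj)[has_res]) {
  qqnorm(ex.aov.proj[[s]][, "Residuals"], main = s)
}
par(op)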

Edit 1 end

Edit 2 start

According to this source, checking residuals to assess normality and heteroscedasticity should be done on a model fitted without the Error() function.

In order to check assumptions, you need to not use the error term. You can add the term without error, but the F tests are wrong. Assumption checking is OK, however.

This seems reasonable to me, but I hope someone else can clarify.
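
Translated into code, that advice would look something like the following (a sketch using the d3 data constructed below; aov.check is just an illustrative name, and the F tests from this fit should be ignored, only the residual diagnostics are of interest):

# Refit without Error(), adding subj as a fixed term, purely for assumption checking
# (the F tests from this model are NOT valid for a repeated-measures design)
aov.check <- aov(outcome ~ x1*x2*x3 + subj, data = d3)

# The standard diagnostic plots now work
plot(aov.check)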

Edit 2 end

My alternative suggestion:

First, I changed your dataset slightly and set a seed to make it reproducible (might be handy for some problems you have in the future):

# Set seed to make it reproducible
set.seed(12)

# I changed the names of your variables to make them easier to remember
# I also deleted a few nested `rep()` commands. Have a look at the `each=` argument.
subj <- sort(factor(rep(1:20,8)))
x1 <- rep(c('A','B'),80)
x2 <- rep(c('A','B'),40,each=2)
x3 <- rep(c('A','B'),20,each=4)
outcome <- rnorm(160,10,2)

d3 <- data.frame(outcome,subj,x1,x2,x3)

Second, I used a linear mixed-effects model instead, since you have repeated measures and hence a grouping factor (subject) you can use as a random term:

require(lme4)
# I specified `subj` as a random term to account for the repeated measurements on each subject.
m.lmer <- lmer(outcome ~ x1*x2*x3 + (1|subj), data = d3)
summary(m.lmer)

# Check for heteroscedasticity
plot(m.lmer)

[Image: residuals vs. fitted values from plot(m.lmer)]

# or
boxplot(residuals(m.lmer) ~ d3$x1 + d3$x2 + d3$x3)

[Image: boxplots of residuals for each x1 × x2 × x3 combination]
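
To look at the spread of the residuals against the fitted values more directly, the plot method for merMod objects also accepts a formula (see ?plot.merMod); for example:

# Scale-location style plot: square roots of absolute residuals against fitted values
plot(m.lmer, sqrt(abs(resid(.))) ~ fitted(.), type = c("p", "smooth"))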

# Check for normality
qqnorm(residuals(m.lmer))

[Image: normal Q-Q plot of the residuals from m.lmer]
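
Adding a reference line makes departures from normality easier to judge:

# Redraw the Q-Q plot and add a reference line through the quartiles
qqnorm(residuals(m.lmer))
qqline(residuals(m.lmer))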

Using the afex package you can also get the fixed effects in an ANOVA-style table (the Anova() function from the car package is another option):

require(afex)
mixed(outcome ~ x1*x2*x3 + (1|subj), data = d3, method="LRT")

Fitting 8 (g)lmer() models:
[........]
    Effect df    Chisq p.value
1       x1  1     0.04     .84
2       x2  1     2.53     .11
3       x3  1  7.68 **    .006
4    x1:x2  1  8.34 **    .004
5    x1:x3  1 10.51 **    .001
6    x2:x3  1     0.31     .58
7 x1:x2:x3  1     0.12     .73

Check ?mixed for the various options you can choose from. There is also a lot of information on mixed models here on Cross Validated.