Repeated measures ANOVA: what is the normality assumption?

Tags: anova, assumptions, normality-assumption, repeated-measures

I am confused about the normality assumption in repeated measures ANOVA. Specifically, I am wondering what kind of normality exactly should be satisfied. In reading the literature and the answers on CV, I came across three distinct wordings of this assumption.

  1. Dependent variable within each (repeated) condition should be distributed normally.

    It is often stated that rANOVA has the same assumptions as ANOVA, plus sphericity. This is the claim in Field's Discovering Statistics, in Wikipedia's article on the subject, and in Lowry's text.

  2. The residuals (differences between all possible pairs?) should be distributed normally.

    I found this statement in multiple answers on CV (1, 2). Given the analogy between rANOVA and the paired t-test, this also seems intuitive.

  3. Multivariate normality should be satisfied.

    Wikipedia and this source mention this. Also, I know that rANOVA can be replaced by MANOVA, which might support this claim.

Are these equivalent somehow? I know that multivariate normality means that any linear combination of the DVs is normally distributed, so 3. would naturally include 2. if I understand the latter correctly.

If these are not the same, which is the "true" assumption of the rANOVA? Can you provide a reference?

It seems to me there is most support for the first claim. This is not in line, however, with the answers usually provided here.


Linear mixed models

Thanks to @utobi's hint, I now understand how rANOVA can be restated as a linear mixed model. Specifically, to model how blood pressure changes with time, I would model the expected value as:
$$
\mathrm{E}\left[y_{ij}\right]=a_{i}+b_i t_{ij},
$$
where $y_{ij}$ are the blood pressure measurements, $a_{i}$ is the average blood pressure of the $i$-th subject, $t_{ij}$ is the $j$-th time at which the $i$-th subject was measured, and $b_i$ is a subject-specific slope, so that the change in blood pressure over time can differ across subjects, too. Both effects are considered random, since the subjects are only a random sample from the population, which is of primary interest.

Finally, I tried to think about what this means for normality, but with little success. To paraphrase McCulloch and Searle (2001, p. 35, Eq. (2.14)):

\begin{align}
\mathrm{E}\left[y_{ij}|a_i\right] &= a_i \\[5pt]
y_{ij}|a_i &\sim \mathrm{indep.}\ \mathcal{N}(a_i,\sigma^2) \\[5pt]
a_i &\sim \mathrm{i.i.d.}\ \mathcal{N}(a,\sigma_a^2)
\end{align}

I understand this to mean that

4. each individual's data need to be normally distributed, although this is unreasonable to test with only a few time points.

I take the third expression to mean that

5. the averages of the individual subjects are normally distributed. Note that these are two further possibilities, distinct from the three mentioned above.
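
As a sketch of how the paraphrased random-intercept model relates to the three wordings listed earlier, one can integrate out $a_i$. This assumes exactly the two distributional statements as written and $T$ measurements per subject; the symbols $T$, $\mathbf{1}_T$ (a vector of ones), $\mathbf{I}_T$ (the identity matrix) and $\mathbf{J}_T$ (a matrix of ones) are notation introduced here, not taken from McCulloch and Searle:

\begin{align}
\mathbf{y}_i=(y_{i1},\dots,y_{iT})^\top &\sim \mathcal{N}_T\left(a\mathbf{1}_T,\ \sigma^2\mathbf{I}_T+\sigma_a^2\mathbf{J}_T\right) \\[5pt]
y_{ij} &\sim \mathcal{N}(a,\ \sigma_a^2+\sigma^2) \\[5pt]
y_{ij}-y_{ik} &\sim \mathcal{N}(0,\ 2\sigma^2),\quad j\neq k
\end{align}

Under this particular model, each subject's measurement vector is multivariate normal (wording 3.), which in turn gives normal margins within each condition (wording 1.) and normal pairwise differences (wording 2.). The reverse implications do not hold in general.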


McCulloch, C. E. & Searle, S. R. (2001). Generalized, Linear, and Mixed Models. New York: John Wiley & Sons, Inc.

Best Answer

This is the simplest repeated measures ANOVA model if we treat it as a univariate model:

$$y_{it} = a_{i} + b_{t} + \epsilon_{it}$$

where $i$ represents each case and $t$ the times we measured them (so the data are in long form). $y_{it}$ represents the outcomes stacked one on top of the other, $a_{i}$ represents the mean of each case, $b_{t}$ represents the mean of each time point and $\epsilon_{it}$ represents the deviations of the individual measurements from the case and time point means. You can include additional between-factors as predictors in this setup.

We do not need to make distributional assumptions about $a_{i}$, since they can enter the model as fixed effects via dummy variables (unlike in linear mixed models, where they are treated as random). The same holds for the time effects, which enter as time dummies. For this model, you simply regress the outcome in long form on the person dummies and the time dummies. The effect of interest is that of the time dummies: the $F$-test of the null hypothesis $b_{1}=\dots=b_{T}=0$, where $T$ is the number of time points, is the main test in the univariate repeated measures ANOVA.
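
To make the $F$-test for the time effect concrete, here is a minimal sketch in plain Python. It assumes balanced data with no missing cells, in which case regressing on the person and time dummies gives the same $F$ statistic as the classical sums-of-squares decomposition used here. The function name `rm_anova_f` and the toy data are hypothetical, made up for illustration only.

```python
def rm_anova_f(data):
    """F statistic for the time effect in a one-way repeated measures ANOVA.

    data[i][t] is the outcome of case i at time point t
    (assumed balanced: every case is measured at every time point).
    """
    n = len(data)        # number of cases
    T = len(data[0])     # number of time points
    grand = sum(sum(row) for row in data) / (n * T)
    case_means = [sum(row) / T for row in data]
    time_means = [sum(row[t] for row in data) / n for t in range(T)]

    # Variation explained by the time dummies
    ss_time = n * sum((m - grand) ** 2 for m in time_means)
    # Residual variation after removing case and time means
    ss_error = sum(
        (data[i][t] - case_means[i] - time_means[t] + grand) ** 2
        for i in range(n)
        for t in range(T)
    )

    df_time = T - 1
    df_error = (n - 1) * (T - 1)
    f_stat = (ss_time / df_time) / (ss_error / df_error)
    return f_stat, df_time, df_error


# Hypothetical toy data: 3 cases measured at 2 time points
data = [[1, 3], [2, 5], [3, 4]]
f_stat, df_time, df_error = rm_anova_f(data)
print(f_stat, df_time, df_error)  # prints "12.0 1 2"
```

With only two time points, this $F$ equals the square of the paired $t$ statistic: the differences are 2, 3 and 1, so $t = 2\sqrt{3}$ and $t^2 = 12$.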

What are the required assumptions for the $F$-test to behave appropriately? The one relevant to your question is:

\begin{equation} \epsilon_{it}\sim\mathcal{N}(0,\sigma^2)\quad\text{(the errors are normally distributed and homoskedastic)} \end{equation}
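
Since the assumption concerns $\epsilon_{it}$, any informal check would look at the residuals of the fitted model rather than at the raw outcomes. The following is a minimal sketch, assuming balanced data. It computes the residuals after removing case and time means, then a simple normal Q-Q correlation (values near 1 are compatible with normality). The function names and the toy data are hypothetical. Note that the estimated residuals sum to zero within each case and each time point, so they are not independent and this is only a rough diagnostic.

```python
from statistics import NormalDist, mean


def additive_residuals(data):
    """Residuals y_it - (case mean) - (time mean) + (grand mean), balanced data."""
    n, T = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * T)
    case_means = [sum(row) / T for row in data]
    time_means = [sum(row[t] for row in data) / n for t in range(T)]
    return [
        data[i][t] - case_means[i] - time_means[t] + grand
        for i in range(n)
        for t in range(T)
    ]


def qq_correlation(values):
    """Correlation between the sorted values and standard normal quantiles."""
    k = len(values)
    ordered = sorted(values)
    # Plotting positions (i + 0.5) / k for i = 0, ..., k - 1
    theo = [NormalDist().inv_cdf((i + 0.5) / k) for i in range(k)]
    mx, my = mean(theo), mean(ordered)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theo, ordered))
    sxx = sum((x - mx) ** 2 for x in theo)
    syy = sum((y - my) ** 2 for y in ordered)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical toy data: 3 cases measured at 2 time points
res = additive_residuals([[1, 3], [2, 5], [3, 4]])
print(res)
print(qq_correlation(res))
```

In practice one would use far more observations and probably a Q-Q plot; the point here is only which quantity the normality assumption refers to.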

There are additional (and more consequential) assumptions for the $F$-test to be valid: the observations are clearly not independent of each other, since the same individuals appear repeatedly across rows.

If you want to treat the repeated measures ANOVA as a multivariate model, the normality assumptions may be different, and I cannot expand on them beyond what you and I have seen on Wikipedia.
