Repeated measures is kind of an overloaded term. To some people it refers to a particular statistical analysis method; to others it refers to the structure of the design.
This is a variant on a three-period, three-treatment crossover design.
It is a variant because usually in a crossover design you randomize subjects to sequences. In this case the sequence is determined randomly for each subject. Since there are six possible sequences, it might be that some sequences are not observed, especially with 10 subjects. Maybe this is formally the same as randomizing subjects to sequences, but I haven't looked at that yet.
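To put a number on that: if each of 10 subjects is independently assigned one of the 6 sequences uniformly at random, the chance that all six sequences appear at least once can be computed by inclusion-exclusion. A quick sketch (in Python, just for the counting; the rest of this thread uses R):

```python
from math import comb

def prob_all_sequences(n_subjects=10, n_sequences=6):
    """P(every sequence is used at least once) when each subject
    independently gets a uniformly random sequence, via
    inclusion-exclusion over which sequences go unobserved."""
    return sum((-1) ** k * comb(n_sequences, k)
               * ((n_sequences - k) / n_sequences) ** n_subjects
               for k in range(n_sequences + 1))

print(round(prob_all_sequences(), 3))  # about 0.272
```

So with 10 subjects there is roughly a 73% chance that at least one sequence goes unobserved, which is the practical difference from randomizing subjects to a prespecified balanced set of sequences.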
The considerations for crossover designs are:
Carryover effects: Also known as residual effects, where prior treatment may affect response to current treatment. The goal of the washout periods is to remove this from consideration. You could also have (in theory) second-order residual effects, where the treatment given in the first period potentially affects the response to treatment given in the third period.
Period effects: Response to treatment(s) may change as the study goes on for a given subject.
Autocorrelation: Serial correlation in the errors is usually an issue with more closely spaced measurements. In simple balanced designs, having a random effect for subject implies equal correlation among the errors from each subject.
Subject effects: Subjects may differ in mean response from each other regardless of treatment. You could also conceive of a situation where measurement error is serially correlated separately from a random subject effect.
Sequence effect: In cases where you randomize subjects to sequences, subjects are considered nested in sequence.
A minimal analysis for this would be the suggested randomized complete block design. That is, a fixed effect for treatment and a random effect for subject. With a skimpy sample size that might be all you can really do.
I would argue for a bit more structure to the analysis, if possible. Assuming no carryover effects on scientific grounds, it seems like a good idea to have fixed effects for treatment, period, and treatment $\times$ period interaction, and a random effect for subjects. For small data sets, if this model can't be fit, I would drop the treatment $\times$ period interaction first.
Period should be included because it represents a restriction on the randomization. You cannot "randomize" periods --- they always happen in the same order. Treatment $\times$ period interaction might be indicative of some sort of carryover effect.
With tons of data one could work up terms that would allow estimation of various specific carryover effects. My notes on this are gone, though I know I've seen it handled in some texts.
The strategy of additionally modelling the correlation structure on the R-side seems reasonable to me. It lets one claim to be handling the possible dependence structure induced by repeated measures on the same subject (a claim I would also probably make for the random subject effect alone, if the analysis devolved to that level). It is also reassuring when the various analysis strategies give broadly similar results.
For implementation, I'd use PROC MIXED in SAS, and likely nlme or lme4 in R.
I'll punt on the compound symmetry question, since that seems more like a holdover from the days where MANOVA was the only "correct" analysis for repeated measures.
The equivalence of the models can be observed by calculating the correlation between two observations from the same individual, as follows:
As in your notation, let $Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$, where
$\beta_j \sim N(0, \sigma_p^2)$ and $\epsilon_{ij} \sim N(0, \sigma^2)$.
Then
$Cov(Y_{ik}, Y_{jk}) = Cov(\mu + \alpha_i + \beta_k + \epsilon_{ik},\ \mu + \alpha_j + \beta_k + \epsilon_{jk}) = Cov(\beta_k, \beta_k) = \sigma_p^2$, because all other terms are independent or fixed,
and $Var(Y_{ik}) = Var(Y_{jk}) = \sigma_p^2 + \sigma^2$,
so the correlation is $\sigma_p^2/(\sigma_p^2 + \sigma^2)$.
Note, however, that the models are not quite equivalent: the random effect model forces the correlation to be non-negative, while the CS model and the t-test/ANOVA model do not.
EDIT: There are a few other differences as well. First, the CS and random effect models assume normality for the random effect, but the t-test/ANOVA model does not. Second, the CS and random effect models are fit using maximum likelihood, while the ANOVA is fit using mean squares; when everything is balanced they will agree, but not necessarily in more complex situations. Finally, I'd be wary of using F/df/p values from the various fits as measures of how much the models agree; see Doug Bates's famous screed on degrees of freedom for more details. (END EDIT)
The problem with your R code is that you're not specifying the correlation structure properly. You need to use gls with the corCompSymm correlation structure.
Generate data so that there is a subject effect:
set.seed(5)
x <- rnorm(10)
x1 <- x + rnorm(10)
x2 <- x + 1 + rnorm(10)
myDat <- data.frame(c(x1, x2), c(rep("x1", 10), rep("x2", 10)),
                    rep(paste("S", seq(1, 10), sep = ""), 2))
names(myDat) <- c("y", "x", "subj")
Then here's how you'd fit the random effects and the compound symmetry models.
library(nlme)
fm1 <- lme(y ~ x, random=~1|subj, data=myDat)
fm2 <- gls(y ~ x, correlation=corCompSymm(form=~1|subj), data=myDat)
The variance components from the random effects model (squaring the reported standard deviations) are:
m1.varp <- 0.5453527^2
m1.vare <- 1.084408^2
And the correlation and residual variance from the CS model are:
m2.rho <- 0.2018595
m2.var <- 1.213816^2
And they're equal to what is expected:
> m1.varp/(m1.varp+m1.vare)
[1] 0.2018594
> sqrt(m1.varp + m1.vare)
[1] 1.213816
Other correlation structures are usually not fit with random effects but simply by specifying the desired structure; one common exception is the AR(1) + random effect model, which has a random effect and AR(1) correlation between observations on the same random effect.
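For concreteness, the marginal covariance that an AR(1) + random intercept model implies between observations at times $t$ and $s$ on the same subject is $\sigma^2_\gamma + \sigma^2_\varepsilon \rho^{|t-s|}$. A small sketch, with made-up variance components, just to display the resulting matrix (Python here, for brevity):

```python
import numpy as np

sigma_g2, sigma_e2, rho = 1.0, 2.0, 0.5  # illustrative variance components
T = 4                                     # repeated measures per subject

t = np.arange(T)
# random intercept adds sigma_g2 everywhere; AR(1) errors decay with lag
cov = sigma_g2 + sigma_e2 * rho ** np.abs(t[:, None] - t[None, :])
print(cov)
```

Unlike plain compound symmetry, the off-diagonal entries are no longer constant, but they are bounded below by the random-intercept variance $\sigma^2_\gamma$.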
EDIT2: When I fit the three options, I get exactly the same results except that gls doesn't try to guess the df for the term of interest.
> summary(fm1)
...
Fixed effects: y ~ x
Value Std.Error DF t-value p-value
(Intercept) -0.5611156 0.3838423 9 -1.461839 0.1778
xx2 2.0772757 0.4849618 9 4.283380 0.0020
> summary(fm2)
...
Value Std.Error t-value p-value
(Intercept) -0.5611156 0.3838423 -1.461839 0.1610
xx2 2.0772757 0.4849618 4.283380 0.0004
> m1 <- lm(y~ x + subj, data=myDat)
> summary(m1)
...
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.3154 0.8042 -0.392 0.70403
xx2 2.0773 0.4850 4.283 0.00204 **
(The intercept is different here because with the default coding, it's not the mean of all subjects but instead the mean of the first subject.)
It's also of interest to note that the newer lme4 package gives the same results but doesn't even try to compute a p-value.
> mm1 <- lmer(y ~ x + (1|subj), data=myDat)
> summary(mm1)
...
Estimate Std. Error t value
(Intercept) -0.5611 0.3838 -1.462
xx2 2.0773 0.4850 4.283
Compound symmetry is essentially the "exchangeable" correlation structure, except with a specific decomposition of the total variance. For example, suppose you have a mixed model for $Y_{ij}$, the response of subject $i$ in cluster $j$, with only a random intercept by cluster
$$ Y_{ij} = \alpha + \gamma_{j} + \varepsilon_{ij} $$
where $\gamma_{j}$ is the cluster $j$ random effect with variance $\sigma^{2}_{\gamma}$, $\varepsilon_{ij}$ is the subject $i$ in cluster $j$ "measurement error" with variance $\sigma^{2}_{\varepsilon}$, and $\gamma_{j}$ and $\varepsilon_{ij}$ are independent. This model implicitly specifies the compound symmetry covariance matrix between observations in the same cluster:
$$ {\rm cov}(Y_{ij}, Y_{kj}) = \sigma^{2}_{\gamma} + \sigma^{2}_{\varepsilon} \cdot \mathcal{I}(k = i) $$
Note that the compound symmetry assumption implies that the correlation between distinct members of a cluster is $\sigma^{2}_{\gamma}/(\sigma^{2}_{\gamma} + \sigma^{2}_{\varepsilon})$.
In "plain english" you might say this covariance structure implies that all distinct members of a cluster are equally correlated with each other and the total variation, $\sigma^{2} = \sigma^{2}_{\gamma} + \sigma^{2}_{\varepsilon}$, can be partitioned into the "shared" (within a cluster) component, $\sigma^{2}_{\gamma}$ and the "unshared" component, $\sigma^{2}_{\varepsilon}$.
Edit: To aid understanding in the "plain english" sense, consider an example where individuals are clustered within families so that $Y_{ij}$ denotes the subject $i$ in family $j$ response. In this case the compound symmetry assumption means that the total variation in $Y_{ij}$ can be partitioned into the variation within a family, $\sigma^{2}_{\varepsilon}$, and the variation between families, $\sigma^{2}_{\gamma}$.
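This partition is easy to check by simulation. A minimal sketch (in Python; the variance values and cluster size are made up for illustration) that generates pairs of family members sharing a family effect and recovers the implied correlation:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_g, sigma_e = 1.0, 2.0    # between-family and within-family SDs (illustrative)
n_clusters, n_per = 20000, 2   # many families of two members each

gamma = rng.normal(0, sigma_g, size=(n_clusters, 1))          # shared family effect
y = gamma + rng.normal(0, sigma_e, size=(n_clusters, n_per))  # plus measurement error

# empirical correlation between the two members of each family
r = np.corrcoef(y[:, 0], y[:, 1])[0, 1]
icc = sigma_g**2 / (sigma_g**2 + sigma_e**2)  # theoretical value here: 0.2
print(round(r, 3), icc)
```

The empirical correlation r should land close to the theoretical intraclass correlation $\sigma^{2}_{\gamma}/(\sigma^{2}_{\gamma} + \sigma^{2}_{\varepsilon})$.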