The equivalence of the models can be observed by calculating the correlation between two observations from the same individual, as follows:
As in your notation, let $Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$, where
$\beta_j \sim N(0, \sigma_p^2)$ and $\epsilon_{ij} \sim N(0, \sigma^2)$.
Then
$\mathrm{Cov}(Y_{ik}, Y_{jk}) = \mathrm{Cov}(\mu + \alpha_i + \beta_k + \epsilon_{ik},\ \mu + \alpha_j + \beta_k + \epsilon_{jk}) = \mathrm{Cov}(\beta_k, \beta_k) = \sigma_p^2$, because all the other terms are either fixed or independent of each other (here $k$ indexes the individual and $i \ne j$ index the two conditions),
and $\mathrm{Var}(Y_{ik}) = \mathrm{Var}(Y_{jk}) = \sigma_p^2 + \sigma^2$,
so the correlation is $\sigma_p^2/(\sigma_p^2 + \sigma^2)$.
Note, however, that the models are not quite equivalent: the random effect model forces the correlation to be non-negative, since $\sigma_p^2/(\sigma_p^2 + \sigma^2) \ge 0$, while the CS model and the t-test/anova model allow it to be negative.
EDIT: There are two other differences as well. First, the CS and random effects models assume normality for the random effect, while the t-test/anova model does not. Second, the CS and random effects models are fit by maximum likelihood, while the anova is fit using mean squares; when everything is balanced they will agree, but not necessarily in more complex situations. Finally, I'd be wary of using the F/df/p values from the various fits as measures of how much the models agree; see Doug Bates's famous screed on df's for more details. (END EDIT)
The problem with your R code is that you're not specifying the correlation structure properly. You need to use gls with the corCompSymm correlation structure.
Generate data so that there is a subject effect:
set.seed(5)
x <- rnorm(10)            # subject effects
x1 <- x + rnorm(10)       # condition 1
x2 <- x + 1 + rnorm(10)   # condition 2, shifted up by 1
myDat <- data.frame(y = c(x1, x2),
                    x = rep(c("x1", "x2"), each = 10),
                    subj = rep(paste("S", seq(1, 10), sep = ""), 2))
Then here's how you'd fit the random effects and the compound symmetry models.
library(nlme)
fm1 <- lme(y ~ x, random=~1|subj, data=myDat)
fm2 <- gls(y ~ x, correlation=corCompSymm(form=~1|subj), data=myDat)
The estimated standard deviations from the random effects model give the variance components:
m1.varp <- 0.5453527^2
m1.vare <- 1.084408^2
And the correlation and residual variance from the CS model are:
m2.rho <- 0.2018595
m2.var <- 1.213816^2
And they're equal to what is expected:
> m1.varp/(m1.varp+m1.vare)
[1] 0.2018594
> sqrt(m1.varp + m1.vare)
[1] 1.213816
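Rather than retyping the printed estimates, the same comparison can be done programmatically. A minimal sketch (the data-generation and fitting steps from above are repeated so the block is self-contained):

```r
library(nlme)

# rebuild the example data from above
set.seed(5)
x  <- rnorm(10)
x1 <- x + rnorm(10)
x2 <- x + 1 + rnorm(10)
myDat <- data.frame(y = c(x1, x2),
                    x = rep(c("x1", "x2"), each = 10),
                    subj = rep(paste("S", seq(1, 10), sep = ""), 2))

fm1 <- lme(y ~ x, random = ~ 1 | subj, data = myDat)
fm2 <- gls(y ~ x, correlation = corCompSymm(form = ~ 1 | subj), data = myDat)

# variance components from the random effects fit
vc   <- VarCorr(fm1)   # a character matrix with Variance and StdDev columns
varp <- as.numeric(vc["(Intercept)", "Variance"])
vare <- as.numeric(vc["Residual",   "Variance"])

varp / (varp + vare)   # implied within-subject correlation
sqrt(varp + vare)      # implied total SD

# the corresponding quantities from the CS fit
coef(fm2$modelStruct$corStruct, unconstrained = FALSE)  # rho
fm2$sigma                                               # residual SD
```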
Other correlation structures are usually fit directly by specifying the desired structure, not through random effects; one common exception is the AR(1) + random effect model, which combines a random effect with AR(1) correlation among the observations sharing that random effect.
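As a sketch of that exception (the simulated data, AR parameter, and sample sizes here are made up purely for illustration):

```r
library(nlme)

# simulate: a subject random effect plus AR(1) noise within each subject
set.seed(1)
nsubj <- 10; ntime <- 5
d <- expand.grid(time = 1:ntime, subj = factor(1:nsubj))
d$y <- rep(rnorm(nsubj), each = ntime) +
  as.vector(replicate(nsubj, as.numeric(arima.sim(list(ar = 0.5), n = ntime))))

# random intercept per subject, AR(1) correlation among that subject's residuals
fm.ar1 <- lme(y ~ time, random = ~ 1 | subj,
              correlation = corAR1(form = ~ time | subj), data = d)

coef(fm.ar1$modelStruct$corStruct, unconstrained = FALSE)  # estimated phi
```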
EDIT2: When I fit the three options, I get exactly the same results except that gls doesn't try to guess the df for the term of interest.
> summary(fm1)
...
Fixed effects: y ~ x
Value Std.Error DF t-value p-value
(Intercept) -0.5611156 0.3838423 9 -1.461839 0.1778
xx2 2.0772757 0.4849618 9 4.283380 0.0020
> summary(fm2)
...
Value Std.Error t-value p-value
(Intercept) -0.5611156 0.3838423 -1.461839 0.1610
xx2 2.0772757 0.4849618 4.283380 0.0004
> m1 <- lm(y~ x + subj, data=myDat)
> summary(m1)
...
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.3154 0.8042 -0.392 0.70403
xx2 2.0773 0.4850 4.283 0.00204 **
(The intercept is different here because, with the default treatment coding, it is not the mean over all subjects but the mean for the first subject.)
It's also of interest to note that the newer lme4 package gives the same results but doesn't even try to compute a p-value.
> library(lme4)
> mm1 <- lmer(y ~ x + (1|subj), data=myDat)
> summary(mm1)
...
Estimate Std. Error t value
(Intercept) -0.5611 0.3838 -1.462
xx2 2.0773 0.4850 4.283
Best Answer
They are the fitted values at different levels of grouping. For example, if you fit
you can get the predicted overall outcome by
and it is 2000.153. The predicted outcomes for each Lot are:
and there are eight outcomes, as there are eight Lots. The predicted outcomes for each Wafer, which are nested within Lots, are:
If you run
you get the predicted outcomes for each level of grouping, i.e. fmOxide$fixed.
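For concreteness, here is a sketch using the Oxide data shipped with nlme; the one-intercept model below is an assumption about the fit being discussed. Oxide is a groupedData object with grouping formula Thickness ~ 1 | Lot/Wafer, so lme picks up the nesting automatically:

```r
library(nlme)

# Oxide: oxide layer thickness on wafers nested within lots
fmOxide <- lme(Thickness ~ 1, data = Oxide)  # random = ~ 1 | Lot/Wafer implied

fixef(fmOxide)               # the single fixed effect: the overall predicted outcome
fitted(fmOxide, level = 0)   # that overall value, repeated for every observation
fitted(fmOxide, level = 1)   # one predicted value per Lot
fitted(fmOxide, level = 2)   # one predicted value per Wafer within Lot
fitted(fmOxide, level = 0:2) # all three levels of grouping side by side
```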