Log-linear (Poisson) models are part of the generalized linear model family. Look at the lme4 package, which allows for mixed-effects modeling with family=poisson().
Here is an example of its use:
> data(homerun, package="Zelig")
> with(homerun, table(homeruns, month))
        month
homeruns April August July June March May September
       0    36     36   40   26     1  33        30
       1    13     17   11   21     1  14        14
       2     0      3    3    3     0   3         6
       3     1      0    0    1     0   1         0
> library(lme4)
> mod <- glmer(homeruns ~ player + (player - 1 | month), data=homerun, family=poisson())
> summary(mod)
Generalized linear mixed model fit by the Laplace approximation
Formula: homeruns ~ player + (player - 1 | month)
   Data: homerun
   AIC   BIC logLik deviance
 305.8 324.6 -147.9    295.8
Random effects:
 Groups Name          Variance   Std.Dev.   Corr
 month  playerMcGwire 7.9688e-10 2.8229e-05
        playerSosa    6.6633e-02 2.5813e-01 0.000
Number of obs: 314, groups: month, 7

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.7949     0.1195  -6.651 2.91e-11 ***
playerSosa   -0.1252     0.2020  -0.620    0.535
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
           (Intr)
playerSosa -0.592
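(As a side note, if you want to work with the fit directly rather than read the printed summary, the fixed and random effects can be extracted with the usual lme4 accessors; this is standard usage, not part of the output above.)
fixef(mod)        # fixed-effect estimates (intercept and playerSosa)
ranef(mod)$month  # predicted month-level effects for each player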
The scale parameter (useful for checking for possible overdispersion) is available through the following slot: summary(mod)@sigma. (The equivalent for a usual GLM would be summary(glm(...))$dispersion.)
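A common rule-of-thumb check for overdispersion (a sketch, not part of the output above, and assuming a reasonably recent lme4) is to compare the Pearson chi-square to the residual degrees of freedom; a ratio well above 1 suggests overdispersion:
rp <- residuals(mod, type="pearson")  # Pearson residuals from the glmer fit
sum(rp^2) / df.residual(mod)          # should be close to 1 if there is no overdispersion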
More information about mixed-effects modeling as implemented in lme4 can be found on R-forge, in the Mixed-effects models project, or in the GLMM FAQ, as suggested by @fabians.
The gamm4 package may also be of interest, as it allows fitting generalized additive mixed models.
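As a rough illustration of the gamm4 interface (a self-contained sketch on simulated data, not part of the homerun example; the variable names are made up), it combines an mgcv-style smooth term with an lme4-style random effect:
library(gamm4)
set.seed(1)
dd <- data.frame(x = runif(200), grp = factor(rep(1:20, each = 10)))
dd$counts <- rpois(200, exp(0.5 + sin(2*pi*dd$x) + rnorm(20, sd=0.3)[dd$grp]))
gm <- gamm4(counts ~ s(x), random = ~ (1 | grp), data = dd, family = poisson)
summary(gm$gam)  # smooth-term part
summary(gm$mer)  # random-effect (lme4) part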
The equivalence of the models can be observed by calculating the correlation between two observations from the same individual, as follows:
As in your notation, let $Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$, where
$\beta_j \sim N(0, \sigma_p^2)$ and $\epsilon_{ij} \sim N(0, \sigma^2)$.
Then
$\mathrm{Cov}(Y_{ik}, Y_{jk}) = \mathrm{Cov}(\mu + \alpha_i + \beta_k + \epsilon_{ik},\ \mu + \alpha_j + \beta_k + \epsilon_{jk}) = \mathrm{Cov}(\beta_k, \beta_k) = \sigma_p^2$, because all other terms are independent or fixed,
and $\mathrm{Var}(Y_{ik}) = \mathrm{Var}(Y_{jk}) = \sigma_p^2 + \sigma^2$,
so the correlation is $\sigma_p^2/(\sigma_p^2 + \sigma^2)$.
Note, however, that the models are not quite equivalent: the random-effect model forces the correlation to be positive, while the CS model and the t-test/ANOVA model do not.
EDIT: There are two other differences as well. First, the CS and random-effect models assume normality for the random effect, but the t-test/ANOVA model does not. Second, the CS and random-effect models are fit using maximum likelihood, while the ANOVA is fit using mean squares; when everything is balanced they will agree, but not necessarily in more complex situations. Finally, I'd be wary of using F/df/p values from the various fits as measures of how much the models agree; see Doug Bates's famous screed on df's for more details. (END EDIT)
The problem with your R code is that you're not specifying the correlation structure properly. You need to use gls with the corCompSymm correlation structure.
Generate data so that there is a subject effect:
set.seed(5)
x  <- rnorm(10)
x1 <- x + rnorm(10)
x2 <- x + 1 + rnorm(10)
myDat <- data.frame(c(x1, x2), c(rep("x1", 10), rep("x2", 10)),
                    rep(paste("S", seq(1, 10), sep=""), 2))
names(myDat) <- c("y", "x", "subj")
Then here's how you'd fit the random effects and the compound symmetry models.
library(nlme)
fm1 <- lme(y ~ x, random=~1|subj, data=myDat)
fm2 <- gls(y ~ x, correlation=corCompSymm(form=~1|subj), data=myDat)
The variance components from the random-effects model (the squares of the reported standard deviations) are:
m1.varp <- 0.5453527^2
m1.vare <- 1.084408^2
And the correlation and residual variance from the CS model are:
m2.rho <- 0.2018595
m2.var <- 1.213816^2
And they're equal to what is expected:
> m1.varp/(m1.varp+m1.vare)
[1] 0.2018594
> sqrt(m1.varp + m1.vare)
[1] 1.213816
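(Rather than copying the numbers out of the printed summaries, you can extract the same quantities programmatically; a small sketch, assuming fm1 and fm2 as fitted above:)
vc <- VarCorr(fm1)                                    # variance components of the lme fit
varp <- as.numeric(vc["(Intercept)", "Variance"])     # between-subject variance
vare <- as.numeric(vc["Residual", "Variance"])        # residual variance
varp / (varp + vare)                                  # implied within-subject correlation
sqrt(varp + vare)                                     # implied total SD
coef(fm2$modelStruct$corStruct, unconstrained=FALSE)  # estimated rho from the CS fit
fm2$sigma                                             # residual SD from the CS fit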
Other correlation structures are usually not fit with random effects but simply by specifying the desired structure directly; one common exception is the AR(1) + random effect model, which combines a random effect with AR(1) correlation between observations on the same random effect.
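For concreteness, here is a rough sketch of such an AR(1) + random intercept fit on made-up longitudinal data with five time points per subject (the data and names are hypothetical, just to show the nlme syntax):
library(nlme)
set.seed(1)
longDat <- data.frame(subj = rep(paste0("S", 1:10), each = 5), time = rep(1:5, 10))
longDat$y <- rnorm(10)[as.numeric(factor(longDat$subj))] + 0.2*longDat$time + rnorm(50)
fmAR <- lme(y ~ time, random = ~ 1 | subj,
            correlation = corAR1(form = ~ time | subj), data = longDat)
summary(fmAR)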
EDIT2: When I fit the three options, I get exactly the same results except that gls doesn't try to guess the df for the term of interest.
> summary(fm1)
...
Fixed effects: y ~ x
                 Value Std.Error DF   t-value p-value
(Intercept) -0.5611156 0.3838423  9 -1.461839  0.1778
xx2          2.0772757 0.4849618  9  4.283380  0.0020
> summary(fm2)
...
                 Value Std.Error   t-value p-value
(Intercept) -0.5611156 0.3838423 -1.461839  0.1610
xx2          2.0772757 0.4849618  4.283380  0.0004
> m1 <- lm(y ~ x + subj, data=myDat)
> summary(m1)
...
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.3154     0.8042  -0.392  0.70403
xx2           2.0773     0.4850   4.283  0.00204 **
(The intercept is different here because with the default coding, it's not the mean of all subjects but instead the mean of the first subject.)
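(A small aside: if you refit the lm with sum-to-zero contrasts for subj, the intercept becomes the average over subjects at the x1 level, which in this balanced example should essentially match the lme/gls intercept above.)
m1b <- lm(y ~ x + subj, data=myDat, contrasts=list(subj=contr.sum))
coef(m1b)["(Intercept)"]   # should be close to the lme/gls intercept, about -0.56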
It's also of interest to note that the newer lme4 package gives the same results but doesn't even try to compute a p-value.
> mm1 <- lmer(y ~ x + (1|subj), data=myDat)
> summary(mm1)
...
            Estimate Std. Error t value
(Intercept)  -0.5611     0.3838  -1.462
xx2           2.0773     0.4850   4.283
Best Answer
One major benefit of mixed-effects models is that they don't assume independence amongst observations; they allow for correlated observations within a unit or cluster.
This is covered concisely in "Modern Applied Statistics with S" (MASS) in the first section of chapter 10 on "Random and Mixed Effects". V&R walk through an example with gasoline data comparing ANOVA and lme in that section, so it's a good overview. The R function to use is lme in the nlme package. The model formulation is based on Laird and Ware (1982), so you can refer to that as a primary source, although it's certainly not good as an introduction.
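For reference, here is roughly what that gasoline example looks like in code (a sketch based on my reading of V&R ch. 10, using the petrol data shipped with MASS and centred covariates; details may differ slightly from the book):
library(MASS); library(nlme)
Petrol <- petrol
Petrol[, 2:5] <- scale(Petrol[, 2:5], scale = FALSE)   # centre the covariates
pet.lme <- lme(Y ~ SG + VP + V10 + EP, random = ~ 1 | No, data = Petrol)
summary(pet.lme)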
You can also have a look at the "Linear Mixed Models" (PDF) appendix to John Fox's "An R and S-PLUS Companion to Applied Regression". And this lecture by Roger Levy (PDF) discusses mixed effects models w.r.t. a multivariate normal distribution.