Solved – How to correctly specify the within-group correlation matrix for linear mixed model in R

lme4-nlmemixed modelmultivariate analysis

I'm using the nlme package's lme function in R to fit a random-intercept, random-slope linear mixed model for some generated test data. Although the fixed effect coefficients are estimated as expected, the variance parameter estimation yields results I do not fully understand.

Specifically, the rho parameter of the compound symmetry correlation structure specified in the model statement and correlation of the random effects are not estimated, as I would expect from the data. When not fixing the correlation in corCompSymm(), the estimated rho is almost equal to default 0, and correlation of random effects is then obviously wrong (0.64 instead of -0.5).

I'm aware, that the variance-covariance parameters are estimated marginally, but is there a way to improve the estimates, given I correctly specify the correlation matrix type (CS, AR, etc.) but do not know it's parameters?

And secondly (somehow unrelated), does R allow for fitting a linear model with no random effects specified, but assumed structure for within-subject correlation matrix (analogous to SAS's PROC MIXED with no RANDOM statement)


The fitted model is:

lme(data = dat, fixed = y ~ as.factor(f) * x, random = ~ 1 + x | id, correlation = corCompSymm(fixed = FALSE))

and the outcome variable is simulated as:

y <- ifelse(f == 0, beta0[1], beta0[2]) +
ifelse(f == 0, beta1[1] * x, beta1[2] * x + b1 * x) +
b0 + b1 * x + e
, where:

  • f is a subject-specific grouping factor with 2 levels
  • x simulates ordinal variable over which the repeated measures have been taken (design is balanced)
  • beta0 and beta1 are vectors of group-specific intercept and slope, consecutively
  • e is an error-term, following multivariate normal with means 0 and compound symmetry variance-covariance matrix, having 1 diagonal and .8 off-diagonal
  • b0 and b1 are subject-specific errors for intercept, and slope consecutively, simulated to follow bivariate standard normal with covariance -0.5:

\begin{bmatrix}
1 & – 0.5\\
– 0.5 & 1
\end{bmatrix}

The estimated parameters:


Correlation Structure: Compound symmetry
Formula: ~1 | id
Parameter estimate(s):
Rho
4.260007e-06

> VarCorr(f3)
id = pdLogChol(1 + x)
Variance StdDev Corr
(Intercept) 0.5682353 0.7538139 (Intr)
x 2.5251728 1.5890792 0.64
Residual 5.0764053 2.2530879


Reproducible code:

require(reshape2)
require(nlme)
require(MASS)

set.seed(1)

n <- 1000 # number of subjects
m <- 4 # rep. measurments per subject

beta0 <- c(2, 5) # intercepts for each subject-specific factor
beta1 <- c(2, 3) # slopes for each subject-specific factor

# correlation matrix for random effects
d <- - .50
D <- matrix(c(1, d,
              d, 1), nrow = 2, byrow = T)

# correlation matrix for error terms
r <- .8
R <- c(1, r, r, r,
       r, 1, r, r,
       r, r, 1, r,
       r, r, r, 1) * 5
R <- matrix(R, nrow = sqrt(length(R)))
R <- R[1:m, 1:m]

dat <- data.frame(id = 1:n)

dat$f <- sample(1:length(beta0) - 1, n, replace = TRUE) # randomly assign a factor for each subject

# assign subject-specific random slopes and intercepts
dat <- cbind(dat, 
             setNames(data.frame(
               mvrnorm(n, mu = rep(0, nrow(D)), Sigma = D)), 
               c('b0', 'b1'))) 

dat <- dat[rep(1:n, each = m), ]
dat$x <- rep(1:m, n)

# error term
dat$e <- as.vector(t(mvrnorm(n, mu = rep(0, nrow(R)), Sigma = R))) # 

# generate outcome
dat$y <- with(dat, 
              ifelse(f == 0, beta0[1], beta0[2]) +
                ifelse(f == 0, beta1[1] * x, beta1[2] * x + b1 * x) +
                b0 + b1 * x + e)

# lmm with assumed compound symmetry and correlation not fixed
summary(f3 <- lme(data = dat, fixed = y ~ as.factor(f) * x, random = ~ 1 + x | id, 
                  correlation = corCompSymm(fixed = FALSE)))
VarCorr(f3)

# lmm with with fixed correlation
summary(f4 <- lme(data = dat, fixed = y ~ as.factor(f) * x, random = ~ 1 + x | id, 
                  correlation = corCompSymm(r, fixed = TRUE)))
VarCorr(f4)    

Best Answer

The answer to your second question (fit correlation structure without random effects) is to use nlme::gls() ("generalized least squares") - it allows the same set of heteroscedasticity (weights argument) and correlation (correlation argument) as lme.