Solved – R lme4 interpret MLM confidence intervals

confidence intervallme4-nlmemultilevel-analysisr

I'm using the lme4 package in R to run a multilevel model and to calculate confidence intervals:

model <- lmer(cadence ~ fd + height + (1+fd|id), data=df, REML=F)

confint.merMod(model, method = "boot", boot.type = "perc", nsim = 500, oldNames = F)

I'm trying to understand what the output is showing me:

                             2.5 %      97.5 %
sd_(Intercept)|id      0.590230649  0.92396467
cor_fd.(Intercept)|id -0.999990953  1.00000000
sd_fd|id               0.003862849  0.10373958
sigma                  0.295910811  0.35343478
(Intercept)           -0.247134271  0.27844609
fd                    -0.038814987  0.04487689
height                -0.795405634 -0.28511542

The fd and height rows are showing my the .95CI around the respective regression coefficient (fixed effects)

The ones I'm mainly struggling with are sd_(Intercept)|id, cor_fd.(Intercept)|id, and sd_fd|id. What do they represent?

My best guess is that sd_(Intercept)|id is the CI around the standard deviation of intercept values. If so, then what does sigma represent?

sd_fd|id Im guessing is the SD of all the estimated slope values for each id

Finally, I'm not sure what cor_fd.(Intercept)|id is meant to show?

Best Answer

Here I fit an example model using the built-in sleepstudy dataset.

(fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy))
#Linear mixed model fit by REML ['lmerMod']
#Formula: Reaction ~ Days + (Days | Subject)
#   Data: sleepstudy
#REML criterion at convergence: 1743.628
#Random effects:
# Groups   Name        Std.Dev. Corr
# Subject  (Intercept) 24.740       
#          Days         5.922   0.07
# Residual             25.592       
#Number of obs: 180, groups:  Subject, 18
#Fixed Effects:
#(Intercept)         Days  
#     251.41        10.47

These are the confidence intervals (on the right, I list what each interval pertains to):

confint.merMod(fm1, method = "boot", boot.type = "perc", nsim = 500, oldNames = F)
#Computing bootstrap confidence intervals ...
#                                   2.5 %      97.5 %
#sd_(Intercept)|Subject        13.1760846  35.3154169 --> random intercept SD
#cor_Days.(Intercept)|Subject  -0.5739618   0.9384943 --> correlation between random intercept and random slope
#sd_Days|Subject                3.2990883   8.1234449 --> random slope SD
#sigma                         22.6901363  28.3167813 --> residual SD
#(Intercept)                  237.9993456 265.4882890 --> fixed intercept
#Days                           7.1383003  13.6773242 --> fixed slope

Related Solutions

Solved – lmer with multiply imputed data

You can do this somewhat by hand if by taking advantage of the lapply functionality in R and the list-structure returned by the Amelia multiple imputation package. Here's a quick example script.

library(Amelia)
library(lme4)
library(merTools)
library(plyr) # for collapsing estimates

Amelia is similar to mice so you can just substitute your variables in from the mice call here -- this example is from a project I was working on.

 a.out <- amelia(dat[sub1, varIndex], idvars = "SCH_ID", 
            noms = varIndex[!varIndex %in% c("SCH_ID", "math12")], 
            m = 10)

a.out is the imputation object, now we need to run the model on each imputed dataset. To do this, we use the lapply function in R to repeat a function over list elements. This function applies the function -- which is the model specification -- to each dataset (d) in the list and returns the results in a list of models.

 mods <- lapply(a.out$imputations,
           function(d) lmer((log(wage) ~ gender + age + age_sqr + 
            occupation + degree + private_sector + overtime + 
             (1+gender|faculty), data = d)

Now we create a data.frame from that list, by simulating the values the fixed and random effects using the functions FEsim and REsim from the merTools package

imputeFEs <- ldply(mods, FEsim, nsims = 1000)
imputeREs <- ldply(mods, REsim, nsims = 1000)

The data.frames above include separate estimates for each dataset, now we need to combine them using a collapse like argument collapse

imputeREs <- ddply(imputeREs, .(X1, X2), summarize, mean = mean(mean), 
               median = mean(median), sd = mean(sd), 
               level = level[1])

imputeFEs <- ddply(imputeFEs, .(var), summarize, meanEff = mean(meanEff), 
               medEff = mean(medEff), sdEff = mean(sdEff))

Now we can also extract some statistics on the variance/covariance for the random effects across the imputed values. Here I have written two simple extractor functions to do this.

REsdExtract <- function(model){
  out <- unlist(lapply(VarCorr(model), attr, "stddev"))
  return(out)
}

REcorrExtract <- function(model){
  out <- unlist(lapply(VarCorr(model), attr, "corre"))
  return(min(unique(out)))
}

And now we can apply them to the models and store them as a vector:

modStats <- cbind(ldply(mods, REsdExtract), ldply(mods, REcorrExtract))

Update

The functions below will get you much closer to the output provided by arm::display by operating on the list of lmer or glmer objects. Hopefully this will be incorporated into the merTools package in the near future:

# Functions to extract standard deviation of random effects from model
REsdExtract <- function(model){
  out <- unlist(lapply(VarCorr(model), attr, "stddev"))
  return(out)
}

#slope intercept correlation from model
REcorrExtract <- function(model){
  out <- unlist(lapply(VarCorr(model), attr, "corre"))
  return(min(unique(out)))
}

modelRandEffStats <- function(modList){
  SDs <- ldply(modList, REsdExtract)
  corrs <- ldply(modList, REcorrExtract)
  tmp <- cbind(SDs, corrs)
  names(tmp) <- c("Imp", "Int", "Slope", "id", "Corr")
  out <- data.frame(IntSD_mean = mean(tmp$Int), 
                        SlopeSD_mean = mean(tmp$Slope), 
                    Corr_mean = mean(tmp$Corr), 
                        IntSD_sd = sd(tmp$Int),
                    SlopeSD_sd = sd(tmp$Slope), 
                        Corr_sd = sd(tmp$Corr))
  return(out)
}

modelFixedEff <- function(modList){
  require(broom)
  fixEst <- ldply(modList, tidy, effects = "fixed")
  # Collapse
  out <- ddply(fixEst, .(term), summarize,
               estimate = mean(estimate), 
               std.error = mean(std.error))
  out$statistic <- out$estimate / out$std.error
  return(out)
}

print.merModList <- function(modList, digits = 3){
  len <- length(modList)
  form <- modList[[1]]@call
  print(form)
  cat("\nFixed Effects:\n")
  fedat <- modelFixedEff(modList)
  dimnames(fedat)[[1]] <- fedat$term
  pfround(fedat[-1, -1], digits)
  cat("\nError Terms Random Effect Std. Devs\n")
  cat("and covariances:\n")
  cat("\n")
  ngrps <- length(VarCorr(modmathG[[1]]))
  errorList <- vector(mode = 'list', length = ngrps)
  corrList <- vector(mode = 'list', length = ngrps)
  for(i in 1:ngrps){
    subList <- lapply(modList, function(x) VarCorr(x)[[i]])
    subList <- apply(simplify2array(subList), 1:2, mean)
    errorList[[i]] <- subList
    subList <- lapply(modList, function(x) attr(VarCorr(x)[[i]], "corre"))
    subList <- min(unique(apply(simplify2array(subList), 1:2, function(x) mean(x))))
    corrList[[i]] <- subList
  }
  errorList <- lapply(errorList, function(x) {
    diag(x) <- sqrt(diag(x))
    return(x)
    })

  lapply(errorList, pfround, digits)
  cat("\nError Term Correlations:\n")
  lapply(corrList, pfround, digits)
  residError <- mean(unlist(lapply(modList, function(x) attr(VarCorr(x), "sc"))))
  cat("\nResidual Error =", fround(residError,
                                             digits), "\n")
  cat("\n---Groups\n")
  ngrps <- lapply(modList[[1]]@flist, function(x) length(levels(x)))
  modn <- getME(modList[[1]], "devcomp")$dims["n"]
  cat(sprintf("number of obs: %d, groups: ", modn))
  cat(paste(paste(names(ngrps), ngrps, sep = ", "),
            collapse = "; "))
  cat("\n")
  cat("\nModel Fit Stats")
  mAIC <- mean(unlist(lapply(modList, AIC)))
  cat(sprintf("\nAIC = %g", round(mAIC, 1)))
  moDsigma.hat <- mean(unlist(lapply(modmathG, sigma)))
  cat("\nOverdispersion parameter =", fround(moDsigma.hat,
                                             digits), "\n")
}

Solved – Interactions between levels in lme4

The general principle here is that it only makes sense/is only possible to estimate within-level variation for factors that actually vary within that level in the course of the experiment/observation period. Since happiness might vary within individuals across trials, but race can't, the maximal model you can fit would be

lmer(liking~happiness*race + (happiness|participant), data)

In other words, the effect of happiness on liking may vary across individuals; you will get estimates for

(fixed) average effect of happiness on liking for the "baseline" race (whichever is the first level of your factor)
(fixed) average effect of race (differences in liking from the baseline race) on liking for happiness zero (you might want to center happiness so that the 'zero' level of happiness is a more meaningful level [e.g. a baseline level of 4, or a baseline level of the mean happiness across the study population]
(fixed) race-happiness interaction (the average difference in the happiness-liking slope between the baseline and other races)
(random) among-individual intercept variation (difference in the expected liking of an individual at the baseline happiness from the expected liking for an individual of their race at the baseline happiness)
(random) among-individual happiness-slope variation (difference in the expected effect of 1 unit of liking on happiness from the average effect for an individual of their race)

You could change the interpretation of the fixed factors slightly by changing contrasts.

To stretch a point a little bit, it might be possible in principle that the effect of race, or the happiness by race interaction, could vary across individuals, but you can't measure it. (This discussion would make more sense if considered in terms of a characteristic that is more likely to vary within individuals over some reasonable time scale but doesn't vary within individuals within the scope of the experiment.)

Best Answer

Related Solutions

Solved – lmer with multiply imputed data

Solved – Interactions between levels in lme4

Related Question