You can do this somewhat by hand if by taking advantage of the lapply
functionality in R and the list-structure returned by the Amelia
multiple imputation package. Here's a quick example script.
library(Amelia)
library(lme4)
library(merTools)
library(plyr) # for collapsing estimates
Amelia
is similar to mice
so you can just substitute your variables in from the mice
call here -- this example is from a project I was working on.
a.out <- amelia(dat[sub1, varIndex], idvars = "SCH_ID",
noms = varIndex[!varIndex %in% c("SCH_ID", "math12")],
m = 10)
a.out
is the imputation object, now we need to run the model on each imputed dataset. To do this, we use the lapply
function in R to repeat a function over list elements. This function applies the function -- which is the model specification -- to each dataset (d) in the list and returns the results in a list of models.
mods <- lapply(a.out$imputations,
function(d) lmer((log(wage) ~ gender + age + age_sqr +
occupation + degree + private_sector + overtime +
(1+gender|faculty), data = d)
Now we create a data.frame from that list, by simulating the values the fixed and random effects using the functions FEsim and REsim from the merTools package
imputeFEs <- ldply(mods, FEsim, nsims = 1000)
imputeREs <- ldply(mods, REsim, nsims = 1000)
The data.frames above include separate estimates for each dataset, now we need to combine them using a collapse like argument collapse
imputeREs <- ddply(imputeREs, .(X1, X2), summarize, mean = mean(mean),
median = mean(median), sd = mean(sd),
level = level[1])
imputeFEs <- ddply(imputeFEs, .(var), summarize, meanEff = mean(meanEff),
medEff = mean(medEff), sdEff = mean(sdEff))
Now we can also extract some statistics on the variance/covariance for the random effects across the imputed values. Here I have written two simple extractor functions to do this.
REsdExtract <- function(model){
out <- unlist(lapply(VarCorr(model), attr, "stddev"))
return(out)
}
REcorrExtract <- function(model){
out <- unlist(lapply(VarCorr(model), attr, "corre"))
return(min(unique(out)))
}
And now we can apply them to the models and store them as a vector:
modStats <- cbind(ldply(mods, REsdExtract), ldply(mods, REcorrExtract))
Update
The functions below will get you much closer to the output provided by arm::display
by operating on the list of lmer
or glmer
objects. Hopefully this will be incorporated into the merTools
package in the near future:
# Functions to extract standard deviation of random effects from model
REsdExtract <- function(model){
out <- unlist(lapply(VarCorr(model), attr, "stddev"))
return(out)
}
#slope intercept correlation from model
REcorrExtract <- function(model){
out <- unlist(lapply(VarCorr(model), attr, "corre"))
return(min(unique(out)))
}
modelRandEffStats <- function(modList){
SDs <- ldply(modList, REsdExtract)
corrs <- ldply(modList, REcorrExtract)
tmp <- cbind(SDs, corrs)
names(tmp) <- c("Imp", "Int", "Slope", "id", "Corr")
out <- data.frame(IntSD_mean = mean(tmp$Int),
SlopeSD_mean = mean(tmp$Slope),
Corr_mean = mean(tmp$Corr),
IntSD_sd = sd(tmp$Int),
SlopeSD_sd = sd(tmp$Slope),
Corr_sd = sd(tmp$Corr))
return(out)
}
modelFixedEff <- function(modList){
require(broom)
fixEst <- ldply(modList, tidy, effects = "fixed")
# Collapse
out <- ddply(fixEst, .(term), summarize,
estimate = mean(estimate),
std.error = mean(std.error))
out$statistic <- out$estimate / out$std.error
return(out)
}
print.merModList <- function(modList, digits = 3){
len <- length(modList)
form <- modList[[1]]@call
print(form)
cat("\nFixed Effects:\n")
fedat <- modelFixedEff(modList)
dimnames(fedat)[[1]] <- fedat$term
pfround(fedat[-1, -1], digits)
cat("\nError Terms Random Effect Std. Devs\n")
cat("and covariances:\n")
cat("\n")
ngrps <- length(VarCorr(modmathG[[1]]))
errorList <- vector(mode = 'list', length = ngrps)
corrList <- vector(mode = 'list', length = ngrps)
for(i in 1:ngrps){
subList <- lapply(modList, function(x) VarCorr(x)[[i]])
subList <- apply(simplify2array(subList), 1:2, mean)
errorList[[i]] <- subList
subList <- lapply(modList, function(x) attr(VarCorr(x)[[i]], "corre"))
subList <- min(unique(apply(simplify2array(subList), 1:2, function(x) mean(x))))
corrList[[i]] <- subList
}
errorList <- lapply(errorList, function(x) {
diag(x) <- sqrt(diag(x))
return(x)
})
lapply(errorList, pfround, digits)
cat("\nError Term Correlations:\n")
lapply(corrList, pfround, digits)
residError <- mean(unlist(lapply(modList, function(x) attr(VarCorr(x), "sc"))))
cat("\nResidual Error =", fround(residError,
digits), "\n")
cat("\n---Groups\n")
ngrps <- lapply(modList[[1]]@flist, function(x) length(levels(x)))
modn <- getME(modList[[1]], "devcomp")$dims["n"]
cat(sprintf("number of obs: %d, groups: ", modn))
cat(paste(paste(names(ngrps), ngrps, sep = ", "),
collapse = "; "))
cat("\n")
cat("\nModel Fit Stats")
mAIC <- mean(unlist(lapply(modList, AIC)))
cat(sprintf("\nAIC = %g", round(mAIC, 1)))
moDsigma.hat <- mean(unlist(lapply(modmathG, sigma)))
cat("\nOverdispersion parameter =", fround(moDsigma.hat,
digits), "\n")
}
The general principle here is that it only makes sense/is only possible to estimate within-level variation for factors that actually vary within that level in the course of the experiment/observation period. Since happiness might vary within individuals across trials, but race can't, the maximal model you can fit would be
lmer(liking~happiness*race + (happiness|participant), data)
In other words, the effect of happiness on liking may vary across individuals; you will get estimates for
- (fixed) average effect of happiness on liking for the "baseline" race (whichever is the first level of your factor)
- (fixed) average effect of race (differences in liking from the baseline race) on liking for happiness zero (you might want to center happiness so that the 'zero' level of happiness is a more meaningful level [e.g. a baseline level of 4, or a baseline level of the mean happiness across the study population]
- (fixed) race-happiness interaction (the average difference in the happiness-liking slope between the baseline and other races)
- (random) among-individual intercept variation (difference in the expected liking of an individual at the baseline happiness from the expected liking for an individual of their race at the baseline happiness)
- (random) among-individual happiness-slope variation (difference in the expected effect of 1 unit of liking on happiness from the average effect for an individual of their race)
You could change the interpretation of the fixed factors slightly by changing contrasts.
To stretch a point a little bit, it might be possible in principle that the effect of race, or the happiness by race interaction, could vary across individuals, but you can't measure it. (This discussion would make more sense if considered in terms of a characteristic that is more likely to vary within individuals over some reasonable time scale but doesn't vary within individuals within the scope of the experiment.)
Best Answer
Here I fit an example model using the built-in sleepstudy dataset.
These are the confidence intervals (on the right, I list what each interval pertains to):