You can do this somewhat by hand if by taking advantage of the lapply
functionality in R and the list-structure returned by the Amelia
multiple imputation package. Here's a quick example script.
library(Amelia)
library(lme4)
library(merTools)
library(plyr) # for collapsing estimates
Amelia
is similar to mice
so you can just substitute your variables in from the mice
call here -- this example is from a project I was working on.
a.out <- amelia(dat[sub1, varIndex], idvars = "SCH_ID",
noms = varIndex[!varIndex %in% c("SCH_ID", "math12")],
m = 10)
a.out
is the imputation object, now we need to run the model on each imputed dataset. To do this, we use the lapply
function in R to repeat a function over list elements. This function applies the function -- which is the model specification -- to each dataset (d) in the list and returns the results in a list of models.
mods <- lapply(a.out$imputations,
function(d) lmer((log(wage) ~ gender + age + age_sqr +
occupation + degree + private_sector + overtime +
(1+gender|faculty), data = d)
Now we create a data.frame from that list, by simulating the values the fixed and random effects using the functions FEsim and REsim from the merTools package
imputeFEs <- ldply(mods, FEsim, nsims = 1000)
imputeREs <- ldply(mods, REsim, nsims = 1000)
The data.frames above include separate estimates for each dataset, now we need to combine them using a collapse like argument collapse
imputeREs <- ddply(imputeREs, .(X1, X2), summarize, mean = mean(mean),
median = mean(median), sd = mean(sd),
level = level[1])
imputeFEs <- ddply(imputeFEs, .(var), summarize, meanEff = mean(meanEff),
medEff = mean(medEff), sdEff = mean(sdEff))
Now we can also extract some statistics on the variance/covariance for the random effects across the imputed values. Here I have written two simple extractor functions to do this.
REsdExtract <- function(model){
out <- unlist(lapply(VarCorr(model), attr, "stddev"))
return(out)
}
REcorrExtract <- function(model){
out <- unlist(lapply(VarCorr(model), attr, "corre"))
return(min(unique(out)))
}
And now we can apply them to the models and store them as a vector:
modStats <- cbind(ldply(mods, REsdExtract), ldply(mods, REcorrExtract))
Update
The functions below will get you much closer to the output provided by arm::display
by operating on the list of lmer
or glmer
objects. Hopefully this will be incorporated into the merTools
package in the near future:
# Functions to extract standard deviation of random effects from model
REsdExtract <- function(model){
out <- unlist(lapply(VarCorr(model), attr, "stddev"))
return(out)
}
#slope intercept correlation from model
REcorrExtract <- function(model){
out <- unlist(lapply(VarCorr(model), attr, "corre"))
return(min(unique(out)))
}
modelRandEffStats <- function(modList){
SDs <- ldply(modList, REsdExtract)
corrs <- ldply(modList, REcorrExtract)
tmp <- cbind(SDs, corrs)
names(tmp) <- c("Imp", "Int", "Slope", "id", "Corr")
out <- data.frame(IntSD_mean = mean(tmp$Int),
SlopeSD_mean = mean(tmp$Slope),
Corr_mean = mean(tmp$Corr),
IntSD_sd = sd(tmp$Int),
SlopeSD_sd = sd(tmp$Slope),
Corr_sd = sd(tmp$Corr))
return(out)
}
modelFixedEff <- function(modList){
require(broom)
fixEst <- ldply(modList, tidy, effects = "fixed")
# Collapse
out <- ddply(fixEst, .(term), summarize,
estimate = mean(estimate),
std.error = mean(std.error))
out$statistic <- out$estimate / out$std.error
return(out)
}
print.merModList <- function(modList, digits = 3){
len <- length(modList)
form <- modList[[1]]@call
print(form)
cat("\nFixed Effects:\n")
fedat <- modelFixedEff(modList)
dimnames(fedat)[[1]] <- fedat$term
pfround(fedat[-1, -1], digits)
cat("\nError Terms Random Effect Std. Devs\n")
cat("and covariances:\n")
cat("\n")
ngrps <- length(VarCorr(modmathG[[1]]))
errorList <- vector(mode = 'list', length = ngrps)
corrList <- vector(mode = 'list', length = ngrps)
for(i in 1:ngrps){
subList <- lapply(modList, function(x) VarCorr(x)[[i]])
subList <- apply(simplify2array(subList), 1:2, mean)
errorList[[i]] <- subList
subList <- lapply(modList, function(x) attr(VarCorr(x)[[i]], "corre"))
subList <- min(unique(apply(simplify2array(subList), 1:2, function(x) mean(x))))
corrList[[i]] <- subList
}
errorList <- lapply(errorList, function(x) {
diag(x) <- sqrt(diag(x))
return(x)
})
lapply(errorList, pfround, digits)
cat("\nError Term Correlations:\n")
lapply(corrList, pfround, digits)
residError <- mean(unlist(lapply(modList, function(x) attr(VarCorr(x), "sc"))))
cat("\nResidual Error =", fround(residError,
digits), "\n")
cat("\n---Groups\n")
ngrps <- lapply(modList[[1]]@flist, function(x) length(levels(x)))
modn <- getME(modList[[1]], "devcomp")$dims["n"]
cat(sprintf("number of obs: %d, groups: ", modn))
cat(paste(paste(names(ngrps), ngrps, sep = ", "),
collapse = "; "))
cat("\n")
cat("\nModel Fit Stats")
mAIC <- mean(unlist(lapply(modList, AIC)))
cat(sprintf("\nAIC = %g", round(mAIC, 1)))
moDsigma.hat <- mean(unlist(lapply(modmathG, sigma)))
cat("\nOverdispersion parameter =", fround(moDsigma.hat,
digits), "\n")
}
Best Answer
It would help a lot to have a reproducible example, but I'll try.
It's correct:
sigma(model)
andsummary(model)$sigma
are identical, the former is a little more compact but not different. (Why do you think it's incorrect?)This just means that
sigma()
was established as a generic function in base R, so the generic function is no longer defined in thelme4
package. Butsigma(fitted_model)
still works the same.What are you trying to do? To reverse your transformation
y=log(x+0.001)
you needx=exp(y)-0.001
. The reason that people sometimes add insigma^2/2
is because analyzing on the log-transformed scale will give a biased estimate of the (arithmetic) mean on the original scale: see e.g. here. My opinion is that this isn't necessarily worth doing unless you're specifically interested in predicting the mean (rather than, say, the median) on the original scale, but it's worth keeping in mind.You shouldn't need to. When in doubt, the documentation of the current version of the package you have installed (
?predict.merMod
) gives the authoritative answer. Do you have a reason to believe thatre.form=NULL
is not doing what you want? (Hint: if you have an overly complex model, or too few levels in your grouping variables, you may have estimated random effects variances of zero, in which case it will look like your random effects aren't doing anything ... checkVarCorr(fitted_model)
, and note that variances are sometimes estimated as being very small (e.g.1e-06
) rather than exactly zero ...)Note that this question might be more appropriate for
r-sig-mixed-models@r-project.org
than CrossValidated (go to the info page to subscribe)