Solved – How does the mice imputation function work

Tags: bayesian, data-imputation, mice, r, regression

I was wondering if anyone has experience using the mice function, as described in "mice: Multivariate Imputation by Chained Equations in R" (Journal of Statistical Software, 2011, 45(3)). I have a dataset with a number of variables, each with a different amount of missing data.

My primary question is: if I use Bayesian linear regression to impute missing data, does mice automatically pick predictor variables in order from most significant to least significant? Also, is it common to average all the imputed datasets?

Best Answer

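By default, mice uses every other variable in your dataset as a predictor when imputing a given variable. It does not rank or select predictors by significance. Which variables predict which is controlled by the predictorMatrix argument, so you can restrict the predictors yourself if you want to.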
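To see which predictors are used, you can inspect the predictor matrix before imputing. The following is a minimal sketch, assuming the nhanes example data that ships with mice rather than your own dataset. The 0.3 correlation threshold passed to quickpred is an arbitrary illustrative value, not a recommendation.

```r
library(mice)

# nhanes is a small example dataset shipped with mice (age, bmi, hyp, chl)
# maxit = 0 is a dry run: it sets up the imputation model without iterating
ini <- mice(nhanes, maxit = 0, printFlag = FALSE)
pred <- ini$predictorMatrix

# Rows are the variables being imputed, columns are the candidate predictors;
# a 1 means "this column is used to predict this row"
print(pred)

# Optional: build a sparser matrix that keeps only predictors passing a
# minimum-correlation screen (0.3 is an illustrative threshold)
pred_sel <- quickpred(nhanes, mincor = 0.3)

# method = "norm" requests Bayesian linear regression for the incomplete variables
imp <- mice(nhanes, method = "norm", predictorMatrix = pred_sel,
            m = 5, seed = 1, printFlag = FALSE)
```

Setting an entry of the matrix to 0 by hand works the same way, so you can drop a specific predictor for a specific target without using quickpred at all.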

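As for averaging, you need to do this after calculating your stats, not before. Do not average the imputed datasets into a single dataset: fit your model on each imputed dataset separately and then combine the results. For instance, if you want to do a linear regression, you'd do something like this: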

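library(mice)
# Impute; by default mice creates m = 5 completed datasets
mi <- mice(dataset)
# Fit the model on each imputed dataset (glm defaults to gaussian, i.e. linear regression)
mi.reg <- with(data = mi, expr = glm(y ~ x + z))
# Combine the per-dataset fits using Rubin's rules
mi.reg.pool <- pool(mi.reg)
summary(mi.reg.pool)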

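The summary function will show you the pooled coefficients, which are the averages of the coefficients across the imputed datasets, together with standard errors that reflect both the uncertainty within each fit and the variation between imputations.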
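If you want to check what pooling does, you can reproduce it by hand. This is a rough sketch, again assuming the nhanes example data and an arbitrary model (chl ~ age + bmi). It also assumes a mice 3.x style summary that returns estimate and std.error columns. The pooled estimate is the mean of the per-imputation coefficients, and the pooled variance follows Rubin's rules: the average within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```r
library(mice)

m <- 5
imp <- mice(nhanes, m = m, seed = 1, printFlag = FALSE)
fit <- with(imp, lm(chl ~ age + bmi))
pooled <- pool(fit)

# One column of coefficients per imputed dataset
coefs <- sapply(fit$analyses, coef)
# Squared standard errors (within-imputation variances), same layout
within <- sapply(fit$analyses, function(f) diag(vcov(f)))

manual_est <- rowMeans(coefs)        # pooled point estimates
ubar <- rowMeans(within)             # average within-imputation variance
b <- apply(coefs, 1, var)            # between-imputation variance
manual_se <- sqrt(ubar + (1 + 1 / m) * b)

# Compare the hand-computed values with what pool() reports
cbind(manual_est, pool_est = summary(pooled)$estimate)
cbind(manual_se, pool_se = summary(pooled)$std.error)
```

The two columns in each comparison should agree, which is why averaging the datasets first is not equivalent: it would throw away the between-imputation variance term.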
