Solved – How to know which imputation is best for imputing a dataset when using multiple imputation with mice

data-imputation, mice, missing-data, multiple-imputation, r

I used the mice package to impute the missing values as follows:

install.packages("mice")
library("mice")
nhanes
   age  bmi hyp chl
1    1   NA  NA  NA
2    2 22.7   1 187
3    1   NA   1 187
4    3   NA  NA  NA
5    1 20.4   1 113
6    3   NA  NA 184
7    1 22.5   1 118
8    1 30.1   1 187
9    2 22.0   1 238
10   2   NA  NA  NA
11   1   NA  NA  NA
12   2   NA  NA  NA
13   3 21.7   1 206
14   2 28.7   2 204
15   1 29.6   1  NA
16   1   NA  NA  NA
17   3 27.2   2 284
18   2 26.3   2 199
19   1 35.3   1 218
20   3 25.5   2  NA
21   1   NA  NA  NA
22   1 33.2   1 229
23   1 27.5   1 131
24   3 24.9   1  NA
25   2 27.4   1 186 
# Impute the data with mice
imp <- mice(nhanes, m = 10)       # m = 10 creates 10 imputed datasets (not 10 iterations)
fill1 <- complete(imp, 1)         # first imputed dataset
fill2 <- complete(imp, 2)         # second imputed dataset
allfill <- complete(imp, "long")  # all imputed datasets stacked together

I want to know how to choose one of the imputations (here I have m = 10 imputed datasets) as the final result for filling in the missing data. In other words, which imputation is the best one to use?

Also, what number of imputations m is reasonable, and why? Here I used m = 10:

imp <- mice(df, m = 10)  # m = 10 imputed datasets

I would also like some explanation of analysing the imputations and pooling. How can I benefit from the results of the analysis shown here?

Analyse the result

## Fit the model on each imputed dataset
fit <- with(data = imp, expr = lm(bmi ~ hyp + chl))
## Pool the results
poolFit <- pool(fit)
summary(poolFit)

Best Answer

If you want a single imputed dataset to work with, you should use single imputation instead. However, many authors recommend multiple imputation: the model is fitted to every imputed dataset and the estimates are pooled using Rubin's rules, which take into account both the between-imputation and the within-imputation variance. In that approach no single imputation is the "best" one; all m of them contribute to the final result. A rule of thumb for choosing the number of imputations is at least one imputation per percent of incomplete cases (White et al., 2011).
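
To make both points concrete, here is a minimal base-R sketch. The data frame `nhanes_df` is simply the nhanes printout from the question typed in by hand, and `pool_rubin()` is a hypothetical helper written for illustration (it is not a mice function). The three estimates and squared standard errors passed to it are made-up numbers, not results from this dataset. The helper only shows how Rubin's rules combine the variances; `pool()` in mice additionally reports quantities such as degrees of freedom.

```r
# Data typed in from the nhanes printout in the question (NA = missing).
nhanes_df <- data.frame(
  age = c(1, 2, 1, 3, 1, 3, 1, 1, 2, 2, 1, 2, 3,
          2, 1, 1, 3, 2, 1, 3, 1, 1, 1, 3, 2),
  bmi = c(NA, 22.7, NA, NA, 20.4, NA, 22.5, 30.1, 22.0, NA, NA, NA, 21.7,
          28.7, 29.6, NA, 27.2, 26.3, 35.3, 25.5, NA, 33.2, 27.5, 24.9, 27.4),
  hyp = c(NA, 1, 1, NA, 1, NA, 1, 1, 1, NA, NA, NA, 1,
          2, 1, NA, 2, 2, 1, 2, NA, 1, 1, 1, 1),
  chl = c(NA, 187, 187, NA, 113, 184, 118, 187, 238, NA, NA, NA, 206,
          204, NA, NA, 284, 199, 218, NA, NA, 229, 131, NA, 186)
)

# Rule of thumb: m should be at least the percentage of incomplete cases.
n_incomplete   <- sum(!complete.cases(nhanes_df))
pct_incomplete <- 100 * n_incomplete / nrow(nhanes_df)
m_suggested    <- ceiling(pct_incomplete)

# Hypothetical helper: Rubin's rules for one coefficient.
# estimates = the coefficient from each imputed dataset,
# variances = its squared standard error from each imputed dataset.
pool_rubin <- function(estimates, variances) {
  m       <- length(estimates)
  q_bar   <- mean(estimates)            # pooled point estimate
  within  <- mean(variances)            # within-imputation variance
  between <- var(estimates)             # between-imputation variance
  total   <- within + (1 + 1 / m) * between
  c(estimate = q_bar, within = within, between = between,
    total = total, se = sqrt(total))
}

# Made-up per-imputation results, for illustration only.
pooled <- pool_rubin(estimates = c(1.0, 1.2, 0.8),
                     variances = c(0.04, 0.05, 0.03))

print(c(n_incomplete = n_incomplete, pct_incomplete = pct_incomplete,
        m_suggested = m_suggested))
print(pooled)
```

With the data above, 12 of the 25 rows have at least one missing value, so the rule of thumb points to roughly 48 imputations rather than 10. The pooled standard error is larger than the within-imputation part alone, because the spread of the estimates across imputations reflects the extra uncertainty caused by the missing data.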
