Solved – How to combine/pool binomial confidence intervals after multiple imputation

binomial distribution, confidence interval, mice, multiple-imputation

After multiply imputing my dataset m times, I want to calculate a binomial proportion confidence interval. How can I combine the confidence interval estimates from the imputed datasets in accordance with Rubin's rules?

Best Answer

This is indeed an interesting problem. The issue is that standard errors based on the central limit theorem are often undesirable for proportions: a proportion is a computed quantity bounded between 0 and 1, so its sampling uncertainty is skewed. The Wilson score interval, which you mentioned, gets around the skewness by centring the interval on a different quantity than the raw proportion $k/n$. To use Rubin's rules you need two things from each imputed dataset: the transformed proportion itself, and an estimate of its within-imputation variance, which is just the variance (squared standard error) estimated on that single dataset.

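So for the Wilson score interval, you first need to calculate the transformed estimate $\hat{p} + \frac{z^2}{2n}$ and then, separately, the variance, which from your formula is $\left(z\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}\right)^2$. Note that the standard Wilson interval also divides both terms by $1 + \frac{z^2}{n}$, and that the squared expression is the squared half-width of the interval, so it already contains the factor $z^2$; divide it by $z^2$ if your pooling tool expects a plain variance.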
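As a rough illustration of this per-dataset step, here is a minimal Python sketch using only the standard library. The helper name `wilson_parts` and the counts in `imputed_counts` are made up for illustration. It also rests on two assumptions that go slightly beyond the expressions above: the centre is divided by the $1 + z^2/n$ factor of the standard Wilson formula, and the variance is returned without the $z^2$ factor, so that $z$ is applied only once when the interval is formed.

```python
from math import sqrt
from statistics import NormalDist


def wilson_parts(k, n, level=0.95):
    """Return (centre, variance) of the Wilson score interval for k successes in n trials.

    Assumption: centre and standard error are both scaled by 1 / (1 + z^2 / n),
    so that centre +/- z * sqrt(variance) is the usual Wilson interval.
    """
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    p_hat = k / n
    shrink = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / shrink
    se = sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / shrink
    return centre, se**2


# Hypothetical (successes, trials) pairs, one per imputed dataset.
imputed_counts = [(8, 20), (9, 20), (7, 20)]
parts = [wilson_parts(k, n) for k, n in imputed_counts]
for centre, var in parts:
    print(f"centre={centre:.4f}  variance={var:.6f}")
```

Each `(centre, variance)` pair is what would be handed to the pooling step, one pair per imputed dataset.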

That will give you estimates of the transformed parameter and the transformed parameter's variance for each of $m$ datasets.
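For reference, Rubin's rules for a single scalar quantity can be written as follows. The symbols $Q_j$ and $U_j$ (the transformed estimate and its variance from imputed dataset $j$) are notation introduced here for illustration, not taken from the question.

$$
\bar{Q} = \frac{1}{m}\sum_{j=1}^{m} Q_j, \qquad
\bar{U} = \frac{1}{m}\sum_{j=1}^{m} U_j, \qquad
B = \frac{1}{m-1}\sum_{j=1}^{m} \left(Q_j - \bar{Q}\right)^2, \qquad
T = \bar{U} + \left(1 + \frac{1}{m}\right) B
$$

Here $\bar{U}$ is the within-imputation variance, $B$ the between-imputation variance, and $T$ the total variance. The pooled interval is $\bar{Q} \pm t_{\nu}\sqrt{T}$, with $\nu = (m-1)\left(1 + \frac{\bar{U}}{(1 + 1/m)\,B}\right)^2$ degrees of freedom.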

You can then combine these estimates using one of the available R tools, such as mi.meld from Amelia, mice (which you mentioned), or the mitools package. Once you have the pooled estimate and its pooled variance, you can compute the confidence interval from them.
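To make the pooling step concrete, here is a hand-rolled Python sketch of Rubin's rules for a single scalar. It is a stand-in for what the R tools do, not a reproduction of any of their interfaces. The function name `pool_rubin` and the input numbers are hypothetical. One simplification to be aware of: Rubin's rules use a t reference distribution with the degrees of freedom returned below, whereas this sketch uses a normal quantile to stay within the standard library, which is only a reasonable approximation when the degrees of freedom are large.

```python
from math import inf, sqrt
from statistics import NormalDist, fmean, variance


def pool_rubin(estimates, variances, level=0.95):
    """Pool per-dataset estimates and variances with Rubin's rules.

    Returns (pooled estimate, total variance, degrees of freedom, (lower, upper)).
    Assumption: the interval uses a normal quantile instead of the t quantile.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two datasets and matching lengths")
    q_bar = fmean(estimates)          # pooled estimate
    u_bar = fmean(variances)          # within-imputation variance
    b = variance(estimates)           # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b   # total variance
    df = inf if b == 0 else (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * sqrt(total)
    # Clip to [0, 1] because the pooled quantity is a proportion.
    return q_bar, total, df, (max(0.0, q_bar - half), min(1.0, q_bar + half))


# Hypothetical transformed estimates and variances from m = 3 imputed datasets.
est = [0.40, 0.42, 0.44]
var = [0.010, 0.010, 0.010]
q_bar, total, df, (lo, hi) = pool_rubin(est, var)
print(f"pooled={q_bar:.4f}  T={total:.6f}  df={df:.1f}  CI=({lo:.4f}, {hi:.4f})")
```

With few imputations and noticeable between-imputation variance, the returned degrees of freedom can be small, in which case a t quantile would give a wider interval than the one printed here.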

This would be easier if these R packages supplied the transformed estimates instead of just the confidence intervals, but you can probably dig them out of the associated R code.
