Mixed Model – How to Perform Imputation for a Zero-Inflated Negative Binomial Mixed Effects Model

missing data, mixed model, repeated measures, zero inflation

I am working with a dataset of repeated (x4) observations on 100 subjects.

The outcome is zero-inflated and the data appears to be modelled well by a mixed effects zero-inflated negative binomial model with random intercepts for subjects.

However, approximately 20% of the outcome variable is missing, so I am investigating imputation methods that can deal with this. There are 2 covariates in the analysis model and a further 6 auxiliary variables that can be used for imputation – all of which have very low levels of missingness.

I have tried the mice package in R, which supports random effects but does not support the negative binomial distribution.

The only method I have found useful so far is random hot-deck imputation, which I have tried in R (with the package StatMatch) and Stata (with the package hotdeck), both of which seem to produce reasonable results.

Is there any other package/system that can be used to impute values for this kind of model? I have experience with R and Stata, and SAS is also available to me (though I don't have any experience with it).

Best Answer

I suggest that you use predictive mean matching (PMM), the default for continuous data in mice. PMM often works well for semi-continuous data as it makes no distributional assumptions. Check whether the imputed values look plausible (in the sense that they could have been observed had they not been missing). Then, check whether the intra-class correlation of the complete cases is similar to that calculated from the imputed data. If that is the case, I would say that you're done.
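To make the two steps above concrete, here is a minimal sketch in Python (NumPy only) of what predictive mean matching does and of the ICC comparison. It is not the mice implementation: this simplified version fits one least-squares model, skips the parameter draw that mice performs, and ignores the clustering by subject. The function names `pmm_impute` and `icc_oneway`, the donor count `k=5`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def pmm_impute(y, X, k=5, rng=None):
    """Simplified predictive mean matching; missing outcomes are np.nan in y."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    obs = ~np.isnan(y)
    # Least-squares fit on the observed rows only
    beta, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    pred = X @ beta
    donors_pred, donors_y = pred[obs], y[obs]
    out = y.copy()
    for i in np.flatnonzero(~obs):
        # The k observed cases whose predicted mean is closest to this case
        nearest = np.argsort(np.abs(donors_pred - pred[i]))[:k]
        # Borrow the observed outcome of one randomly chosen donor
        out[i] = donors_y[rng.choice(nearest)]
    return out

def icc_oneway(y, subject):
    """One-way ANOVA estimate of the ICC; rows with missing y are dropped."""
    y = np.asarray(y, dtype=float)
    subject = np.asarray(subject)
    keep = ~np.isnan(y)
    y, subject = y[keep], subject[keep]
    ids, inv = np.unique(subject, return_inverse=True)
    n_i = np.bincount(inv)                      # observations per subject
    means = np.bincount(inv, weights=y) / n_i   # subject means
    N, a = len(y), len(ids)
    msb = np.sum(n_i * (means - y.mean()) ** 2) / (a - 1)
    msw = np.sum((y - means[inv]) ** 2) / (N - a)
    k0 = (N - np.sum(n_i ** 2) / N) / (a - 1)   # adjusts for unequal group sizes
    return (msb - msw) / (msb + (k0 - 1) * msw)

# Simulated stand-in for the data described in the question (assumed values)
rng = np.random.default_rng(0)
n_subj, n_rep = 100, 4
subject = np.repeat(np.arange(n_subj), n_rep)
u = rng.normal(0, 0.7, n_subj)[subject]           # random intercepts
x = rng.normal(size=(n_subj * n_rep, 2))          # two covariates
mu = np.exp(0.5 + 0.4 * x[:, 0] + u)
counts = rng.negative_binomial(2, 2 / (2 + mu))   # negative binomial with mean mu
counts[rng.random(counts.size) < 0.3] = 0         # extra (structural) zeros
y = counts.astype(float)
y[rng.random(y.size) < 0.2] = np.nan              # about 20% missing outcomes

y_imp = pmm_impute(y, x, k=5, rng=rng)
print("ICC, complete cases:", icc_oneway(y, subject))
print("ICC, after imputation:", icc_oneway(y_imp, subject))
```

Because every imputed value is copied from an observed donor, the imputations are always non-negative integers that actually occur in the data, zeros included. An imputation model that ignores the subject level may pull the ICC of the imputed data below the complete-case value, which is why the comparison is worth making.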

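BTW: Do the missing data occur only in the outcome? If that's the case, then you might try to stick to the mixed model and not do any imputations at all. A mixed model fitted by maximum likelihood uses every observed outcome and remains valid when the data are missing at random given the covariates in the model, so imputing the outcome alone adds little unless the auxiliary variables carry extra information about the missingness.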
