Solved – How to pool c-statistic/AUROC (or any bounded variable) after using multiple imputation techniques

multiple-imputation, pooling

I am conducting a study in which I want to predict a dichotomous outcome (poor outcome yes/no) for patients in a hospital setting. Specifically, I want to compare how different summary measures for the first week of admission affect the models' discrimination, as measured by the c-index (also known as the area under the receiver operating characteristic curve, or AUROC).

As usually happens in clinical studies, however, I have missing data on both predictor and outcome variables. I have decided to address this with multiple imputation, and have created 50 imputed datasets using the 'mice' package in R.

Using the appropriate functions, I can obtain the c-statistic with its confidence interval (and variance) for each imputed dataset.
Using 'plain' Rubin's rules for pooling normally distributed estimates, I would now average the point estimates and add the between-imputation variance to the average within-imputation variance to obtain the total variance.
Here is my problem: I am unsure whether I can treat the 50 c-indices as normally distributed when calculating the pooled point estimate and the variance needed for a proper confidence interval.
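
For reference, the 'plain' pooling step described above can be sketched as follows. This is a minimal illustration in Python rather than R, using only the standard library; the function name `pool_rubin` and the example numbers are hypothetical, and the degrees of freedom are the classical large-sample version without any small-sample adjustment.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the squared standard error of each estimate
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b    # total variance
    if b == 0:
        df = math.inf              # no between-imputation variability
    else:
        # Classical degrees of freedom for the t reference distribution
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    return q_bar, t, df


# Hypothetical c-indices and variances from three imputed datasets
pooled, total_var, df = pool_rubin([0.70, 0.72, 0.74], [0.0004, 0.0004, 0.0004])
print(pooled, math.sqrt(total_var), df)
```

A confidence interval would then be the pooled estimate plus or minus a t quantile (with `df` degrees of freedom) times the square root of the total variance. On the raw scale nothing prevents that interval from extending beyond 1, which is the concern raised here.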

I have tried searching for an answer, but I only found the following three suggestions used in (slightly) different situations:

  1. pool assuming a normal distribution anyway (often applied to other statistics that are bounded or definitely not normally distributed);
  2. look at the distribution of the statistic over all imputed datasets and take the median c-index as the point estimate, using the 2.5th and 97.5th percentiles as the lower and upper bounds of a 95% confidence interval;
  3. transform all c-indices and variances to an unbounded scale, pool the transformed values assuming a normal distribution, and finally transform back to the bounded c-index scale (as suggested for the observed:expected ratio, using a log transformation, in Siregar S – Eur J Cardiothorac Surg 2012). For the c-index, which is bounded on $[0, 1]$, this could be done with a logit transformation.
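
Suggestion 3 can be sketched as follows, assuming the per-imputation variances are carried to the logit scale with the delta method, $\operatorname{Var}(\operatorname{logit} c) \approx \operatorname{Var}(c) / [c(1-c)]^2$. The function name `pool_auc_logit` and the example values are hypothetical. For brevity the interval uses a normal quantile; a fuller implementation would use the t reference distribution from Rubin's rules.

```python
import math
from statistics import NormalDist, mean, variance


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


def pool_auc_logit(aucs, auc_variances, level=0.95):
    """Pool c-indices on the logit scale and back-transform.

    aucs: one c-index per imputed dataset, each strictly between 0 and 1
    auc_variances: squared standard error of each c-index (raw scale)
    Returns (pooled c-index, lower bound, upper bound).
    """
    m = len(aucs)
    z = [logit(c) for c in aucs]
    # Delta method: variance of logit(c) is approximately var(c) / (c * (1 - c))^2
    u = [v / (c * (1 - c)) ** 2 for c, v in zip(aucs, auc_variances)]
    z_bar = mean(z)
    total = mean(u) + (1 + 1 / m) * variance(z)   # Rubin's total variance
    # Simplification: normal quantile instead of a t quantile
    half_width = NormalDist().inv_cdf(0.5 + level / 2) * math.sqrt(total)
    return expit(z_bar), expit(z_bar - half_width), expit(z_bar + half_width)


# Hypothetical c-indices and variances from three imputed datasets
print(pool_auc_logit([0.70, 0.75, 0.80], [0.0009, 0.0009, 0.0009]))
```

Because the interval is built on the unbounded scale and then mapped back, its limits always stay inside $(0, 1)$, and it is generally asymmetric around the pooled estimate.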

Any help would be greatly appreciated.

Best Answer

The c-index is a useful measure of predictive discrimination because it is easy to interpret and at least moderately sensitive. It is not a full-information proper accuracy scoring rule. It is not sensitive enough for comparing two models. So I suggest you obtain the best model using all the partial information available (e.g., multiple imputation with the number of imputations being at least the percentage of records that are incomplete), then attempt to quantify the value of that single model. That is easier said than done, but you can start with the overall Wald statistic for the global null hypothesis that none of the predictors are associated with $Y$. There are a few papers showing how to derive a unitless discrimination index from the Wald $\chi^2$ statistic. Also take a quick look at the $g$-index in my Regression Modeling Strategies book and notes.
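
A minimal sketch of the quantity behind the $g$-index, assuming it is taken as Gini's mean difference of the fitted model's linear predictor (the predicted log odds in a logistic model), that is, the average absolute difference in predicted values between two randomly chosen subjects. The function name and the example values are hypothetical, and this is written in Python with the standard library only; it is not the rms implementation.

```python
def gini_mean_difference(values):
    """Average absolute difference over all distinct pairs of values."""
    x = sorted(values)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two values")
    # For sorted x (0-based i): sum over pairs of |x_j - x_i| = sum_i (2i - n + 1) * x_i
    pair_sum = sum((2 * i - n + 1) * xi for i, xi in enumerate(x))
    return 2 * pair_sum / (n * (n - 1))


# Hypothetical linear-predictor values (log odds) for three patients
print(gini_mean_difference([0.0, 1.0, 3.0]))
```

On the log-odds scale, a larger value means the model spreads patients further apart in predicted risk.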