Survival Analysis – Drawing Survival Curves for Two Groups After Multiple Imputation


How can I draw Kaplan-Meier survival curves when the survival variables are fully observed but the stratification covariate has missing values?
For example, we know the follow-up time and the status at the end of follow-up for every patient, but the sex variable has missing data, and we need to plot separate survival curves for men and women.

How to proceed in this case if we use multiple imputation for the stratification covariate?
Should we combine the survival curve of each group using Rubin's rules with log-log transformation?
Note that an individual may belong to group 1 in one imputed data set and to group 2 in the next, while his or her survival time and status stay the same.

Does anyone know a reference on this particular problem? Perhaps Stef van Buuren's book?

(By the way, I am working in R.)

Thanks a lot

Best Answer

This Cross Validated page addresses a similar situation. The idea is to work with a quantity that is close to normally distributed, pool it across the imputed data sets with Rubin's rules to get a mean and an error estimate, then back-transform to the desired scale for display. As David Luke Thiessen notes in a comment, Stef van Buuren recommends pooling survival probabilities on the complementary log-log scale.
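
As a rough sketch of that recipe (not code from the linked page or from van Buuren's book), the pooling at a single time point for a single group could look like the following. It is written in Python with only the standard library to keep the arithmetic visible; the function name `pool_survival`, the example numbers, and the use of a normal quantile (1.96) instead of Rubin's t reference distribution are all assumptions made for illustration. It also assumes every per-imputation survival estimate lies strictly between 0 and 1, and that the standard errors are on the survival scale, converted to the complementary log-log scale by the delta method.

```python
import math
from statistics import mean, variance


def pool_survival(surv, se, z=1.96):
    """Pool Kaplan-Meier estimates from m imputed data sets (hypothetical helper).

    surv: survival probabilities S_i at one time point, one per imputation
    se:   their standard errors on the survival scale
    z:    normal quantile for the interval (simplification; Rubin's rules
          would normally use a t reference distribution)
    Returns (pooled estimate, lower bound, upper bound) on the survival scale.
    """
    m = len(surv)
    if m < 2 or len(se) != m:
        raise ValueError("need at least two imputations and matching lengths")
    if any(not 0.0 < s < 1.0 for s in surv):
        raise ValueError("survival estimates must lie strictly between 0 and 1")

    # Complementary log-log transform of each estimate.
    q = [math.log(-math.log(s)) for s in surv]
    # Delta-method variance on the transformed scale: (se / (S * log S))^2.
    u = [(e / (s * math.log(s))) ** 2 for s, e in zip(surv, se)]

    q_bar = mean(q)                      # pooled point estimate
    u_bar = mean(u)                      # within-imputation variance
    b = variance(q)                      # between-imputation variance (m - 1 denominator)
    total = u_bar + (1.0 + 1.0 / m) * b  # Rubin's total variance

    half_width = z * math.sqrt(total)

    def back(x):
        # Inverse of the complementary log-log transform.
        return math.exp(-math.exp(x))

    # The transform is decreasing in S, so the interval ends swap on back-transform.
    return back(q_bar), back(q_bar + half_width), back(q_bar - half_width)


# Made-up estimates for one group at one time point across five imputations.
estimate, lower, upper = pool_survival(
    [0.80, 0.78, 0.82, 0.79, 0.81],
    [0.040, 0.042, 0.039, 0.041, 0.040],
)
print(estimate, lower, upper)
```

Repeating this over a grid of time points, separately for each group, gives pooled curves to plot. In R, the per-imputation inputs could presumably be taken from `summary(survfit(...), times = t)` fitted to each completed data set returned by `mice::complete()`.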

I don't see that imputation of the stratification variable adds any further complexity. The paper cited on the above Cross Validated page as an example of using the complementary log-log transformation, Morisot et al., Prostate cancer: net survival and cause-specific survival rates after multiple imputation, BMC Med Res Methodol 15, 54 (2015), imputed causes of death. That would seem to be at least as troublesome as evaluating survival curves after imputing sex in your example.