Cox Proportional Hazards – Using Multiple Imputation for Cox Proportional Hazards and Validating with rms Package

cox-model, data-imputation, r, rms, survival

I've been working with the mice package, but I haven't found a way to fit a Cox model on the multiply imputed data sets and then validate that model with the rms package's validate() function. Here is some sample code showing what I have so far, using the veteran data set:

library(rms)
library(survival)
library(mice)

remove(veteran)
data(veteran)
veteran$trt=factor(veteran$trt,levels=c(1,2))
veteran$prior=factor(veteran$prior,levels=c(0,10))

#Set random data to NA 
veteran[sample(137,4),1]=NA
veteran[sample(137,4),2]=NA
veteran[sample(137,4),7]=NA

impvet=mice(veteran)
survmod=with(veteran,Surv(time,status))

#make a CPH for each imputation
for(i in seq_len(impvet$m)){
    assign(paste("mod_",i,sep=""),cph(survmod~trt+celltype+karno+age+prior,
        data=complete(impvet,i),x=TRUE,y=TRUE))
}

#Now there is a CPH model for mod_1, mod_2, mod_3, mod_4, and mod_5.

Now, if I were just working with one CPH model, I would do this:

validate(mod_1,B=20)

My problem is how to combine the 5 CPH models (one per imputation) into a single pooled model that I can then use with rms. I know that mice has built-in pooling functions, but I don't believe they work with the cph objects from rms. The key requirement is being able to keep using rms after pooling. I looked into Harrell's aregImpute() function, but I'm having trouble following the examples and documentation; mice seems simpler to use.

Best Answer

The fit.mult.impute function in the Hmisc package will draw imputations created from mice just as it will from aregImpute. cph will work with fit.mult.impute. The harder question is how to do validation through resampling when also doing multiple imputation. I don't think anyone has really solved that. I usually take the easy way out and use single imputation to validate the model, using the Hmisc transcan function, but using multiple imputation to fit the final model and to get standard errors.
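As a rough illustration of the approach described above, here is a minimal sketch that passes a mice object to fit.mult.impute with cph as the fitter, then validates a fit on a single completed data set. Several things here are assumptions rather than part of the original answer: the object names (vet, imp, pooled, single, val), the seed, and above all the use of the first mice completed data set as the single imputation, which stands in for the transcan-based single imputation mentioned above. The outcome is written as Surv(time, status) inside the formula so that it is looked up in each completed data set.

```r
library(survival)
library(Hmisc)   # provides fit.mult.impute()
library(rms)     # provides cph() and validate()
library(mice)

# Same setup as in the question: veteran data with a few values set to NA.
set.seed(1)                              # arbitrary seed, for reproducibility only
vet <- survival::veteran
vet$trt   <- factor(vet$trt,   levels = c(1, 2))
vet$prior <- factor(vet$prior, levels = c(0, 10))
for (v in c("trt", "celltype", "age")) {
  vet[sample(nrow(vet), 4), v] <- NA     # 4 random missing values per variable
}

# Multiple imputation with mice (5 completed data sets).
imp <- mice(vet, m = 5, seed = 1, printFlag = FALSE)

# Fit cph() on every completed data set and combine the fits into one
# object whose coefficients and variances account for the imputation.
pooled <- fit.mult.impute(
  Surv(time, status) ~ trt + celltype + karno + age + prior,
  fitter = cph, xtrans = imp, data = vet, pr = FALSE
)
print(pooled)

# Resampling validation on ONE completed data set.
# Assumption: mice's first completed data set replaces the transcan-based
# single imputation; this does not propagate imputation uncertainty.
single <- cph(Surv(time, status) ~ trt + celltype + karno + age + prior,
              data = mice::complete(imp, 1), x = TRUE, y = TRUE)
val <- validate(single, B = 20)
print(val)
```

Since pooled still inherits from cph, the usual rms tools should work on it, although summary() also needs a datadist to be set up first. The validation indexes in val describe the single-imputation fit only, so treat them as an approximation for the pooled model.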