Solved – Conditional expected lifetime in survival analysis

prediction, r, survival

I want to do survival analysis with the colon cancer data in the survival package:

library(survival)
Colon<-subset(colon,etype==2 & rx!='Obs')
fit<-survreg(Surv(time,status)~rx+age+sex,data=Colon, dist="weibull")
Now how can I predict conditional expected lifetime given time, rx, age and sex?

For example, if a 50-year-old male patient under Levamisole treatment has already survived for 2000 days, what would be his expected lifetime?

I've tried several options of the predict function, but none of them seems to give what I want directly, and I don't know which one to choose as a starting point for a transformation.

Best Answer

One possible approach would be to predict the median survival time, i.e. the time at which the predicted survival probability for a patient with a specific combination of characteristics equals 0.5. From that timepoint onwards it is more likely that the patient has died than that he is still alive. This can be done using functions from the rms package as follows:

library(rms)

# datadist stores predictor summaries that rms uses for later summaries and plots:
dd <- datadist(colon)
options(datadist="dd")

fit <- psm(Surv(time,status)~rx+age+sex,data=colon, dist="weibull", x=TRUE, y=TRUE)

# Quantile() creates a new function from your model that returns
# quantiles of the predicted survival time (the median by default)
QFUN <- Quantile(fit)

# Now use QFUN to get the predicted median survival for the patients in your data.
# Name the lp argument so the linear predictor is not taken as the quantile q.
pred.med.surv <- QFUN(q=0.5, lp=predict(fit,type="lp"))

# When asking for only one quantile (here 0.5), the dimensions of the output are
# dropped, so convert the result to a plain vector before binding it to your data.frame
colon$pred.med.surv <- as.numeric(pred.med.surv)

# For multiple quantiles of predicted survival you can use the following:
preds <- QFUN(q=c(0.25,0.5), predict(fit,type="lp"))
colon$surv0.25 <- preds[,"0.25"]
colon$surv0.5 <- preds[,"0.50"] 
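
The median above is measured from day 0. For a patient who has already survived to some day t0, a conditional median can be derived from the same Weibull model by solving S(t) / S(t0) = 0.5. The following is only a sketch under assumptions: it uses the subset from the question rather than the full colon data, relies on the survival package alone, and the names patient, t0, med0 and cond_med are introduced here for illustration.

```r
library(survival)

# Subset from the question; droplevels() removes the now-unused "Obs" level of rx
Colon <- droplevels(subset(colon, etype == 2 & rx != "Obs"))
fit <- survreg(Surv(time, status) ~ rx + age + sex, data = Colon, dist = "weibull")

# Example patient: 50 years old, male (sex = 1 in colon), Levamisole
patient <- data.frame(rx = factor("Lev", levels = levels(Colon$rx)),
                      age = 50, sex = 1)

# Unconditional median survival time, counted from day 0
med0 <- as.numeric(predict(fit, newdata = patient, type = "quantile", p = 0.5))

# Weibull parameters implied by the survreg fit:
# shape = 1 / fit$scale, scale = exp(linear predictor)
t0    <- 2000
shape <- 1 / fit$scale
scale <- as.numeric(exp(predict(fit, newdata = patient, type = "lp")))

# Conditional median given survival to t0:
# S(t) / S(t0) = 0.5  <=>  (t / scale)^shape = (t0 / scale)^shape + log(2)
cond_med <- scale * ((t0 / scale)^shape + log(2))^(1 / shape)

c(median_from_day0 = med0, conditional_median = cond_med)
```

The conditional median is always later than t0, whereas the unconditional median may already lie in the patient's past.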

A few notes:

  • In colon, each patient has two records: etype==1 for recurrence and etype==2 for death. Your subset keeps the death records, so the model describes time to death. If you instead wanted to evaluate recurrence, simply discarding the deaths could lead to bias: death is a competing risk for recurrence, which calls for a different type of survival analysis. In that case it would be advisable to call in expert help.

  • The predicted survival time depends on the patients in the dataset you used to develop the model. If, for example, the model was built using measurements of patients on the day of their diagnosis (day 0), you should also enter the predictor values of the patient at day 0 and not those at day 2000. Since the example patient you mention has already survived up to day 2000, the prediction from day 0 would not be valid for him, as his survival is probably longer. Alternatively, you could include a predictor variable indicating the time since diagnosis.

  • As I don't know the details of your research project, I would advise you to consult a statistician to determine a valid approach.
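
If you accept the fitted Weibull model as it stands, the conditional expected lifetime asked about in the question can be sketched as E[T | T > t0] = t0 + (integral of S(u) from t0 to infinity) / S(t0). The code below is an illustration under assumptions, not a validated method: cond_expected_life is a hypothetical helper written for this example, it uses the subset from the question, and the result extrapolates the Weibull curve well beyond the observed follow-up, so treat the number with caution.

```r
library(survival)

# Subset and model from the question; droplevels() removes the unused "Obs" level
Colon <- droplevels(subset(colon, etype == 2 & rx != "Obs"))
fit <- survreg(Surv(time, status) ~ rx + age + sex, data = Colon, dist = "weibull")

# Hypothetical helper: E[T | T > t0] for one patient under the fitted Weibull model
cond_expected_life <- function(fit, newdata, t0) {
  lp    <- as.numeric(predict(fit, newdata = newdata, type = "lp"))
  shape <- 1 / fit$scale   # survreg's scale is the reciprocal of the Weibull shape
  scale <- exp(lp)         # Weibull scale for this patient
  surv  <- function(t) pweibull(t, shape = shape, scale = scale, lower.tail = FALSE)
  # Expected remaining lifetime is the area under S beyond t0, divided by S(t0)
  t0 + integrate(surv, lower = t0, upper = Inf)$value / surv(t0)
}

# 50-year-old male (sex = 1 in colon) on Levamisole who has survived 2000 days
patient <- data.frame(rx = factor("Lev", levels = levels(Colon$rx)),
                      age = 50, sex = 1)
cond_expected_life(fit, patient, t0 = 2000)
```

The returned value is the expected total lifetime in days counted from day 0; subtract t0 to get the expected remaining lifetime.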

Hope this helps.
