I would like to fit data based on Cox proportional-hazards model and then simulate new data based on a fitted model. For instance, provided that the fit is
# Create the simplest test data set
test1 <- list(time=c(4,3,1,1,2,2,3),
status=c(1,1,1,0,1,1,0),
x=c(0,2,1,1,1,0,0))
# Fit a survival model
mod_cox <- coxph(Surv(time, status) ~ x,data=test1)
how would I made simulations based on the fitted mod_cox
model using x
as covariates?
Additionally, I would also like to make additional simulations assuming a Cox model with a hazard rate equal to a constant (say $K$) times the hazard rate of the mod_cox
fitted Cox model.
I am interested in simulating censoring times because I want to perform a power analysis and I need censoring times to replicate survival analyses several times.
Best Answer
For power analysis of a Cox model you do not need to do simulations of event or censoring times.
If the covariate
x
of interest were binary then there would be several ways to proceed, as typical power analysis programs for Cox models are for treatment/control comparisons that can be reasonably extended to other binary covariates. But yourx
seems to be multi-valued, perhaps continuous, and by the structure of your model it is assumed to be linearly related to log hazard.For planning a future observational study based on your current data set with a non-binary covariate, you need to take into account both the hazard change per unit change in the covariate value and the distribution of the values of that covariate among your population of interest. If there are additional covariates in your model, the association of your covariate of interest with those other covariates must also be considered.
The R
powerSurvEpi
package provides tools to handle this type of situation. For study planning itsssizeEpiCont
function is designed to work with a pilot study from which it will estimate the variance of the covariate of interest, its multiple correlation with other covariates, and the fraction of cases that had an event. Specify the significance level, hypothesize the hazard ratio (for example, based on what you found inmod_cox
), and the function will calculate the number of cases needed. If you want to calculate power instead, the package'spowerEpiCont
function provides similar handling for calculating power based on a given a number of cases. The formulas used are shown clearly in the manual pages for those functions.These calculations are based on a paper by Hsieh and Lavori, who reported:
That's a critically important point about survival analysis: it's the number of events that provides the power.
You thus do not have to be concerned further about the timings or numbers of the censored cases. The values passed to the functions noted above don't even require the event times from the pilot study, just which cases had events. The total number of cases needed for a specified power is then related to number of events that is needed and the fraction of cases that had events. More complicated simulations are not required.