R – How to Simulate Survival Data

r

AFT model for survival time and Cox model for censoring time

How can I generate survival data according to the specified models?

Best Answer

Generating $\tilde{T}$ is straightforward from the model. I assume the difficulty lies in sampling $C$ from a Cox model, and you wouldn't be alone in this confusion.

The standard way to generate data from a Cox model is inverse CDF sampling (the inverse of the probability integral transform). The sampling scheme goes as follows:

  1. Generate $u \sim U(0,1)$ where $U(0,1)$ indicates the standard uniform distribution.
  2. Set $c \gets F_C^{-1}(u)$ where $F_C$ is the cumulative distribution function of the censoring time $C$.
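
The two steps above can be sketched in R as follows. This is only a minimal illustration: the helper `r_inverse_cdf` and the exponential example are assumptions of mine, not part of the question, chosen because the exponential inverse CDF is known in closed form and easy to check.

```r
# Inverse CDF sampling: draw u ~ U(0, 1), then return F^{-1}(u).
# `inv_cdf` is a hypothetical user-supplied inverse CDF (quantile function).
r_inverse_cdf <- function(n, inv_cdf, ...) {
  u <- runif(n)        # step 1: standard uniform draws
  inv_cdf(u, ...)      # step 2: push them through the inverse CDF
}

set.seed(1)

# Illustration with an exponential distribution (assumed rate = 2):
# F(t) = 1 - exp(-rate * t), so F^{-1}(u) = -log(1 - u) / rate.
inv_exp <- function(u, rate) -log(1 - u) / rate

c_samp <- r_inverse_cdf(1e5, inv_exp, rate = 2)
mean(c_samp)  # should be close to the theoretical mean 1 / rate = 0.5
```

The same helper works for the Cox model once $F_C^{-1}$ is known; only the function passed as `inv_cdf` changes.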

I'll write the expression inside the exponential function as $h(X,A)$ for convenience. Then the Cox model has the hazard function $\lambda_C(t\mid X,A) = \lambda_0(t)\exp(h(X,A))$, which gives the survival function $S_C(t\mid X,A) = \exp\{-\Lambda_0(t)\exp(h(X,A)) \}$, where $\Lambda_0(t) = \int_0^t \lambda_0(s)\,ds$ is the cumulative baseline hazard. The CDF and survival function are related by $F_C(t\mid X,A) = 1-S_C(t\mid X,A)$.

Depending on the form of the cumulative baseline hazard $\Lambda_0(t)$, the inverse of $F_C(t\mid X,A)$ may not be available in closed form. If $\Lambda_0^{-1}(x)$ is available, then solving $1-\exp\{-\Lambda_0(t)\exp(h(X,A))\} = u$ for $t$ gives $F_C^{-1}(u\mid X,A) = \Lambda_0^{-1}\left[-\log(1-u)\exp(-h(X,A)) \right]$. If not, you can numerically solve the equation $F_C(t\mid X,A)-u=0$ for $t$ given a uniform variate $u$.
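
As a rough sketch of both routes, the R code below draws censoring times from a Cox model twice: once with the closed-form inverse and once by root-finding with `uniroot()`. Everything specific here is an assumption for illustration only: the covariates `X` and `A`, the linear form and coefficients of `h`, and the Weibull-type baseline $\Lambda_0(t) = \lambda t^k$ are not taken from the question, so substitute your own model.

```r
set.seed(42)
n <- 1000

# Hypothetical covariates and linear predictor h(X, A) (assumed, not from the question)
X <- rnorm(n)
A <- rbinom(n, 1, 0.5)
h <- 0.5 * X - 0.3 * A

# Assumed Weibull-type cumulative baseline hazard: Lambda0(t) = lambda * t^k
lambda <- 0.1
k <- 1.5
Lambda0     <- function(t) lambda * t^k
Lambda0_inv <- function(x) (x / lambda)^(1 / k)

u <- runif(n)

# Route 1, closed form: C = Lambda0^{-1}( -log(1 - u) * exp(-h) )
C_closed <- Lambda0_inv(-log(1 - u) * exp(-h))

# Route 2, numerical: solve F_C(t | X, A) - u = 0 for t, one subject at a time
F_C <- function(t, h) 1 - exp(-Lambda0(t) * exp(h))
solve_one <- function(u_i, h_i) {
  uniroot(function(t) F_C(t, h_i) - u_i,
          lower = 0, upper = 1,
          extendInt = "upX",   # F_C is increasing in t; widen the upper bound as needed
          tol = 1e-10)$root
}
C_numeric <- mapply(solve_one, u, h)

# The two routes should agree up to the root-finding tolerance
max(abs(C_closed - C_numeric))
```

The numerical route is slower but only needs $\Lambda_0(t)$ itself, so it still applies when the baseline has no invertible closed form. The observed data would then be $\min(\tilde{T}, C)$ together with the event indicator.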