Survival Times – Simulation Through Root Finding

survival

I am simulating survival times from a joint model of longitudinal and survival data,
\begin{equation}
\begin{split}
& Y_i(t) \sim N(\mu_i(t), \sigma_y^2) \\
& \mu_i(t) = \beta_{0i} + \beta_{1i} t + \beta_2 x_{1i} + \beta_3 x_{2i} \\
& \beta_{0i} = \beta_{00} + b_{0i}\\
& \beta_{1i} = \beta_{10} + b_{1i} \\
& (b_{0i}, b_{1i})^T \sim N(0, \Sigma)\\
& h_i(t) = \delta (t^{\delta-1})
\exp (\gamma_0 + \gamma_1 x_{1i} + \gamma_2 x_{2i} + \alpha \mu_i(t)) \\
\end{split}
\end{equation}

I understand that under a constant hazard (exponential model) the inverse-transform method gives an analytic solution, but I have found that I have to be careful with my choice of coefficients, and it would restrict my simulation to an exponential model.

So, after some research, I am using the simsurv package, which I believe applies root finding (uniroot) underneath. I have to specify an upper bound (the maximum follow-up time). Survival times exceeding this bound are administratively censored.

  1. Is it true that I cannot find the exact survival time under this method, since the root finding depends on the upper bound I specify?
  2. If so, how can one perform non-informative censoring without knowing the true survival times? My understanding is that we need to define a censoring distribution and then take $\min(T_i, C_i)$ for each individual to find the observed survival time, but for some individuals we won't know $T_i$, the exact survival time.

Best Answer

As this example model from the simsurv vignette is a proportional hazards model with a Weibull baseline hazard, there isn't a problem with simulating exact survival times, provided that the times are less than the upper time limit that you specify. Integrating the instantaneous hazard in the last line of your example over time gives the individual's cumulative hazard as a (continuously increasing) function of time, $H_i(t)$. The upper time limit that you specify just makes sure that the integral is evaluated over a finite range and so stays finite.
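Written out for the model in the question, the cumulative hazard looks as follows. The symbol $\eta_i$ is shorthand introduced here (it is not in the original notation) for all terms of the linear predictor that do not depend on time.

\begin{equation}
\begin{split}
H_i(t) &= \int_0^t \delta s^{\delta-1} \exp (\gamma_0 + \gamma_1 x_{1i} + \gamma_2 x_{2i} + \alpha \mu_i(s)) \, ds \\
&= \exp(\eta_i) \int_0^t \delta s^{\delta-1} \exp(\alpha \beta_{1i} s) \, ds, \\
\eta_i &= \gamma_0 + \gamma_1 x_{1i} + \gamma_2 x_{2i} + \alpha (\beta_{0i} + \beta_2 x_{1i} + \beta_3 x_{2i})
\end{split}
\end{equation}

When $\alpha \beta_{1i} = 0$ this reduces to the familiar Weibull form $H_i(t) = \exp(\eta_i) t^{\delta}$, which can be inverted analytically. Otherwise the integral is typically evaluated numerically, which is why a root-finding step is needed.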

The survival function for the individual is $S_i(t) = \exp(-H_i(t))$. An exact survival time is obtained by drawing a survival probability $U_i$ uniformly on (0,1) and then solving $S_i(t) = U_i$ numerically for $t$. The upper time limit only comes into play when the sampled survival probability is so low that the corresponding value of $t$ exceeds the range over which $H_i(t)$ was calculated. Those are the cases "administratively censored" at that upper time limit.
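A minimal sketch of that procedure, in plain Python rather than R, may make the role of the upper limit clearer. It is not the simsurv implementation: Simpson's rule and bisection are stand-ins chosen for brevity, and the names `cumulative_hazard`, `simulate_time`, `eta` (the time-constant part of the linear predictor, including $\alpha$ times the time-constant part of $\mu_i$) and `slope` (the individual's $\beta_{1i}$) are all illustrative assumptions.

```python
import math
import random


def cumulative_hazard(t, delta, eta, alpha, slope, n=200):
    """H(t) = int_0^t delta * s**(delta-1) * exp(eta + alpha*slope*s) ds.

    The substitution u = s**delta removes the s**(delta-1) factor, leaving
    int_0^{t**delta} exp(eta + alpha*slope*u**(1/delta)) du, which is
    approximated here with composite Simpson's rule (n must be even).
    """
    upper = t ** delta
    step = upper / n

    def g(u):
        return math.exp(eta + alpha * slope * u ** (1.0 / delta))

    total = g(0.0) + g(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * step)
    return total * step / 3.0


def simulate_time(u, max_t, delta, eta, alpha, slope, tol=1e-8):
    """Solve S(t) = u, i.e. H(t) = -log(u), for t in (0, max_t].

    Returns (time, event). If even H(max_t) is below the target, the true
    event time lies beyond max_t and the individual is administratively
    censored: (max_t, 0).
    """
    target = -math.log(u)
    if cumulative_hazard(max_t, delta, eta, alpha, slope) < target:
        return max_t, 0
    lo, hi = 0.0, max_t
    while hi - lo > tol:  # bisection works because H is increasing in t
        mid = 0.5 * (lo + hi)
        if cumulative_hazard(mid, delta, eta, alpha, slope) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), 1


# Illustrative parameter values only; 1 - random() lies in (0, 1], so log is safe.
rng = random.Random(42)
draws = [
    simulate_time(1.0 - rng.random(), max_t=10.0, delta=1.5,
                  eta=-1.0, alpha=0.2, slope=0.3)
    for _ in range(5)
]
print(draws)
```

Every draw whose root lies below `max_t` comes back as an exact event time, up to the numerical tolerance. Only draws whose root would lie beyond `max_t` are returned as `(max_t, 0)`.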

The only individuals for whom you won't have exact survival times are those who are "administratively censored" by your choice of the upper time limit. They are treated as having right-censored survival at that upper time limit from the start. You then proceed as you describe to model censoring for all individuals: since their true $T_i$ is known to exceed the upper limit, taking the minimum with $C_i$ never requires the exact value. Some of those "administratively censored" individuals might then end up with even earlier right-censoring times than that upper time limit.
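The censoring step can be sketched as below, again in plain Python. The helper name `apply_censoring`, the exponential censoring distribution, its rate and the three example records are hypothetical choices for illustration. Any censoring distribution independent of the event times would serve.

```python
import random


def apply_censoring(sim_time, sim_event, censor_time):
    """Combine a simulated (time, event) pair with a censoring time.

    sim_event is 1 for an exact event time and 0 for an individual already
    administratively censored at the upper limit. If the censoring time
    comes first, the observation is censored there; otherwise the simulated
    record is kept unchanged.
    """
    if censor_time < sim_time:
        return censor_time, 0
    return sim_time, sim_event


rng = random.Random(1)
censor_rate = 0.1  # assumed rate of the exponential censoring distribution

# Hypothetical output of the root-finding step with an upper limit of 10:
# two exact event times and one administratively censored individual.
simulated = [(2.3, 1), (10.0, 0), (7.1, 1)]

observed = [
    apply_censoring(time, event, rng.expovariate(censor_rate))
    for time, event in simulated
]
print(observed)
```

The unknown event time of the administratively censored individual is never needed: the observed time is the smaller of the upper limit and the censoring draw, and the status is 0 either way.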

This page provides a bit more detail on a closely related question.