Cox Model – Simulating Two-Component Cox Proportional Hazards Model for Survival Analysis

cox-model, simulation, survival

In an article that I have been reading, they have a simulation study:

In this simulation, we generate $T_i$ from the following
group-specific linear transformation model: $$H(T_i) = \beta_{k,1}
X_{i,1} + \beta_{k,2} X_{i,2} + \varepsilon_i, i = 1, 2, \ldots, n;
\quad k = 1, 2 $$ where $H(t) = \log\left(2(e^{4t} - 1)\right)$ and
$\varepsilon_i$ follows the standard extreme-value distribution. In
this case, the linear transformation model is equivalent to the Cox
proportional hazards model. We generate samples from a two-component
Cox proportional hazards model with mixing weights $ \pi_1 =
\frac{1}{3}$, $\pi_2 = \frac{2}{3}$, and $ \beta_1
= (-3, -2)^T$, $\beta_2 = (1, 1)^T$. The covariates $X_i$ are generated from a multivariate normal distribution with a mean of zero
and a first-order autoregressive structure $ \Sigma = (\sigma_{st})$
with $\sigma_{st} = 0.5^{|s - t|}$ for $s, t = 1, 2$. The censoring
time is generated from a uniform distribution on $[0, C]$, where $C$
is chosen to achieve censoring proportions of $5\%$ and $25\%$.

My question is, how would I generate the survival time? Here is my approach:

Transformation Functions: I start by inverting the transformation model to simulate survival time.
$$
H(t) = \log(2(e^{4t} - 1)), \\
H^{-1}(y) = \frac{1}{4} \log\left(\frac{e^y}{2} + 1\right).
$$
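
As a quick sanity check on the inversion, here is a minimal R sketch (the names `H`, `H_inv` and `H_inv_stable` are my own labels). It confirms numerically that the two functions undo each other. The `log1p()` variant is an assumption on my part about numerical stability, not something taken from the article: for very negative `y`, `exp(y) / 2 + 1` rounds to exactly 1 in double precision, so the plain formula returns a time of 0.

```r
# H(t) = log(2 * (exp(4t) - 1)), defined for t > 0
H <- function(t) log(2 * (exp(4 * t) - 1))

# Inverse as derived above
H_inv <- function(y) log(exp(y) / 2 + 1) / 4

# Same inverse written with log1p(); mathematically identical, but it keeps
# precision when exp(y) / 2 is tiny (hypothetical helper, not from the article)
H_inv_stable <- function(y) log1p(exp(y) / 2) / 4

# Round trips in both directions
y <- seq(-5, 5, by = 0.5)
roundtrip_ok <- isTRUE(all.equal(H(H_inv(y)), y))
t_grid <- seq(0.01, 2, by = 0.01)
roundtrip_t_ok <- isTRUE(all.equal(H_inv(H(t_grid)), t_grid))

# For a very negative argument the plain version collapses to 0
naive_small <- H_inv(-50)
stable_small <- H_inv_stable(-50)
```

Whether the loss of precision matters in practice depends on how extreme the linear predictor gets, so treat the stable version as an optional safeguard.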

Model Parameters:
The model involves two components with mixing weights
$$\pi_1 = \frac{1}{3}, \quad \pi_2 = \frac{2}{3},
$$
and parameter vectors $$ \beta_1 = (-3, -2)^{\top}, \quad \beta_2 = (1, 1)^{\top}$$

Covariate Generation:
$\Sigma$ is the covariance matrix of the covariates, so the covariates are generated from a multivariate normal (MVN) distribution:
$$\Sigma = \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix}, \\
X \sim \text{MVN}(\mu = (0, 0)^{\top}, \Sigma)$$
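
A minimal base-R sketch of this step, assuming a sample size `n` of my own choosing. It builds $\Sigma$ from the AR(1) rule $\sigma_{st} = 0.5^{|s - t|}$ and uses a Cholesky factor to correlate independent normals; `mvtnorm::rmvnorm(n, sigma = Sigma)` should give draws from the same distribution.

```r
set.seed(1)
n <- 5000  # illustrative sample size (assumption)

# AR(1) covariance: sigma_st = 0.5^|s - t| for s, t = 1, 2
Sigma <- 0.5^abs(outer(1:2, 1:2, "-"))

# chol() returns the upper-triangular R with t(R) %*% R = Sigma,
# so rows of Z %*% R have covariance Sigma when Z has iid N(0, 1) entries
Z <- matrix(rnorm(n * 2), nrow = n)
X <- Z %*% chol(Sigma)

emp_cov <- cov(X)  # should be close to Sigma for large n
```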

Group Assignment and Survival Time Simulation:
$$\text{group}_i = \begin{cases} 1 & \text{if } U_i < \pi_1 \\ 2 & \text{otherwise} \end{cases}, \quad U_i \sim \text{Uniform}(0, 1),$$

$$H(T_i) = \begin{cases} \beta_1^{\top} X_i + \epsilon_i & \text{if group}_i = 1
\\ \beta_2^{\top} X_i + \epsilon_i & \text{if group}_i = 2 \end{cases}
$$
where $\epsilon_i$ follows the standard extreme-value distribution.
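
Putting the steps above together, here is a hedged R sketch of the whole data-generating process, including the uniform censoring from the quoted passage. Several things are assumptions of mine: the sample size, the use of `uniroot()` to pick the upper limit $C$, and the variable names. The sketch follows the sign convention of the quoted model, $H(T_i) = \beta_k^{\top} X_i + \epsilon_i$. Under that convention a Cox fit within group $k$ would be expected to return coefficients near $-\beta_k$; the code later in this thread uses `error - lin_pred`, which instead targets $\beta_k$ itself.

```r
set.seed(2024)
n <- 2000                               # illustrative sample size (assumption)
pi1 <- 1 / 3                            # mixing weight of group 1
beta <- rbind(c(-3, -2), c(1, 1))       # row k holds beta_k

# Inverse of H(t) = log(2 * (exp(4t) - 1)), written with log1p() for stability
H_inv <- function(y) log1p(exp(y) / 2) / 4

# Covariates: bivariate normal with AR(1) covariance 0.5^|s - t|
Sigma <- 0.5^abs(outer(1:2, 1:2, "-"))
X <- matrix(rnorm(n * 2), nrow = n) %*% chol(Sigma)

# Latent group labels
group <- ifelse(runif(n) < pi1, 1L, 2L)

# Standard (minimum) extreme-value errors: F(s) = 1 - exp(-exp(s))
eps <- log(-log(runif(n)))

# H(T_i) = beta_k' X_i + eps_i, then invert H
lin_pred <- rowSums(X * beta[group, ])
event_time <- H_inv(lin_pred + eps)

# Censoring ~ Uniform(0, C). Given T, P(censored) = min(T / C, 1),
# so solve for the C that matches the target censoring proportion.
target <- 0.25
cens_gap <- function(C) mean(pmin(event_time / C, 1)) - target
C_max <- uniroot(cens_gap, interval = c(1e-6, 1e6))$root

cens_time <- runif(n, 0, C_max)
time <- pmin(event_time, cens_time)
status <- as.integer(event_time <= cens_time)  # 1 = event, 0 = censored

realized_censoring <- 1 - mean(status)
```

Setting `target <- 0.05` gives the other censoring level mentioned in the article. Calibrating `C_max` on the simulated event times is only one possible choice; a single value fixed in advance from a large pilot sample would work as well.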

Edit:

Generating survival times to simulate Cox proportional hazards models

To generate event times from the proportional hazards model, we can
use the inverse probability method (Bender et al., 2005): if $V$
is uniform on $(0, 1)$ and if $S(\cdot \,|\, \mathbf{x})$ is the
conditional survival function derived from the proportional hazards
model, i.e. $$ S(t \,|\, \mathbf{x}) = \exp \left( -H_0(t)
\exp(\mathbf{x}^\prime \mathbf{\beta}) \vphantom{\Big(} \right) $$
then it is a fact that the random variable $$ T = S^{-1}(V \,|\,
\mathbf{x}) = H_0^{-1} \left( - \frac{\log(V)}{\exp(\mathbf{x}^\prime
\mathbf{\beta})} \right) $$ has survival function $S(\cdot \,|\,
\mathbf{x})$. This result is known as "the inverse probability
integral transformation". Therefore, to generate a survival time $T
\sim S(\cdot \,|\, \mathbf{x})$ given the covariate vector, it
suffices to draw $v$ from $V \sim \mathrm{U}(0, 1)$ and to make the
inverse transformation $t = S^{-1}(v \,|\, \mathbf{x})$.
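
To connect this formula to the $H$ from my question, here is a small R sketch under the assumption that the cumulative baseline hazard is $H_0(t) = \exp\{H(t)\} = 2(e^{4t} - 1)$. It checks that the Bender et al. form and the log-scale form $H^{-1}(\log(-\log V) - \mathbf{x}^\prime \beta)$ give the same times, and that plugging the times back into $S(\cdot \,|\, \mathbf{x})$ returns $V$. Independent covariates and the function names are simplifications of mine.

```r
set.seed(42)
n <- 1000
beta <- c(-3, -2)
X <- matrix(rnorm(n * 2), nrow = n)   # independent covariates, for brevity
lin_pred <- as.vector(X %*% beta)
V <- runif(n)

# Assumed cumulative baseline hazard H_0(t) = 2 * (exp(4t) - 1) and its inverse
H0 <- function(t) 2 * expm1(4 * t)
H0_inv <- function(u) log1p(u / 2) / 4

# Bender et al. (2005) form: T = H_0^{-1}( -log(V) / exp(x'beta) )
T_bender <- H0_inv(-log(V) / exp(lin_pred))

# Log-scale form: T = H^{-1}( log(-log(V)) - x'beta ), with H = log(H_0)
H_inv <- function(y) log1p(exp(y) / 2) / 4
T_logscale <- H_inv(log(-log(V)) - lin_pred)

# Survival function of the proportional hazards model
S <- function(t, lp) exp(-H0(t) * exp(lp))
V_back <- S(T_bender, lin_pred)
```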

The article "On the linear transformation model for censored data" states:

Recently Cheng, Wei & Ying (1995, 1997) developed a class of simple inference procedures for semiparametric linear transformation models with censored survival data. Specifically, let $T$, $C$ and $Z$ denote the failure time, the censoring time and the $p \times 1$ covariate vector. Let $h(\cdot)$ be an unknown increasing function. A linear transformation model is
\begin{equation}
h(T) = Z^T\beta + \epsilon, \quad (1)
\end{equation}
where $\epsilon$ has a completely known density $f$ and distribution function $F$, and $\beta$ is the vector of unknown regression coefficients. If $F(s) = 1 - \exp \{ - \exp(s) \}$, an extreme value distribution, (1) is the proportional hazards model (Cox, 1972). Note that, if $S_Z(t)$ is the survival function of $T$ for given $Z$, then (1) can be rewritten as
\begin{equation}
g\{S_Z(t)\} = h(t) - Z^T\beta, \quad (2)
\end{equation}
where $g^{-1}(\cdot) = 1 - F(\cdot)$.
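
For reference, the step from (1) to (2) can be worked out as follows. This is my own sketch using only the definitions in the quoted passage, and it assumes $\epsilon$ is independent of $Z$:

$$
\begin{aligned}
S_Z(t) &= P(T > t \mid Z) = P\{h(T) > h(t) \mid Z\} = P\{\epsilon > h(t) - Z^T\beta\} = 1 - F\{h(t) - Z^T\beta\}, \\
g\{S_Z(t)\} &= h(t) - Z^T\beta \quad \text{with } g^{-1}(\cdot) = 1 - F(\cdot).
\end{aligned}
$$

For $F(s) = 1 - \exp\{-\exp(s)\}$ we get $g^{-1}(s) = \exp\{-\exp(s)\}$, hence $g(u) = \log\{-\log(u)\}$ and

$$
\log[-\log\{S_Z(t)\}] = h(t) - Z^T\beta, \qquad S_Z(t) = \exp\left[-\exp\{h(t)\}\exp(-Z^T\beta)\right].
$$

So $\exp\{h(t)\}$ plays the role of the cumulative baseline hazard, and the coefficient acting on the log hazard is $-\beta$.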

So, based on the above, the Cox model can be written as
\begin{equation}
\log[-\log \{S_Z(t)\}] = h(t) + Z^T\beta,
\end{equation}
where $\beta$ is now the usual Cox log-hazard coefficient, which has the opposite sign to the $\beta$ in (1) and (2).

Now, if I were to find $S_{Z}^{-1}$,
\begin{equation}
S_{Z}^{-1}(u) = h^{-1}(\log(-\log(u)) - Z^T\beta).
\end{equation}

In the notation of my question, the survival time would be generated as
\begin{equation}
T = S^{-1}(V \mid X) = H^{-1}(\log(-\log(V)) - X^T\beta),
\end{equation}
where $H^{-1}(y)= \frac{1}{4} \log\left(\frac{e^y}{2} + 1\right)$ and $V \sim U(0,1)$.

Update
Thank you, @Lukas Lohse, for your answer, and thank you, @EdM, for taking a look at the question. I found an article that uses a similar model for a different purpose, and the R code for its simulation is available online (see the second simulation setup). Their code shows how the survival time is set up. I am wondering how they came up with this setup, since it seems to be more accurate.

library(survival)
library(mvtnorm)

H_inv <- function(y){
  1/4 * (log(exp(y)/2 + 1))
}

b1 <- c(-3, -2)
n <- 10^5
replications <- 1000

First:

coefficients_m1 <- matrix(NA, nrow = replications, ncol = 2)
for (i in 1:replications) {
  X <- rmvnorm(n, sigma = rbind(c(1, 0.5), c(0.5, 1)))
  lin_pred <- as.vector(X %*% b1)
  error <- log(-1*log(runif(n)))
  times <- H_inv(error - lin_pred)
  
  m1 <- coxph(Surv(time = times, event = rep(TRUE, n)) ~ X[, 1] + X[, 2])
  coefficients_m1[i, ] <- coefficients(m1)
}

mean_coefficients_m1 <- colMeans(coefficients_m1)
bias_m1 <- mean_coefficients_m1 - b1

list(mean_coefficients = mean_coefficients_m1, bias = bias_m1)

results

$mean_coefficients
[1] -2.913310 -1.942282

$bias
[1] 0.08669012 0.05771758

Second:

coefficients_m2 <- matrix(NA, nrow = replications, ncol = 2)
for (i in 1:replications) {
  X <- rmvnorm(n, sigma = rbind(c(1, 0.5), c(0.5, 1)))
  lin_pred <- as.vector(X %*% b1)
  temp = rexp(n)
  times = as.numeric(0.5*log(2*temp*exp(-lin_pred)+1.0))
  
  m2 <- coxph(Surv(time = times, event = rep(TRUE, n)) ~ X[, 1] + X[, 2])
  coefficients_m2[i, ] <- coefficients(m2)
}

mean_coefficients_m2 <- colMeans(coefficients_m2)
bias_m2 <- mean_coefficients_m2 - b1

list(mean_coefficients = mean_coefficients_m2, bias = bias_m2)

results

 $mean_coefficients

 [1] -2.99789 -1.99855

 $bias
 [1] 0.002110146 0.001449510
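
One way to read the second setup is as the same inverse-probability recipe with a different baseline. Solving `t = 0.5 * log(2 * u + 1)` for `u` gives `u = (exp(2 * t) - 1) / 2`, so that code corresponds to the cumulative baseline hazard $H_0(t) = (e^{2t} - 1)/2$ rather than $2(e^{4t} - 1)$, and `rexp(n)` has the same distribution as `-log(runif(n))`. The sketch below (function names are mine, and the covariates are independent for brevity) checks this numerically. It does not by itself explain the difference in bias.

```r
set.seed(7)
n <- 1000
b1 <- c(-3, -2)
X <- matrix(rnorm(n * 2), nrow = n)
lin_pred <- as.vector(X %*% b1)
temp <- rexp(n)

# Second setup, as in the code above
times2 <- 0.5 * log(2 * temp * exp(-lin_pred) + 1)

# Cumulative baseline hazard implied by the second setup
H0_second <- function(t) (exp(2 * t) - 1) / 2
# Cumulative baseline hazard from the question, exp(H(t))
H0_first <- function(t) 2 * (exp(4 * t) - 1)

# Under a Cox model, H0(T) * exp(x'beta) is a unit exponential draw,
# so the correct baseline should give back `temp`
recovered_second <- H0_second(times2) * exp(lin_pred)
recovered_first <- H0_first(times2) * exp(lin_pred)
```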

Best Answer

While I'm interested in seeing what @EdM has to say, the answer to this question seems to be a straightforward yes: your approach works. I have coded up a version in R, limited to one group, in which both coefficients are recovered and $H(t)$ shows up in the (log cumulative) baseline hazard.

library(survival)
library(mvtnorm)

H_inv <- function(y){
  1/4 * (log(exp(y)/2 + 1))
}
b1 <- c(-3, -2)
# large n for the simulation so we can check the coefficients
n <- 10^5
# simulate
X <- rmvnorm(n, sigma = rbind(c(1, 0.5), c(0.5, 1)))
cov(X)
colMeans(X)

lin_pred <- as.vector(X %*% b1)
error <- log(-1*log(runif(n)))
times <- H_inv(error - lin_pred)

m1 <- coxph(Surv(time = times, event = rep(TRUE, n)) ~ X[, 1] + X[, 2])
summary(m1)

result:

            coef exp(coef)  se(coef)      z Pr(>|z|)    
X[, 1] -2.912249  0.054353  0.007981 -364.9   <2e-16 ***
X[, 2] -1.943424  0.143213  0.005986 -324.7   <2e-16 ***

Looking at the baseline hazard:

# H(t) is the log cumulative baseline hazard
bhz <- basehaz(m1)
# basehaz() returns the columns `hazard` (cumulative hazard) and `time`
plot(bhz$time, log(bhz$hazard))
curve(log(2 * (exp(4 * x) - 1)), add = TRUE, col = 2)

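result: (plot not preserved: estimated log cumulative baseline hazard against time, with $H(t)$ overlaid in red)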

Related Solutions

Cox Model – Extrapolating Effect of Covariable Changes in Cox Proportional Hazards Models

I would suggest you do it non-parametrically. The procedure as you describe it imposes assumptions on the way the failure functions can relate to each other, basically because the Cox model introduces the assumption of proportional hazards. Therefore, I would argue that the red and black curves in the plot are a visualization of the model, more than they are estimates of failure functions. Not that those two things couldn't coincide, but why make this further assumption?

If you want to do something similar but non-parametric, I would suggest using the Kaplan-Meier estimates instead. You would have to divide the weight variable into groups (assuming it's continuous), e.g. "low" and "high". You would still be able to do the counterfactual analysis that you want, simply by making a "conditional" KM plot similar to the green one above. So the green curve would be the KM of the "high" group until age $40$. At age $40$ the KM of the "low" weight group (for ages above $40$) would continue, pasted onto the "high" curve ending at $40$. The KM estimate is the estimated probability of reaching age $t$. Thus, for the hypothetical individual changing weight groups, we can think of the probability of reaching age $40 + s$ as the probability of living from $40$ to $40 + s$ in the low weight group, given survival until $40$, times the probability of living from $0$ to $40$ in the high weight group. This corresponds exactly to "pasting" the KM estimates together at age $40$. Note that the KM estimates themselves are products of conditional probabilities (conditional on survival until some time point). In symbols, if $X$ is a stochastic variable describing the time of failure of this hypothetical individual:

$$ P(X > 40 + s) = P(X > 40 + s | X > 40)P(X > 40), \ s \geq 0. $$

In conclusion, this amounts to the KM plot for "high" until age $40$ and at $40$ we use the conditional survival history of "low" (conditional on survival until $40$). To show it on a plot:

Conditional KM estimate of (highly) hypothetical subject

Some code to produce the plot, using built-in functions in R

library(ggplot2)
library(survMisc)
library(survival)


X1 <- rexp(n = 20)*50
X2 <- rexp(n = 20)*100

Sfit1 <- survfit(Surv(time = X1) ~ 1)
Sfit2 <- survfit(Surv(time = X2[X2 > 40]) ~ 1)

v  <- autoplot(Sfit1)$plot
p1 <- tail(v$data$surv[v$data$time < 40], 1)
t1 <- tail(v$data$time[v$data$time < 40], 1)


u <- autoplot(Sfit2)$plot
x <- c(t1, as.vector(u$data$time)[-1])



Sdata <- data.frame(x = x, y = p1*as.vector(u$data$surv), st = "2")

autoplot(Sfit1, title=NULL)$plot + geom_step(data=Sdata, aes(x=x, y=y, st=st))
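
The pasting step can also be sketched numerically with the `survival` package alone, without the plotting helpers. The group sizes, the seed and the helper `km_at()` are my own assumptions; the logic is just $P(X > 40 + s) = P(X > 40 + s \mid X > 40)\,P(X > 40)$ with the first factor taken from the "low" group and the second from the "high" group.

```r
library(survival)

set.seed(10)
high <- rexp(200) * 50    # simulated failure ages, "high" weight group
low  <- rexp(200) * 100   # simulated failure ages, "low" weight group
switch_age <- 40

fit_high <- survfit(Surv(time = high) ~ 1)
# KM for the "low" group among those still alive at the switch age
fit_low_cond <- survfit(Surv(time = low[low > switch_age]) ~ 1)

# Hypothetical helper: KM survival estimate at the requested times
km_at <- function(fit, times) summary(fit, times = times, extend = TRUE)$surv

# P(X > 40), estimated in the "high" group
p_switch <- km_at(fit_high, switch_age)

# P(X > 40 + s) for the hypothetical subject who switches groups at 40
s <- c(0, 10, 20, 40, 80)
pasted <- p_switch * km_at(fit_low_cond, switch_age + s)
```

With no censoring, as in this simulated data, `p_switch` is simply the fraction of the "high" group that survives past 40.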

However, one should probably still consider what the purpose of the plot really is. We're not really describing any of our subjects and it's not clear that we're describing a hypothetical (but plausible) subject either. You would want to remember that you're assuming that the hazard changes instantaneously, not only that the weight changes instantaneously. I'm no expert on human physiology, but a sudden weight loss probably entails other side-effects that are not appropriately modelled.

This is simulated data, but one should also keep in mind that the weight covariate is time-dependent, especially since we're also modelling young people and children. Treating it as time-independent is probably a bad idea. Also, the heavy people will be the ones that entered the study as adults, as weight is measured at entry. The OP seems to be aware of this, but I thought I'd mention it anyway.

Cox Model – How to Compute Partial Log-Likelihood Function

This is technically a programming question with an easy programming answer. If you simply want the partial likelihood, why not fool R into giving it to you? Simply initialize beta and allow no iterations, then extract the loglik value from the coxph object. (see ?coxph.object).

For example:

## artificial data
library(survival)
n <- 100
t <- rexp(n)
c <- rbinom(n, 1, .2) ## status passed to Surv(): 1 = event, 0 = censored (independent process)
x <- rbinom(n, 1, exp(-t)) ## some arbitrary relationship btn x and t
betamax <- coxph(Surv(t, c) ~ x)
## evaluate the partial log-likelihood at beta = 1 without iterating
beta1 <- coxph(Surv(t, c) ~ x, init = c(1), control = coxph.control(iter.max = 0))

With example output:

> betamax$loglik
[1] -68.62548 -65.99652
> beta1$loglik
[1] -66.10908 -66.10908

You can even define a wrapper:

loglik <- function(beta, formula) {
  ## refit with beta held at its initial value (no iterations)
  coxph(formula, init = beta, control = coxph.control(iter.max = 0))$loglik[2]
}

betas <- seq(0, 2, by=0.01)
logliks <- sapply(betas, loglik, Surv(t, c) ~ x)
plot(betas, logliks)
abline(v=betamax$coefficients)
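
If you would rather see the computation itself, here is a minimal sketch of the partial log-likelihood written out by hand for a single covariate and no tied event times, compared against the `coxph()` trick above. The data, the seed and the function name `partial_loglik()` are my own; with ties, `coxph()` applies the Efron approximation by default and the two numbers would no longer have to agree.

```r
library(survival)

set.seed(123)
n <- 200
time   <- rexp(n)             # continuous times, so ties are not expected
status <- rbinom(n, 1, 0.7)   # 1 = event, 0 = censored
x      <- rbinom(n, 1, 0.5)

# log PL(beta) = sum over events i of
#   x_i * beta - log( sum over j with t_j >= t_i of exp(x_j * beta) )
partial_loglik <- function(beta, time, status, x) {
  eta <- x * beta
  terms <- vapply(which(status == 1), function(i) {
    at_risk <- time >= time[i]          # risk set at the i-th event time
    eta[i] - log(sum(exp(eta[at_risk])))
  }, numeric(1))
  sum(terms)
}

beta0 <- 1
manual <- partial_loglik(beta0, time, status, x)

# Same quantity via coxph() with the coefficient held at its initial value
fit0 <- coxph(Surv(time, status) ~ x, init = beta0,
              control = coxph.control(iter.max = 0))
from_coxph <- fit0$loglik[2]
```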
