Survival Analysis – Is the Effect in a Cox Proportional Hazard Collapsible if the Covariates are Normally Distributed and the Baseline Hazard is Constant?

conditional-expectation, cox-model, marginal-effect, proportional-hazards, survival

Since the Cox PH model is a non-linear model, we would expect the effect to be non-collapsible, i.e., the marginal and conditional effects should differ.

I did some calculations for a setting where the baseline hazard is constant, the covariate is normally distributed, and the treatment is binary. In this case, would the effect actually be collapsible? See the calculation below.

Assume the treatment $Z$ is binary (0 or 1), the covariate $X$ has a standard normal distribution $N(0, 1)$, and the baseline survival distribution is exponential with constant hazard $\lambda_0$.
Suppose the hazard function is $\lambda=\lambda_0 \exp(X+\beta Z)$, so that the conditional effect is $\beta$.

The marginal hazard for $Z=1$ would be: $$\int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}}\exp(x+\beta)\exp\left(-\tfrac{1}{2} x^2\right) dx = \exp(\beta+0.5) $$
The marginal hazard for $Z=0$ would be: $$\int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}}\exp(x)\exp\left(-\tfrac{1}{2} x^2\right) dx = \exp(0.5)$$
The ratio of the two would be $\exp(\beta)$, which is the same as the conditional hazard ratio.
Not sure if my understanding is correct.

Best Answer

Maths is hard, so I would always start by checking this sort of thing with simulation:

> set.seed(2024-2-5)
> x<-rnorm(10000)
> z<-rep(0:1,5000)
> y<-rexp(10000,1/exp(-x-z))
>  
> library(survival)
> coxph(Surv(y)~x+z)
Call:
coxph(formula = Surv(y) ~ x + z)

     coef exp(coef) se(coef)     z      p
x 0.99488   2.70440  0.01299 76.57 <2e-16
z 1.02003   2.77327  0.02177 46.84 <2e-16

Likelihood ratio test=7219  on 2 df, p=< 2.2e-16
n= 10000, number of events= 10000 
> coxph(Surv(y)~x)
Call:
coxph(formula = Surv(y) ~ x)

     coef exp(coef) se(coef)     z      p
x 0.85830   2.35915  0.01233 69.62 <2e-16

Likelihood ratio test=5054  on 1 df, p=< 2.2e-16
n= 10000, number of events= 10000 
> coxph(Surv(y)~z)
Call:
coxph(formula = Surv(y) ~ z)

    coef exp(coef) se(coef)     z      p
z 0.6440    1.9041   0.0206 31.26 <2e-16

Likelihood ratio test=965.7  on 1 df, p=< 2.2e-16
n= 10000, number of events= 10000 

It doesn't seem to be collapsible. So what's wrong with the maths?

Well, here are smooth estimates of the hazard rate for the two $Z$ groups:

[Figure: smoothed hazard rate estimates over time for the $Z=0$ and $Z=1$ groups]

They aren't constant. The marginal hazard of any mixture of exponentials decreases with time, so a calculation that gives a constant marginal hazard can't be right.

What I think you've calculated is $E[\exp(X+\beta Z)|Z=1]$ and $E[\exp(X+\beta Z)|Z=0]$. Those aren't the marginal hazard (except at time zero) because they don't account for how the population at risk changes over time.
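To make the role of the changing risk set explicit, here is a sketch of the marginal hazard under the question's setup. The notation $S(t \mid z)$ and $\lambda(t \mid z)$ for the marginal survival and hazard functions is introduced here for illustration and does not appear in the question. Conditional on $X=x$ and $Z=z$, survival is exponential with rate $\lambda_0\exp(x+\beta z)$, so averaging over $X\sim N(0,1)$ gives

$$S(t \mid z) = E_X\left[\exp\left(-\lambda_0 t\, e^{X+\beta z}\right)\right]$$

$$\lambda(t \mid z) = -\frac{d}{dt}\log S(t \mid z) = \frac{E_X\left[\lambda_0 e^{X+\beta z}\exp\left(-\lambda_0 t\, e^{X+\beta z}\right)\right]}{E_X\left[\exp\left(-\lambda_0 t\, e^{X+\beta z}\right)\right]}$$

At $t=0$ the exponential weights are all equal to 1 and this reduces to $\lambda_0 E_X\left[e^{X+\beta z}\right] = \lambda_0\exp(\beta z + 0.5)$, which matches the integrals in the question up to the factor $\lambda_0$. For $t>0$ the weights favour small values of $X$, because individuals with large $X$ tend to have failed already, so the marginal hazard falls over time and does so differently in the two $Z$ groups.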

In fact, the hazard ratio at time zero, estimated from the smoother, is very close to $\exp(1)$, but it decreases over time.
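The same behaviour can be checked numerically, without simulation noise, by averaging the conditional exponential model over $X$. The following is only a minimal sketch: it assumes $\beta=1$ and $\lambda_0=1$ to match the simulation, approximates the expectations over $X \sim N(0,1)$ with a simple grid sum on $[-10, 10]$, and the names `marginal_hazard`, `x_grid` and `w` are made up for this illustration.

```r
# Grid approximation to expectations over X ~ N(0, 1)
# (assumption: mass outside [-10, 10] is negligible)
step   <- 0.001
x_grid <- seq(-10, 10, by = step)
w      <- dnorm(x_grid) * step   # quadrature weights

# Marginal hazard at time t in group z, when T | X, Z is exponential
# with rate lambda0 * exp(X + beta * Z)
marginal_hazard <- function(t, z, beta = 1, lambda0 = 1) {
  rate <- lambda0 * exp(x_grid + beta * z)
  surv <- exp(-rate * t)               # conditional survival at time t
  sum(w * rate * surv) / sum(w * surv) # marginal density / marginal survival
}

times <- c(0, 0.1, 0.5, 1, 2)
h0 <- sapply(times, marginal_hazard, z = 0)
h1 <- sapply(times, marginal_hazard, z = 1)
hr <- h1 / h0

print(data.frame(time = times, hazard_z0 = h0, hazard_z1 = h1, marginal_HR = hr))
```

Under these assumptions the marginal hazard ratio should equal $\exp(1)$ at time zero and be smaller at every later time, and both marginal hazards should decrease with time.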
