Solved – Modelling proportional hazards in Cox Model using coxph in R

cox-model, r, survival

Assume I have a heterogeneous sample with two categorical variables A and B, each with 2 levels. Now I want to measure the effect of these on the survival function.

So we assume proportional hazards, i.e. the survival function can be written as $S_0(t)^{\exp(\beta^T z)}$, where $z$ holds the covariates (the categorical variables A and B) and $\beta$ the corresponding coefficients.

I fit this model with coxph in R.

For these variables and levels, one would usually assume the linear predictor

$\beta_0 + \beta_A z_A + \beta_B z_B$, where $z_A=1$ if the subject belongs to the first level of A, and 0 if not. The variable $z_B$ is defined analogously for B.

Since the intercept can be "merged" into the baseline survival function $S_0(t)$, it is not included in the model. So the model passed to the coxph function in R would contain just $z_A$ and $z_B$. But then I can only measure the effect of $z_B$ relative to $z_A$. Is there any way to obtain the intercept as well?

Instead, I tried to use the model with $z_{A=0}$, $z_{A=1}$, $z_{B=0}$ and $z_{B=1}$. But this returns an error saying that the covariate matrix is singular, which I don't fully understand. I thought the model would be singular only if, for instance, the intercept were included as well (although, now that I'm writing this, perhaps the intercept is effectively included, as I mentioned above), because then we could decrease $\beta_0$ and increase all the other coefficients by the same amount and obtain the same prediction, so the system of equations would have no unique solution.
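The singularity described above can be checked directly in base R. The sketch below is only an illustration on a hypothetical 2x2 design (the column names `zA0`, `zA1`, `zB0`, `zB1` are made up here). It compares the rank of the full indicator coding with the usual reference-level coding, and also looks at the rank after removing the constant direction, under the assumption that a constant shift of the linear predictor is what the baseline absorbs.

```r
# Hypothetical toy design: every combination of two binary variables A and B.
d <- expand.grid(A = c(0, 1), B = c(0, 1))

# One indicator column per level of each variable, with no intercept column.
X_full <- cbind(
  zA0 = as.numeric(d$A == 0), zA1 = as.numeric(d$A == 1),
  zB0 = as.numeric(d$B == 0), zB1 = as.numeric(d$B == 1)
)

# zA0 + zA1 and zB0 + zB1 both equal a column of ones, so the four
# columns are linearly dependent.
rank_full <- qr(X_full)$rank

# Centering each column removes the constant direction, which a Cox model
# cannot separate from the baseline.
rank_centered <- qr(scale(X_full, scale = FALSE))$rank

# Reference-level coding: one indicator per variable.
X_ref <- X_full[, c("zA1", "zB1")]
rank_ref <- qr(X_ref)$rank

c(full = rank_full, centered = rank_centered, reference = rank_ref)
```

In this toy design the four indicator columns carry only as much identifiable information as the two columns of the reference coding, which is one way to read the singularity message.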

Best Answer

In a semi-parametric Cox-PH model, there is no intercept.
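One way to see this on the hazard scale is the short derivation below. It is a sketch under standard assumptions that the question does not spell out: no tied event times, $h_0(t)$ for the baseline hazard, $\delta_i$ for the event indicator of subject $i$, and $R(t_i)$ for the set of subjects still at risk at time $t_i$.

$$h(t \mid z) = h_0(t)\exp(\beta_0 + \beta^T z) = \left[h_0(t)\,e^{\beta_0}\right]\exp(\beta^T z)$$

$$L(\beta) = \prod_{i:\,\delta_i = 1} \frac{\exp(\beta_0 + \beta^T z_i)}{\sum_{j \in R(t_i)} \exp(\beta_0 + \beta^T z_j)} = \prod_{i:\,\delta_i = 1} \frac{\exp(\beta^T z_i)}{\sum_{j \in R(t_i)} \exp(\beta^T z_j)}$$

The factor $e^{\beta_0}$ cancels from every term of the partial likelihood, so the data carry no information about $\beta_0$ separately from the unspecified baseline hazard.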

To help illustrate this, consider a subject whose covariates are all 0. By definition, their survival distribution is the baseline survival function, i.e. $S_0(t)$. But if we included an intercept in our model, the survival probability at time $t$ would be equal to

$S_0(t)^{\exp(\beta_0)}$

The two expressions agree only if $\beta_0 = 0$. Put differently, an intercept is redundant: any constant $\beta_0$ can be absorbed into the baseline, which is left unspecified in the Cox model.
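As a rough illustration of the point above, the following sketch fits a Cox model to simulated data with two binary factors and inspects what `coxph` returns. The data, variable names and effect sizes are all hypothetical. The last step uses `basehaz()` with `centered = FALSE`, on the assumption that the estimated cumulative baseline hazard for the reference levels is the closest analogue of an "intercept" in this setting.

```r
library(survival)  # provides Surv(), coxph() and basehaz()

set.seed(1)
n <- 200

# Hypothetical simulated data; names and effect sizes are made up.
d <- data.frame(
  A = factor(rbinom(n, 1, 0.5)),
  B = factor(rbinom(n, 1, 0.5))
)
lp <- 0.7 * (d$A == "1") - 0.4 * (d$B == "1")  # linear predictor, no intercept
d$time <- rexp(n, rate = exp(lp))              # exponential baseline hazard of 1
d$status <- 1                                  # no censoring in this toy example

fit <- coxph(Surv(time, status) ~ A + B, data = d)

# One coefficient per non-reference level; no intercept is reported.
coef_names <- names(coef(fit))
print(coef_names)

# Estimated cumulative baseline hazard for a subject at the reference
# levels (A = 0, B = 0), as a data frame with columns hazard and time.
H0 <- basehaz(fit, centered = FALSE)
head(H0)
```

Each coefficient is then a log hazard ratio against the reference level of its own variable, while the baseline for the reference group is estimated separately rather than through an intercept.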
