Hazard ratio of a continuous variable (univariate Cox proportional hazards): how to plot that variable against death %

continuous-data, cox-model, data-visualization

I have a continuous variable X on which I ran a Cox proportional hazards model. The outcome was 1 = death, 0 = censored/still alive. I obtained a hazard ratio of 0.516 for this predictor.

1) How do I interpret that hazard ratio? For every unit increase in X, is the chance of dying (or of surviving) multiplied by 0.516?

2) How can I plot a graph of the survival function (or death rate) with variable X on the x-axis? (Please state specific procedures/software that can do that.)

EDIT: I found out that $S(t) = S_0(t)^{\exp(\gamma)}$, where $\gamma = -0.441\chi$.

$\chi$ = the continuous variable value (a biological marker, so $\chi = 0$ makes no sense).

The only problem is how to estimate $S_0$. Which $\chi$ shall I use: the mean, the mode, or the median?

Thank you.

Best Answer

When you run a Cox proportional hazards analysis, you estimate the hazard ratios for the covariates without estimating the baseline survival function. To obtain the survival function for a specific set of covariate values, you then need to estimate that baseline separately. If you have a good amount of data (i.e., many people at the start and many of them followed up to death), the Kaplan-Meier survival function adjusted to covariates at value 0 is usually the best baseline.

This raises two questions:

  1. How does one adjust the Kaplan-Meier survival function to covariates value 0?
  2. What if the value 0 is not meaningful for a covariate?

For the first point, see http://www.stata.com/manuals13/st.pdf (pages 195-196) or http://data.princeton.edu/pop509/NonParametricSurvival.pdf . The danger with using the baseline cumulative hazard function as you suggest is that, for discrete failure times (i.e., Kaplan-Meier), it does not hold that $S_0(t)=\exp\{-H_0(t)\}$ (although the two are close if there are not many failures/deaths at each event time).
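To see how far apart the two estimates can get, here is a minimal sketch that computes the Kaplan-Meier product-limit estimate and $\exp\{-H(t)\}$ (with $H$ the Nelson-Aalen cumulative hazard) from the same risk-set counts. The function name and the toy counts are made up for illustration and are not taken from the question's data.

```python
import math

def km_and_exp_neg_na(at_risk, deaths):
    """Return a list of (Kaplan-Meier S, exp(-Nelson-Aalen H)) per event time.

    at_risk[i] is the number under observation just before event time i and
    deaths[i] is the number of deaths at that time (hypothetical toy inputs).
    """
    km, cum_hazard = 1.0, 0.0
    rows = []
    for n, d in zip(at_risk, deaths):
        km *= 1.0 - d / n      # product-limit step
        cum_hazard += d / n    # Nelson-Aalen increment
        rows.append((km, math.exp(-cum_hazard)))
    return rows

# Toy data: 10 subjects, no censoring, deaths at four distinct times.
rows = km_and_exp_neg_na(at_risk=[10, 8, 5, 2], deaths=[2, 3, 3, 2])
for km, exp_neg_h in rows:
    print(f"KM = {km:.3f}   exp(-H) = {exp_neg_h:.3f}")
```

With these toy counts the gap widens as the risk set shrinks: at the last event time the Kaplan-Meier estimate drops to 0 while $\exp\{-H(t)\}$ stays positive.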

I see from the URL you posted in your comment that MedCalc can produce an estimate of the survival function at the mean covariate values. You can use it in the following manner:

\begin{align} S_X(t) &= S_{\bar{X}}(t)^{\exp\{Y\}} \\ Y &= -0.441 (X_1-\bar{X}_1) \end{align}

where $S_{\bar{X}}(t)$ is the survival function for mean covariates, $X_1$ is the covariate you will vary on the x-axis, and $\bar{X}_1$ is the mean value of that covariate. You do not need to make any adjustment for the other covariates because they are already assumed to take their mean values.
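As a rough sketch of how this formula turns into the requested plot, the snippet below fixes one follow-up time, takes the survival at mean covariates at that time, and evaluates the estimated death % over a grid of marker values. Only the coefficient -0.441 comes from the question; the survival value 0.70, the mean 3.0, the grid, and the function name are placeholders you would replace with the numbers your software reports.

```python
import math

BETA = -0.441  # log hazard ratio per unit of X, as given in the question

def survival_at_x(s_mean, x, x_mean, beta=BETA):
    """S_X(t) = S_mean(t) ** exp(beta * (x - x_mean))."""
    return s_mean ** math.exp(beta * (x - x_mean))

# Hypothetical inputs: replace with the values your software reports.
s_mean_at_t = 0.70  # assumed survival at the chosen time for mean covariates
x_mean = 3.0        # assumed sample mean of the marker

xs = [1.0 + 0.5 * i for i in range(9)]  # marker values 1.0, 1.5, ..., 5.0
death_pct = [100 * (1 - survival_at_x(s_mean_at_t, x, x_mean)) for x in xs]

for x, d in zip(xs, death_pct):
    print(f"X = {x:.1f}   estimated death % by time t = {d:.1f}")
```

The lists `xs` and `death_pct` can then be handed to any plotting tool (for example `matplotlib.pyplot.plot(xs, death_pct)`) to put the marker on the x-axis. Repeating the calculation for several follow-up times gives one curve per time point.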

For the second point, it is not uncommon for a value of 0 to be not meaningful (e.g., the covariate could be the weight or age of adults). If the relationship between the covariate and the log-hazard is linear, it will not matter that you estimate the baseline at a value that is not meaningful (provided you don't present it as representative of anything). You may want to check whether a linear relationship is reasonable, for example by plotting martingale residuals against the covariate; note that proportional hazards tests assess proportionality over time rather than linearity. If linearity does not look reasonable, you could consider replacing $X$ with $\ln X$ in your regression (which leads to a baseline estimated at $\ln X = 0 \Leftrightarrow X = 1$ and is only defined for $X>0$).
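A short derivation may make it clearer why the choice of reference value is harmless under the linear model. Here $\beta$ denotes the fitted coefficient and $c$ is any reference value of the covariate (the mean, the median, or 0); both symbols are introduced only for this sketch:

\begin{align} S_X(t) &= S_0(t)^{\exp\{\beta X\}} \\ &= \left[ S_0(t)^{\exp\{\beta c\}} \right]^{\exp\{\beta (X-c)\}} \\ &= S_c(t)^{\exp\{\beta (X-c)\}} \end{align}

So the predicted curve for a given $X$ is the same whichever $c$ you pick, as long as the baseline $S_c(t)$ and the shift $X-c$ use the same $c$. With $\beta = -0.441$ and $c = \bar{X}_1$ this is the mean-covariate formula used earlier in this answer.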

Hope that helps!