Survival Analysis – Comparing Median Survival Times to Cox Regression When the Proportional Hazards Assumption Fails

cox-model survival

I've used Cox regression to assess the association between an exposure and an event. My Cox analysis suggests that exposed subjects have a higher risk of the event than unexposed subjects. Because the exposure violated the proportional hazards assumption, I added a time-dependent exposure covariate. Even with the proportional hazards assumption satisfied, the HR didn't change much, and the analysis continued to suggest that exposed subjects have an increased risk of the event.

A colleague took my data and found the median survival time for those not exposed was shorter (only by a small amount) than those exposed. On the surface, this appears opposite to the Cox results. I think a comparison of median survival times was incorrect. Firstly, a median over the raw data does not consider subjects who died, left the study, or were lost to follow-up. These factors can be handled using a Kaplan-Meier curve, treating them as right-censored. However, a KM curve assumes the covariate effect is constant over time. Therefore, KM curves aren't helpful when dealing with time-dependent data.

I'm dealing with the top brass, the big wigs. Presenting Cox data is often met with reluctant acceptance (it's not intuitive at a moment's glance). My colleague hoped to find a duration-based (e.g., median time to event) approach that was more "illustrative" than Cox regression. I've explained that a median or even a KM curve would be inappropriate due to the time dependency in the data.

Are there any techniques I could use to graphically illustrate (like a KM curve) the survival times whilst matching well with the Cox results?

Best Answer

If your 2 groups are clearly differentiated by exposure versus no-exposure at the start of the study, individuals don't change exposure group thereafter, and there are no other covariates in your model, then a Kaplan-Meier analysis is OK. It doesn't assume that the effect of group membership is constant over time, just that the groups have distinguishing characteristics and that group membership doesn't change over time. You can readily get crossing Kaplan-Meier survival curves for 2 groups, with one group having better survival at early times and worse survival at later times. Of course, in that situation there aren't proportional hazards (PH) for group membership.
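To make the point about crossing curves concrete, here is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. The data are hypothetical toy values, and the helper names (`kaplan_meier`, `surv_at`) are illustrative assumptions rather than any library's API. Nothing in the calculation requires the group effect to be constant over time.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  -- follow-up time for each subject
    events -- 1 if the event was observed at that time, 0 if right-censored
    """
    n_events = Counter(t for t, e in zip(times, events) if e == 1)
    curve, surv = [], 1.0
    for t in sorted(n_events):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - n_events[t] / at_risk
        curve.append((t, surv))
    return curve

def surv_at(curve, t):
    """Evaluate the step function S(t); S = 1 before the first event."""
    s = 1.0
    for event_time, value in curve:
        if event_time > t:
            break
        s = value
    return s

# Hypothetical toy data: the exposed group does worse early and better late.
exposed_t = [1, 2, 3, 4, 20, 22, 24, 26]
exposed_e = [1, 1, 1, 0, 1, 1, 0, 0]
unexposed_t = [6, 7, 8, 9, 10, 11, 12, 13]
unexposed_e = [1, 1, 1, 1, 1, 1, 0, 1]

km_exposed = kaplan_meier(exposed_t, exposed_e)
km_unexposed = kaplan_meier(unexposed_t, unexposed_e)

for t in (5, 15):
    print(t, surv_at(km_exposed, t), surv_at(km_unexposed, t))
```

In this toy example the exposed curve lies below the unexposed curve at time 5 and above it at time 15, so the curves cross and the hazards cannot be proportional.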

Comparing median survival times can be tricky. This answer outlines the problems, with a link to further reading. For example, with crossing survival curves you could get the same median survival despite major overall differences in survival patterns between the groups.
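As an illustration of identical medians hiding different survival patterns, the sketch below compares two hypothetical survival functions, one exponential and one Weibull, both tuned to a median of 12 time units. The distributions, the shape parameter, and the median value are assumptions chosen only for the example.

```python
import math

MEDIAN = 12.0  # hypothetical common median survival time

def surv_exponential(t, median=MEDIAN):
    """S(t) for an exponential distribution with the given median."""
    rate = math.log(2) / median
    return math.exp(-rate * t)

def surv_weibull(t, shape=3.0, median=MEDIAN):
    """S(t) for a Weibull distribution with the given shape and median."""
    scale = median / math.log(2) ** (1.0 / shape)
    return math.exp(-((t / scale) ** shape))

for t in (6, 12, 24):
    print(t, round(surv_exponential(t), 3), round(surv_weibull(t), 3))
```

Both curves pass through 0.5 at time 12, yet survival at time 6 is about 0.71 versus 0.92, and at time 24 about 0.25 versus 0.004. A comparison of medians alone would report no difference between these two groups.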

If your model involves additional covariates, the raw Kaplan-Meier curves might be misleading because they don't adjust for those covariates. You are then better served by illustrating model predictions for survival over time between the two groups, for covariate values representative of the population of interest.
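As a sketch of what "model predictions for survival over time" means under a standard Cox model, write x for the exposure indicator, z for the other covariates, and z* for the chosen representative values. These symbols are notation introduced here for illustration, not taken from the question.

```latex
S(t \mid x, z) = \exp\!\left\{-H_0(t)\, e^{\beta x + \gamma^\top z}\right\}
              = S_0(t)^{\exp(\beta x + \gamma^\top z)},
\qquad
H_0(t) = \int_0^t h_0(u)\, du .
```

The two curves to plot are then S(t | x = 1, z = z*) and S(t | x = 0, z = z*). This closed form relies on the coefficients being constant; if the exposure coefficient changes with time, the exponential term no longer factors out of the cumulative hazard and the integral has to be evaluated with the time-varying term inside it.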

I think part of your problem is in the way that you are thinking about the "time-dependent exposure covariate" that you invoked to deal with the violation of proportional hazards. That's typically done in the service of estimating a time-varying regression coefficient for the underlying exposure covariate (see Section 4.2 of the R time dependence vignette). You will get into trouble if you think of that as a time-dependent variable in any real sense; it's a construct to let you apply the simple Cox event-time-by-event-time analysis when the association of the underlying covariate with outcome changes over time. (I'm assuming that your subjects either were exposed or weren't exposed at the start of the study; if exposure changes over time then you do have to model exposure itself as a time-varying covariate.)
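In symbols, the construction can be sketched as follows, with x the fixed baseline exposure indicator and g(t) a chosen function of time. The choice g(t) = log t is a common one but is an assumption here, as the question doesn't say which form was used.

```latex
h(t \mid x) = h_0(t) \exp\{\beta(t)\, x\},
\qquad
\beta(t) = \beta_0 + \beta_1\, g(t),
\qquad
\beta(t)\, x = \beta_0\, x + \beta_1 \,[\, x\, g(t) \,].
```

The product x g(t) is the "time-dependent covariate" that gets added to the model, but exposure x itself never changes. What varies is the hazard ratio for exposed versus unexposed subjects:

```latex
\mathrm{HR}(t) = \exp\{\beta_0 + \beta_1\, g(t)\}.
```

A single reported HR from such a model therefore applies only at the time where g(t) = 0, which is worth keeping in mind when comparing it with the HR from the original fit.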

Also, if you have a very large number of events, it's very easy to get a "statistically significant" violation of PH that doesn't matter in practice. Even if PH is violated, the Cox regression coefficient provides a type of event-averaged hazard ratio that can still be informative.

My suggestions:

If the exposure groups are clearly distinguished at time = 0, their membership doesn't change thereafter, and there aren't covariates in your model, show the Kaplan-Meier curves (which take censoring into account). To demonstrate "statistical significance" of the difference in survival, cite a "log-rank" test, which in this case is just the "score" test reported for the original Cox model (whether PH holds or not).

If there are covariates in your Cox model, show differences in model predictions of survival over time by exposure group at otherwise the same representative covariate values. Explain that as "controlling for" those covariates to highlight the specific exposure-associated differences. The fairest representation would be from the model using the time-varying coefficient. Express "statistical significance" in terms of that of the coefficient(s) associated with exposure.
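For the first suggestion, the two-group log-rank statistic can be sketched in plain Python as below. This is a minimal illustration under stated assumptions: the function name `logrank_test` and the toy data are hypothetical, the variance uses the usual hypergeometric form, and the p-value comes from a chi-square distribution with 1 degree of freedom. In practice a survival-analysis package would normally be used instead.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in pooled if e == 1})

    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        # Numbers at risk just before t, and events at t, in each group.
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e == 1)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e == 1)
        n, d = n_a + n_b, d_a + d_b

        # Observed minus expected events in group A under no group difference.
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical toy data: group A fails early, group B fails late, no censoring.
chi2, p = logrank_test([1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
                       [6, 7, 8, 9, 10], [1, 1, 1, 1, 1])
print(round(chi2, 2), round(p, 4))
```

The statistic accumulates observed-minus-expected events at each event time, which is why it needs no proportional hazards assumption to be valid, although it has the most power when hazards are proportional.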