Survival – Validity of Comparing Nested Models Despite Violated Proportional Hazards Assumption in Cox Models

cox-model, likelihood-ratio, survival

I am trying to understand when Cox models are still informative and useful even though the proportional hazards (PH) assumption is violated, and came across this interesting answer. It includes a link to this nice paper:

Stensrud MJ, Hernán MA. Why Test for Proportional Hazards? JAMA.
2020;323(14):1401-2.
https://jamanetwork.com/journals/jama/fullarticle/2763185

which, in turn, references this paper:

Pak K, Uno H, Kim DH, Tian L, Kane RC, Takeuchi M, et al.
Interpretability of Cancer Clinical Trial Results Using Restricted
Mean Survival Time as an Alternative to the Hazard Ratio. JAMA Oncol.
2017;3(12):1692-6. https://pubmed.ncbi.nlm.nih.gov/28975263/

They discuss the issues of using the hazard ratio (HR) from a Cox model when the PH assumption is violated:

"The limitations concerning this summary measure have been discussed
extensively in the literature. The validity of using the HR depends on
the proportional hazards assumption, that is, the HR for 2 groups is
constant over the entire study period. This assumption is rarely valid
in practice and without this assumption, the resulting HR estimate is
difficult to interpret."

Does this suggest that, although an HR may not be easy to interpret as something meaningful, the model itself is not totally invalid? Let's assume we have not yet explored stratification or interactions between time and the time-invariant covariate of interest.

If we are comparing two nested Cox models and the PH assumption does not hold for one (or both) of them, does this mean that AIC, likelihood ratio tests, etc. are completely useless? Here, we are not interested in a specific HR estimate, which may be uninterpretable (and possibly incorrect) if the PH assumption does not hold. Instead, we just want to know whether the inclusion of a variable (resid.ds in this example) improves overall model fit.

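# Hypothetical R setup (the ovarian data set ships with the survival package):
library(survival)
# Full model, including resid.ds
fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps + resid.ds, data = ovarian)
# Reduced (nested) model, without resid.ds
fit2 <- coxph(Surv(futime, fustat) ~ age + ecog.ps, data = ovarian)
# Likelihood ratio test comparing the two nested models
anova(fit2, fit)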

thanks

Best Answer

AIC can be invalid if there is severe violation of the PH assumption. For the bigger picture, it is rare that non-PH will lead to the wrong answer in determining the direction of a variable's effect, but it will lead to poor estimation of the magnitude of the effect.
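To judge whether a violation is severe in the first place, one common check is the Schoenfeld residual test. The following is only a minimal sketch, assuming the larger model from the question and the `cox.zph` function from the survival package; with a data set as small as `ovarian` the test has little power, so a non-significant result is weak evidence for PH.

```r
library(survival)

# Larger model from the question (ovarian ships with the survival package)
fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps + resid.ds, data = ovarian)

# Tests based on scaled Schoenfeld residuals against transformed time;
# small p-values suggest a departure from PH for that term
zp <- cox.zph(fit)
print(zp)

# Smoothed residual plot for the first term (age):
# a roughly flat curve is consistent with PH
plot(zp[1])
```

The `GLOBAL` row of the printed table is an overall test across all terms; the per-term rows indicate which covariate, if any, drives the departure.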

Differences in restricted mean survival time are not the magic solution that some claim them to be. That is because, unlike the relative measure captured by the hazard ratio, the difference in mean survival time must be covariate-specific. Large differences can be seen for sick patients, while less at-risk patients will have smaller differences.
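The covariate dependence can be illustrated with a rough sketch. It assumes the model from the question, a hypothetical set of patient profiles (ages 50 and 65, `ecog.ps` fixed at 1), an arbitrary truncation time `tau` of 600 days, and a hand-written helper `rmst` that integrates the step survival curve; none of these choices come from the papers cited above. The curves are model-based, so this only shows how the same fitted effect translates into different RMST differences at different baseline risks.

```r
library(survival)

fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps + resid.ds, data = ovarian)

# Hypothetical patient profiles: two ages, each with resid.ds = 1 and 2
profiles <- expand.grid(resid.ds = c(1, 2), age = c(50, 65), ecog.ps = 1)

# Model-based survival curves; sf$surv has one column per profile row
sf <- survfit(fit, newdata = profiles)

# Area under a right-continuous step survival curve from 0 to tau
rmst <- function(time, surv, tau) {
  keep <- time < tau
  t <- c(0, time[keep], tau)
  s <- c(1, surv[keep])
  sum(s * diff(t))
}

tau <- 600  # arbitrary truncation time in days (an assumption)
profiles$rmst <- apply(sf$surv, 2, rmst, time = sf$time, tau = tau)

# RMST difference (resid.ds = 1 minus resid.ds = 2) within each age group
rmst_diff <- with(profiles, tapply(rmst, age, function(r) r[1] - r[2]))
print(profiles)
print(rmst_diff)
```

Comparing the two entries of `rmst_diff` shows how much the absolute difference depends on the profile it is evaluated at, even though the model uses a single coefficient for `resid.ds`.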

A very rational approach that properly handles uncertainties would be to use a Bayesian Cox model in which you put a prior distribution on the departure from PH. This may favor PH for small samples but relax that assumption as the number of events increases. For example, you might form a time-dependent covariate by interacting a baseline covariate with log(t) and put a prior on that 'interaction'. Also see this important and often neglected paper, which presents a related idea but uses an unpenalized frequentist attack.
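As a rough illustration of the log(t) interaction, here is an unpenalized frequentist sketch using the `tt` argument of `coxph`. It is not the Bayesian model described above: there is no prior on the interaction, and the choice of `resid.ds` as the interacting covariate is only an assumption carried over from the question. With so few events in `ovarian`, the estimate of the interaction will be very imprecise.

```r
library(survival)

# PH model from the question
fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps + resid.ds, data = ovarian)

# Same model plus a resid.ds x log(t) term; tt() builds the
# time-dependent covariate at each event time
fit_tt <- coxph(Surv(futime, fustat) ~ age + ecog.ps + resid.ds + tt(resid.ds),
                data = ovarian,
                tt = function(x, t, ...) x * log(t))
print(fit_tt)

# Likelihood ratio test of the interaction (1 degree of freedom),
# computed from the maximized partial log-likelihoods
lrt <- 2 * (fit_tt$loglik[2] - fit$loglik[2])
p_value <- pchisq(lrt, df = 1, lower.tail = FALSE)
print(c(lrt = lrt, p_value = p_value))
```

The coefficient labelled `tt(resid.ds)` measures how the log hazard ratio for `resid.ds` changes per unit of log(t); a Bayesian version would place a prior centered at zero on this coefficient rather than testing it.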