Cox Model – Addressing Violation of Proportional Hazard Assumption and Interaction with Time

cox-model, proportional-hazards, schoenfeld-residuals

I will try to keep my question as short as possible.

For my thesis I am researching whether a risk score can predict graft failure in a cohort of $596$ patients over the course of $10$ years. (The risk score is not time-varying.)

I want to fit a Cox regression; however, the Schoenfeld residuals test is significant ($p = 0.038$), which suggests that the proportional hazards assumption has been violated.

I have tried to solve this by adding an interaction term with the log of time as shown below:

. stcox t_risk10_perc risk10_perc

t_risk10_perc = log(time variable of follow-up) * risk10_perc

(Image: Cox regression results.)

My questions are:

  • Am I doing this correctly?
  • Should I look at the t_risk10_perc or at the risk10_perc hazard ratio?

Best Answer

Unless the syntax of your software differs substantially from that of the coxph function in the R survival package, your approach is not correct. You are, however, in good company in trying to fix a proportional-hazards problem this way. A simple modification can correctly accomplish what you want, at least with coxph.

As I understand your code, your definition

t_risk10_perc = log(time variable of follow-up) * risk10_perc

simply multiplies, for each case, the value of the covariate risk10_perc by the log of the survival/censoring time for that case. As the vignette on "Using Time Dependent Covariates" in the R survival package puts it:

This mistake has been made often enough th[at] the coxph routine has been updated to print an error message for such attempts. The issue is that the above code does not actually create a time dependent covariate, rather it creates a time-static value for each subject based on their value for the covariate time; no differently than if we had constructed the variable outside of a coxph call. This variable most definitely breaks the rule about not looking into the future, and one would quickly find the circularity: large values of time appear to predict long survival because long survival leads to large values for time.
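To make the distinction in the quoted passage concrete, here is a minimal, language-neutral sketch in Python. It is not how coxph or any other package is implemented; the function name `expand_with_log_time`, the tuple layout, and the toy data are all made up for illustration. The idea is that a genuinely time-dependent term has to be evaluated at every event time at which a subject is still at risk, whereas the time-static version uses only the subject's own final follow-up time.

```python
import math

def expand_with_log_time(subjects):
    """Split each subject's follow-up at every distinct event time.

    subjects: list of (subject_id, follow_up_time, event, x) tuples,
    with event = 1 for failure and 0 for censoring (hypothetical layout).
    Returns rows (subject_id, start, stop, event, x, x * log(stop)).
    """
    # Only times at which someone fails enter the Cox partial likelihood.
    event_times = sorted({t for _, t, e, _ in subjects if e == 1})
    rows = []
    for sid, follow_up, event, x in subjects:
        start = 0.0
        for t in event_times:
            if t > follow_up:
                break  # subject is no longer at risk
            # The event flag is 1 only in the interval ending at the
            # subject's own failure time.
            failed_here = int(event == 1 and t == follow_up)
            rows.append((sid, start, t, failed_here, x, x * math.log(t)))
            start = t
    return rows

# Toy data, invented for illustration: (id, follow-up time, event, risk score)
subjects = [("A", 2.0, 1, 10.0), ("B", 5.0, 1, 20.0), ("C", 7.0, 0, 30.0)]

# Time-static version (the problematic one): one value per subject,
# computed from that subject's own final follow-up time.
static = {sid: x * math.log(t) for sid, t, _, x in subjects}

for row in expand_with_log_time(subjects):
    print(row)
```

In the expanded rows, subject B carries a different value of the interaction at each event time while still at risk, so the term no longer encodes how long B eventually survived. The `static` dictionary, by contrast, bakes each subject's outcome time into a single covariate value, which is the circularity the vignette warns about.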

As explained in the vignette, the survival package provides a time-transform facility (the tt argument of coxph), with which you can define an arbitrary function of continuous time (not just of the single observed event/censoring time) and covariate values to accomplish this type of analysis. This will provide coefficient estimates both for the covariate itself and for its product with your function of time, and you will need to interpret the two together. You will have to check whether your software provides a similar time-transform facility.
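As for which hazard ratio to look at: assuming the time function is $\log t$, as in your attempt, and writing $x$ for risk10_perc, a correctly specified model of this kind has the form below. The symbols $\beta_1$ and $\beta_2$ are my own notation for the two coefficients.

$$h(t \mid x) = h_0(t)\,\exp\{\beta_1 x + \beta_2\, x \log t\}$$

The hazard ratio for a one-unit increase in $x$ is then a function of time rather than a single number:

$$\mathrm{HR}(t) = \exp\{\beta_1 + \beta_2 \log t\}$$

So neither coefficient should be read alone. $\exp(\beta_1)$ is the hazard ratio only at $t = 1$ (where $\log t = 0$), in whatever units time is measured, and $\exp(\beta_2)$ is the factor by which that hazard ratio changes for each one-unit increase in $\log t$. In practice you would report $\mathrm{HR}(t)$ at a few clinically meaningful time points. In Stata, I believe the tvc() and texp() options of stcox serve the same purpose as tt in coxph, but please verify that against the Stata documentation.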
