Cox Regression – Interpreting a Two-Covariate Interaction


I am running a Cox PH model to evaluate factors associated with mortality in my dataset.

I have two covariates: one is age, and the other is a parameter X that varies between 0 (best) and 1 (worst).

When I ran the model, I found a negative interaction between these two terms. However, I produced a graph that I find difficult to interpret (see below):

[Figure: plot of the estimated hazard ratio by age and X]

What I find difficult to interpret is this: while I expect X to modulate the effect of age to some degree, I cannot explain how a 100-year-old patient with X=1 can have a massively lower HR than a younger patient with X=1.

Am I missing something in the interpretation of this?

Thank you in advance.

Best Answer

The question you have to ask and answer is whether you have modeled the effects of age and X and their interaction correctly. Your model seems to have treated each as linearly associated with log-hazard, with an interaction term (the product of the two predictors) that is also linearly related to log-hazard. If those linearity assumptions don't hold, predictions from the model will be unrealistic, particularly at extreme predictor values. Age and X might instead be associated with outcome in ways that are better captured by splines.
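As a sketch of what that model form implies, assuming age and X enter linearly with a single product term (the coefficient symbols below are notation introduced here, not taken from the question):

$$
\log h(t \mid \text{age}, X) = \log h_0(t) + \beta_{\text{age}}\,\text{age} + \beta_X\,X + \beta_{\text{int}}\,(\text{age} \times X)
$$

Under this form, two patients with the same X whose ages differ by $\Delta$ years have hazard ratio

$$
\text{HR} = \exp\{(\beta_{\text{age}} + \beta_{\text{int}}\,X)\,\Delta\}
$$

so the log-HR keeps growing (or shrinking) in proportion to the age gap, with nothing in the model to cap it. That is one way extreme ages can produce implausible predictions.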

Besides doing the standard tests for evaluating the linearity and proportional hazards assumptions, there are a couple of things you could look at.

For one, there's something interesting happening at an X value of about 0.45. Below that value, higher age is associated with higher HR as expected. Above that value, the association of age with HR is reversed. Examine what's going on there. Does that type of behavior make sense based on your understanding of the subject matter?
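A minimal numeric sketch of that reversal, assuming a linear-by-linear interaction model. The coefficients are hypothetical, chosen only so that the sign of the age effect flips near X = 0.45; they are not estimates from the question's data, and the function names are illustrative.

```python
import math

# Hypothetical coefficients, NOT estimates from the question's data.
BETA_AGE = 0.045   # log-HR per extra year of age when X = 0
BETA_X = 5.0       # coefficient of X (main effect, i.e. extrapolated to age 0)
BETA_INT = -0.10   # negative age-by-X interaction


def linear_predictor(age, x):
    """Log-hazard relative to baseline under the assumed model."""
    return BETA_AGE * age + BETA_X * x + BETA_INT * age * x


def age_slope(x):
    """Log-HR per extra year of age at a given X."""
    return BETA_AGE + BETA_INT * x


def crossover_x():
    """X value at which the age effect changes sign."""
    return -BETA_AGE / BETA_INT


def hazard_ratio(age_a, age_b, x):
    """HR of a patient aged age_a versus one aged age_b, both at the same X."""
    return math.exp(linear_predictor(age_a, x) - linear_predictor(age_b, x))


if __name__ == "__main__":
    print("age effect changes sign at X =", crossover_x())
    print("HR, age 100 vs 40 at X = 0:", hazard_ratio(100, 40, 0.0))
    print("HR, age 100 vs 40 at X = 1:", hazard_ratio(100, 40, 1.0))
```

With a negative interaction, the age slope is positive below the crossover and negative above it, so at X = 1 the older patient is predicted to have the lower hazard. Whether that is plausible is a subject-matter question, not something the arithmetic can settle.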

For another, look at the actual joint distribution of age and X values in your data set. Do you actually have cases with low age and high X values? Do you have cases with age values as high as 100? It's possible that you are asking the model to make predictions about combinations of predictor values that don't really exist.
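One rough way to check this is to tabulate the data on a coarse age-by-X grid and see whether the combination you are predicting for falls in a populated cell. The sketch below uses made-up records, and the bin widths and minimum count are arbitrary assumptions to adjust for your data.

```python
from collections import Counter


def cell(age, x, age_width=10, x_width=0.25):
    """Map an (age, X) pair to a coarse grid cell."""
    age_bin = int(age // age_width) * age_width
    n_x_bins = int(round(1 / x_width))
    x_bin = min(int(x / x_width), n_x_bins - 1)  # X = 1 goes in the top bin
    return age_bin, x_bin


def cell_counts(records):
    """Count observations per grid cell; records is an iterable of (age, X)."""
    return Counter(cell(age, x) for age, x in records)


def is_supported(age, x, counts, min_n=5):
    """True if the cell containing (age, X) holds at least min_n observations."""
    return counts[cell(age, x)] >= min_n


# Made-up records for illustration only.
records = (
    [(45, 0.10)] * 8
    + [(38, 0.20)] * 7
    + [(72, 0.30)] * 6
    + [(85, 0.90)] * 5
)
counts = cell_counts(records)

if __name__ == "__main__":
    for age, x in [(85, 0.9), (100, 1.0), (40, 1.0)]:
        print(age, x, "supported:", is_supported(age, x, counts))
```

A prediction for a cell with few or no observations is an extrapolation driven by the assumed functional form, so it deserves less trust than one inside the bulk of the data.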
