Solved – Violation Proportionality Cox model – Repeat analysis


I'm new to survival analysis, but I've been reading some papers and books and I got a nice model.

However, one of the variables (Sit) does not meet the proportionality assumption for the Cox model. Nevertheless, this got me thinking and it makes sense, since that variable should have a time-varying effect. For instance, in this case, I expect that shortly after treatment the risk of relapse is higher for some individuals, while as time passes the risk for all individuals becomes similar. Note (to myself) that a time-varying effect is different from a time-varying covariate. It took me quite some time to fully understand this.
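
To make the distinction concrete, here is how I would write the two cases. This is only a sketch in my own notation, where $h_0(t)$ is the baseline hazard and $x$ stands for a covariate such as Sit:

```latex
% Time-varying effect: the covariate value is fixed, its coefficient changes with time
h(t \mid x) = h_0(t)\,\exp\bigl(\beta(t)\,x\bigr)

% Time-varying covariate: the coefficient is fixed, the covariate value changes with time
h(t \mid x(t)) = h_0(t)\,\exp\bigl(\beta\,x(t)\bigr)
```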

Now, my question is:

I know how to include an interaction between Sit and time (I'm using R). But when I add the interaction, some of the variables that were previously significant no longer are. That got me thinking: should I repeat the variable selection process? Perhaps some of the variables I removed earlier are now significant. Or should I interpret the model as it is and ignore the variables that are no longer significant?

Another question is:

The proportionality assumption of the Cox model is violated, so I have decided to introduce an interaction. Should I now skip the residual analysis, since proportionality no longer holds? How should I assess the fit of the model, then?

I'm really confused about this and it would be great to have some guidance and tips about the best procedure, from now on.

Thank you very much!

Best Answer

Though I'm not an expert in survival analysis, here are my suggestions; I hope they will be helpful.

First of all, selecting variables by their p-values is the wrong approach, especially when the model is meant for statistical inference. You can read about that in multiple sources by searching for "stepwise regression drawbacks". The selection of variables should be based on your domain-specific knowledge. All variables that are relevant (in your opinion) should be present, whether or not their influence is significant. That way you will report the effect of Sit adjusted for the listed variables, and that is the right thing to do. It seems that your research is exploratory rather than confirmatory. In that case, when interpreting the results, you should emphasize effect sizes (model coefficients, hazard ratios) rather than p-values.

As for the violation of the proportionality assumption: by including the interaction between Sit and time, you let the effect of Sit (its log hazard ratio) change linearly with time. So if the true dependence of that effect on time is close to linear, the extended model accounts for the non-proportionality and its assumptions hold. Thus all the usual model diagnostic methods remain relevant.
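
As a sketch of what the interaction does, assume it enters as the product of Sit and some function $g(t)$ of time, with $g(t) = t$ for a plain linear interaction. The symbols $\beta_1$, $\beta_2$, $g$, $\gamma$ and $z$ (the remaining covariates) are notation introduced here for illustration, not taken from your model:

```latex
% Extended Cox model with a Sit-by-time interaction
h(t \mid \mathrm{Sit}, z) = h_0(t)\,\exp\bigl(\beta_1\,\mathrm{Sit} + \beta_2\,\mathrm{Sit}\,g(t) + \gamma^{\top} z\bigr)

% Hazard ratio for a one-unit increase in Sit, now a function of time
\mathrm{HR}_{\mathrm{Sit}}(t) = \exp\bigl(\beta_1 + \beta_2\,g(t)\bigr)
```

If $\beta_2 = 0$, the hazard ratio for Sit is constant and the ordinary proportional hazards model is recovered. The other covariates $z$ still enter with constant coefficients, so their proportionality can be checked in the usual way.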