How to do model selection for a Cox proportional hazards model when stratifying

Tags: cox-model, model selection, survival

I am performing backward selection for the Cox Proportional Hazards model with several variables (it is required that I do backward selection). One of these variables (let's call it "A") does not meet the proportionality assumption in the full model. So I would like to consider the possibility of stratifying on variable A. How should I perform backward selection in this scenario?

I am considering performing backward selection on the full model while stratifying on A (call the model selected "model B"). I will then compare fit statistics between model B with stratification and model B without stratification. If I see that the stratification is providing only a marginally better fit, then I will perform another backward selection on model B without stratification.

Two questions:

  1. Does this seem reasonable or is there a better way?
  2. How do I formally test if model B without stratification is better than model B with stratification? A likelihood ratio test would not work because the models are not nested and comparing AIC or BIC doesn't make sense because there are the same number of variables in each model.
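For reference, the two specifications being compared can be written out as follows. The notation is an assumption introduced here, not part of the original post: $s$ indexes the levels of A, $x$ holds the remaining covariates, $D_s$ is the set of observed events in stratum $s$, and $R_s(t)$ is the set of subjects in stratum $s$ still at risk at time $t$. It also assumes that A enters the unstratified model as an ordinary covariate with coefficient $\gamma$, and that there are no tied event times.

$$
h(t \mid x, A) = h_0(t)\,\exp(\beta^\top x + \gamma A) \qquad \text{(unstratified)}
$$

$$
h_s(t \mid x) = h_{0s}(t)\,\exp(\beta^\top x) \qquad \text{(stratified on A)}
$$

$$
L_{\text{strat}}(\beta) = \prod_{s} \prod_{i \in D_s} \frac{\exp(\beta^\top x_i)}{\sum_{j \in R_s(t_i)} \exp(\beta^\top x_j)}
$$

In the stratified version each risk set is formed within a single stratum and A has no coefficient, so the two partial likelihoods are built from different risk sets.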

Edit: any general advice on model selection in Cox PH models is appreciated as well!

Best Answer

Do not perform backward selection. Nor forward. Nor stepwise. Don't do it for Cox or OLS or logistic or any other model. Instead, think about what variables you have, why you have them, what they mean, what theory says about them, how including them affects other variables and so on.

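If you must use some automated method, LASSO or LAR (least-angle regression) have nicer properties: they shrink coefficients rather than making hard keep-or-drop decisions from a sequence of significance tests.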
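To make the LASSO suggestion concrete, here is a minimal from-scratch sketch of an L1-penalized Cox fit that also stratifies on one variable. It is an illustration under stated assumptions, not a library API and not the answerer's own method: the function names (`cox_nll_and_grad`, `soft_threshold`, `lasso_cox`), the toy data in `ROWS`, the fixed step size, and the choice of proximal gradient descent are all hypothetical. For real analyses an established package is the better choice.

```python
import math

def cox_nll_and_grad(beta, rows):
    """Average negative log partial likelihood (Breslow ties) and its gradient.

    rows: list of (time, event, covariates, stratum). Risk sets are formed
    within each stratum, so the stratifying variable gets no coefficient.
    """
    p = len(beta)
    nll, grad = 0.0, [0.0] * p
    for t_i, e_i, x_i, s_i in rows:
        if not e_i:
            continue  # censored rows contribute only through risk sets
        denom, wsum = 0.0, [0.0] * p
        for t_j, _, x_j, s_j in rows:
            if s_j == s_i and t_j >= t_i:  # same stratum, still at risk
                w = math.exp(sum(b * v for b, v in zip(beta, x_j)))
                denom += w
                for k in range(p):
                    wsum[k] += w * x_j[k]
        eta_i = sum(b * v for b, v in zip(beta, x_i))
        nll -= eta_i - math.log(denom)
        for k in range(p):
            grad[k] -= x_i[k] - wsum[k] / denom
    n = len(rows)
    return nll / n, [g / n for g in grad]

def soft_threshold(z, a):
    """Proximal operator of a * |z|: shrink z toward zero by a."""
    return math.copysign(max(abs(z) - a, 0.0), z)

def lasso_cox(rows, lam, step=0.5, iters=500):
    """Proximal gradient descent; the fixed step assumes covariates in [0, 1]."""
    p = len(rows[0][2])
    beta = [0.0] * p
    for _ in range(iters):
        _, grad = cox_nll_and_grad(beta, rows)
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta

# Hypothetical toy data: (time, event, [x1, x2], stratum)
ROWS = [
    (1, 1, [1, 0], "a"), (2, 1, [1, 1], "a"),
    (3, 1, [0, 0], "a"), (4, 0, [0, 1], "a"),
    (1, 1, [1, 1], "b"), (2, 1, [0, 0], "b"),
    (3, 1, [1, 0], "b"), (4, 1, [0, 1], "b"), (5, 0, [0, 0], "b"),
]

beta_heavy = lasso_cox(ROWS, lam=10.0)   # heavy penalty
beta_light = lasso_cox(ROWS, lam=0.05)   # light penalty
```

The penalty weight `lam` controls how many coefficients are shrunk all the way to zero; it would normally be chosen by cross-validation, which this sketch leaves out.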

But no automated method is as good as your brain and your knowledge.

If a variable doesn't meet one of the assumptions, consider modifying the variable, or consider another method of analysis. Also, with a large N, the test of proportionality may be overly sensitive: it can flag departures that are too small to matter in practice. You might also consider interactions, such as one between the variable and time.
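One way to judge the size of a departure from proportionality, rather than relying only on a p-value, is to look at Schoenfeld residuals against time. The sketch below is a simplified illustration for a single covariate without stratification. The function names and toy data are hypothetical, and `beta=0.0` is a placeholder where the fitted coefficient would normally go.

```python
import math

def schoenfeld_residuals(times, events, x, beta):
    """Return sorted (event_time, residual) pairs for one covariate.

    Each residual is the covariate value of the subject who failed minus the
    risk-set average of the covariate, weighted by exp(beta * x).
    """
    residuals = []
    for t_i, e_i, x_i in zip(times, events, x):
        if not e_i:
            continue  # residuals are defined at event times only
        risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
        weights = [math.exp(beta * x_j) for x_j in risk]
        expected = sum(w * x_j for w, x_j in zip(weights, risk)) / sum(weights)
        residuals.append((t_i, x_i - expected))
    return sorted(residuals)

def correlation(u, v):
    """Pearson correlation of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

# Hypothetical toy data
times = [1, 2, 3, 4]
events = [1, 1, 1, 0]
x = [1, 0, 1, 0]

res = schoenfeld_residuals(times, events, x, beta=0.0)
trend = correlation([t for t, _ in res], [r for _, r in res])
```

A `trend` near zero suggests the covariate's effect is roughly constant over time. With a large sample, a small but "significant" trend may still be negligible, so plotting the residuals against time is usually more informative than the test alone.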