R Survival Analysis – Interpretation of coxph: rsq, cox.zph, Robustness, and Concordance


I would like an explanation of the following parts of the coxph output (library(survival)):

  1. Concordance: what counts as a good concordance?
  2. What does it mean when the output shows robust = 2.61, p = 0.1 alongside a logrank p = 0.006?
  3. What is a good rsq (summary(coxph.model)$rsq)?
  4. How should I interpret cox.zph(coxph.model)? The output has a chisq, df, and p value for every variable in the multivariable coxph.model, plus a GLOBAL row.

This is a sample output (I've changed the variable names for confidentiality). An interpretation of this particular output would be helpful as guidance, if possible.

[Image: screenshot of the model output, not reproduced here]

Thank you!

Best Answer

Point 1. Quoting from Section 20.10 of Frank Harrell's Regression Modeling Strategies:

The c index [concordance] is the proportion of all pairs of subjects whose survival time can be ordered such that the subject with the higher predicted survival is the one who survived longer.

So a concordance of 0.5 is what you get if a model can't distinguish survival times at all. What's "good" above that depends on the nature of the study.
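As a minimal sketch of where this number comes from in practice, the following fits a Cox model to the lung data that ships with the survival package and extracts the concordance. The data set and the covariates are illustrative stand-ins, not your model.

```r
library(survival)

# Illustrative model on the built-in lung data (an assumption, not the asker's data)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# summary() stores the concordance and its standard error as a length-2 vector
conc <- summary(fit)$concordance
c_index <- unname(conc[1])
c_se <- unname(conc[2])

# 0.5 means no discrimination; 1 means every usable pair is ordered correctly
print(round(c(C = c_index, se = c_se), 3))
```

The standard error is useful here: a concordance whose interval comfortably excludes 0.5 is at least discriminating better than chance, even if the point estimate looks modest.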

Point 2. Standard significance tests assume that observations are independent. Your use of an id variable indicates that some individuals (or groups of individuals with the same id value) contributed to multiple observations. Perhaps a single individual could experience more than 1 event. The "robust" standard error estimate takes that lack of independence into account, generally leading to wider confidence intervals and higher p-values. As the output from the summary says:

The likelihood ratio and score [logrank] tests assume independence of observations within a cluster, the Wald and robust score tests do not.
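To see the two kinds of standard errors and tests side by side, here is a hedged sketch using the bladder data from the survival package, where each patient (id) contributes several rows. The model formula is only an illustration of clustered data, not a reconstruction of your model.

```r
library(survival)

# bladder: recurrent bladder tumours, several rows per patient (id).
# cluster(id) requests the robust (sandwich) variance; the formula is illustrative.
fit_robust <- coxph(
  Surv(stop, event) ~ rx + size + number + strata(enum) + cluster(id),
  data = bladder
)

s <- summary(fit_robust)

# Model-based and robust standard errors for each coefficient
se_table <- s$coefficients[, c("se(coef)", "robust se")]
print(se_table)

# Likelihood ratio and score (logrank) tests assume independent observations;
# the Wald test and the robust score test use the robust variance
print(s$logtest)
print(s$sctest)
print(s$waldtest)
print(s$robscore)
```

When the robust and logrank p-values disagree, as in your output, the robust one is the one that respects the clustering.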

Point 3. See the answer to Point 1. What's good depends on the nature of the study. I find concordance and measures of model validation and calibration to be more useful than $R^2$ values. I strongly recommend learning to use the tools in Frank Harrell's rms package to evaluate model quality.
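For orientation, the sketch below computes a generalized $R^2$ of the form $1 - \exp(-LR/n)$ directly from the fitted log-likelihoods, together with its upper bound. My understanding is that this is the quantity summary.coxph has reported as rsq, but treat that as an assumption and check your installed version; the lung model is again only an illustration.

```r
library(survival)

# Illustrative model on the built-in lung data (an assumption, not the asker's data)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# fit$loglik holds the partial log-likelihood of the null and the fitted model
lr_stat <- 2 * (fit$loglik[2] - fit$loglik[1])  # likelihood ratio statistic
n <- fit$n                                       # number of observations used

rsq <- 1 - exp(-lr_stat / n)               # generalized R^2
max_rsq <- 1 - exp(2 * fit$loglik[1] / n)  # largest value it can reach (below 1)

print(c(rsq = rsq, max_rsq = max_rsq))
```

Because the maximum attainable value is below 1 and depends on the amount of censoring, the raw number is hard to compare across studies, which is part of why concordance and calibration are easier to interpret.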

Point 4. This is covered at the end of Chapter 3 (Section 3.5.2, "Score tests") of the main survival vignette.

The cox.zph function checks proportional hazards for a fitted Cox model directly...

for individual predictors or for the model as a whole (GLOBAL). A low p-value indicates evidence against PH. In your case, it looks like age might violate PH. You might fix that by modeling age flexibly with a spline (e.g., rcs() in the rms package), as incorrect specification of the functional form of a continuous predictor can show up as an apparent violation of PH.
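A minimal sketch of that workflow, again on the built-in lung data rather than your model: run cox.zph on a fit with linear age, then refit with a spline for age and test again. I use ns() from the base splines package here instead of rcs() only to avoid an extra dependency; the choice of 3 degrees of freedom is arbitrary.

```r
library(survival)
library(splines)

# Fit with age entered linearly, then test proportional hazards
fit_linear <- coxph(Surv(time, status) ~ age + sex, data = lung)
zph_linear <- cox.zph(fit_linear)
print(zph_linear)  # one row per term plus a GLOBAL row

# Refit with a natural cubic spline for age (df = 3 is an arbitrary choice)
fit_spline <- coxph(Surv(time, status) ~ ns(age, df = 3) + sex, data = lung)
zph_spline <- cox.zph(fit_spline)
print(zph_spline)

# plot(zph_linear) draws the scaled Schoenfeld residuals against time,
# which shows how a coefficient appears to change over follow-up
```

If the p-value for age is no longer small after the spline refit, the apparent PH violation was probably a functional-form problem. If it stays small, a time-varying effect or stratification may be needed.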