R Cox Model – How to Interpret Contradictory Schoenfeld Test Results and Plots?

cox-modelrschoenfeld-residuals

I have assembled a multivariable Cox proportional hazard model like this:

res.cox <-
  coxph(Surv(time, status) ~ 
          AGE +
          B + 
          C, 
        data = data)

When I use cox.zph(res.cox) I get a violation for proportional hazard assumptions for the first two variables (and a global violation):

                                 chisq df       p
AGE                           1.41e+01  1 0.00017
B                             2.10e+02  1 < 2e-16
C                             2.99e-03  1 0.95638
GLOBAL                        2.15e+02  3 < 2e-16

But when I draw graphs of the scaled Schoenfeld residuals against the transformed time using ggcoxzph(cox.zph(res.cox))[1], the CI overlaps 0 all the way (first variable for impression, its the same for the others):
enter image description here

It seems to me that this contradicts every issue and article I found:

What am I doing wrong?

BTW, this is the summary for res.cox:

  n= 2521, number of events= 2424 

                                   coef exp(coef)  se(coef)       z Pr(>|z|)    
AGE                            0.005123  1.005136  0.001574   3.254  0.00114 ** 
B                              -0.677403  0.507935  0.039421 -17.184  < 2e-16 ***
B                              -0.122768  0.884469  0.051108  -2.402  0.01630 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                              exp(coef) exp(-coef) lower .95 upper .95
AGE                              1.0051     0.9949    1.0020    1.0082
B                                0.5079     1.9688    0.4702    0.5487
C                                0.8845     1.1306    0.8002    0.9777

Concordance= 0.648  (se = 0.007 )
Likelihood ratio test= 443.2  on 3 df,   p=<2e-16
Wald test            = 315.5  on 3 df,   p=<2e-16
Score (logrank) test = 304.6  on 3 df,   p=<2e-16

UPDATED 14/2/2023:

As @EdM kindly mentioned, the problem was a bug in the 'survminer' package. Luckily, the repo has now been updated, and the PR with the solution was merged.

Example

plot(cox.zph(res.cox), var='AGE') shows:
enter image description here
and ggcoxzph(cox.zph(res.cox))[1] shows:
enter image description here

The CRAN package has yet to be updated, you can install the latest version from GitHub as mentioned in the repo:

if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/survminer", build_vignettes = FALSE)
library("survminer")

Best Answer

The main thing you are doing wrong is using a software package that might still have a long-standing coding error. That error makes its Schoenfeld residual plots essentially useless. See this page for an explanation and a way to mend the code, if you want to stay with that package. I generally stick with the standard plotting provided by the survival package.

The other issue is your very large data set with over 2000 events. With that many events you can easily have "statistically significant" violations of proportional hazards that are of little practical significance. The "practical significance" is a judgment call that will be easier to make once you have a useful plot.

Also, an improperly modeled continuous predictor can show proportional hazards violations that are fixed when the predictor is modeled appropriately (e.g., with a regression spline). That could be what's going on with AGE, and with B if B is continuous.