Survival Analysis – How to Interpret CoxPH Survival Regression

cox-model, python, r, survival

I have time-to-event data in which about 20% of the events are observed, so about 80% are censored. Using this data I fitted a CoxPH model with the Python lifelines package and got the model summary below:

Iteration 1: norm_delta = 0.71969, step_size = 0.95000, ll = -11597.41595, newton_decrement = 487.49971, seconds_since_start = 0.2
Iteration 2: norm_delta = 0.41603, step_size = 0.95000, ll = -11042.74025, newton_decrement = 74.48468, seconds_since_start = 0.5
Iteration 3: norm_delta = 0.19189, step_size = 0.95000, ll = -10958.25644, newton_decrement = 8.61687, seconds_since_start = 0.8
Iteration 4: norm_delta = 0.03981, step_size = 0.95000, ll = -10949.05062, newton_decrement = 0.28285, seconds_since_start = 1.0
Iteration 5: norm_delta = 0.00311, step_size = 0.95000, ll = -10948.76412, newton_decrement = 0.00164, seconds_since_start = 1.2
Iteration 6: norm_delta = 0.00016, step_size = 0.95000, ll = -10948.76249, newton_decrement = 0.00000, seconds_since_start = 1.5
Convergence completed after 6 iterations.
<lifelines.CoxPHFitter: fitted with 7350 observations, 5998 censored>
      duration col = T
         event col = label
number of subjects = 7350
  number of events = 1352
    log-likelihood = -10948.762
  time fit was run = 2018-12-17 10:48:35 UTC

---
         coef  exp(coef)  se(coef)        z      p  lower 0.95  upper 0.95     
AGE   -0.0001     0.9999    0.0000 -10.6696 0.0000     -0.0001     -0.0001  ***
GRADE -1.0068     0.3654    0.0450 -22.3973 0.0000     -1.0949     -0.9187  ***
---
Signif. codes: 0 '***' 0.0001 '**' 0.001 '*' 0.01 '.' 0.05 ' ' 1

Concordance = 0.797
Likelihood ratio test = 1297.307 on 2 df, p=0.00000

My objective is to use this model to predict when the event/action is going to happen for a new sample.

I used the predict_expectation() method, which the docs describe as follows:

Compute the expected lifetime, E[T], using covariates X.

Using this method I get a much larger lifetime than expected in most cases. For example, instead of ~180 days it gives ~1200.

I don't understand why I'm getting such huge values, even though my concordance value is quite decent.

Am I going in the right direction?
Can I predict when the event/action is going to happen for a new sample using CoxPH?

Best Answer

Your problem is probably that, either by default or implicitly, you have obtained predictions for covariate values outside the range seen in your data. The default for this kind of prediction is typically the survival curve obtained from the baseline hazard function, which would be the predicted survival for a subject with age 0 and grade 0.
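To see how much the covariate values matter, here is a minimal sketch of the proportional hazards relation S(t | x) = S0(t) ** exp(beta . x), using the two coefficients from the summary in the question. The helper names and the baseline survival value of 0.5 are hypothetical and chosen only for illustration; this is not lifelines code, and it assumes the baseline refers to AGE = 0 and GRADE = 0.

```python
import math

# Coefficients copied from the model summary in the question.
COEF = {"AGE": -0.0001, "GRADE": -1.0068}


def relative_hazard(covariates, reference=None):
    """Hazard of a subject relative to the reference subject: exp(beta . (x - x_ref)).

    The reference defaults to all covariates equal to 0 (an assumption).
    """
    reference = reference or {name: 0.0 for name in COEF}
    linear_predictor = sum(
        COEF[name] * (covariates[name] - reference[name]) for name in COEF
    )
    return math.exp(linear_predictor)


def survival_at(baseline_survival, covariates, reference=None):
    """S(t | x) = S0(t) ** relative_hazard, under proportional hazards."""
    return baseline_survival ** relative_hazard(covariates, reference)


# Hypothetical baseline survival at some fixed time t (not taken from the question).
s0 = 0.5
at_baseline = survival_at(s0, {"AGE": 0, "GRADE": 0})
at_grade_3 = survival_at(s0, {"AGE": 0, "GRADE": 3})
print(f"baseline subject: {at_baseline:.3f}, GRADE=3 subject: {at_grade_3:.3f}")
```

Because the GRADE coefficient is strongly negative, moving a few grades away from the reference changes the predicted survival curve a lot, so it is worth checking which covariate values a prediction was actually made for.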

Another possible explanation is that you are predicting survival in a group with a 100% censoring rate. If you sample kids from newborn status, most of them are likely censored, as per your description of the dataset, so the model is doing some stupid guessing as to how long they survive, because you asked it to. The only reason the model produces an expected survival for them at all is that it relies on the proportional hazards assumption to obtain non-singular hazard ratios for this group, together with a survival function that descends very little from 100% over the duration of the study.
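A survival function that barely descends also explains inflated expected lifetimes on its own. As far as I understand the lifelines documentation, predict_expectation() approximates E[T] as the area under S(t) with the trapezoidal rule, and warns that the result is too large or meaningless when S(t) does not converge to 0. The sketch below illustrates the effect in plain Python, not with lifelines itself. The 1500-day horizon, the constant hazard, and the final survival of 0.8 are all hypothetical values, loosely mirroring 80% censoring.

```python
import math


def expected_lifetime(times, survival):
    """Trapezoidal area under S(t), restricted to the observed timeline."""
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (survival[i] + survival[i - 1]) * width
    return area


def median_lifetime(times, survival):
    """First time at which S(t) <= 0.5, or infinity if that never happens."""
    for t, s in zip(times, survival):
        if s <= 0.5:
            return t
    return math.inf


# Hypothetical study: followed for 1500 days, 80% still event-free at the end.
horizon = 1500
times = list(range(horizon + 1))
hazard = -math.log(0.8) / horizon  # constant hazard giving S(horizon) = 0.8
survival = [math.exp(-hazard * t) for t in times]

e_t = expected_lifetime(times, survival)
median = median_lifetime(times, survival)
print(f"area under S(t) up to day {horizon}: {e_t:.0f}")
print(f"median lifetime: {median}")
```

Here the "expected lifetime" lands close to the study horizon simply because the curve never gets near 0, and the median is undefined within the follow-up window. Neither number says when a typical event happens.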

You haven't given enough detail to confirm or rule out either explanation.