Solved – Statistical significance seen with Kaplan-Meier but not with Cox model

cox-modelinterpretationkaplan-meierstatistical significancesurvival

I am currently working on a retrospective analysis where I am looking at patients to see if immune sensitization prior to a procedure has any bearing on outcomes after the procedure. The immune sensitization is a categorical variable (Hi vs Lo).

In regards to outcomes, one of the things I am looking is survival analysis (time to death). I ran a Kaplan-Meier analysis broken down by immune sensitization and there seems to be a statistically significant (p<.05) difference in regards to the time to death.

My questions are these:

  1. I ran a Cox regression analysis with the sensitization status as the only covariate and noted the p-value to be .35. Why is there such a big difference in the statistical significance and which one should I choose? (if it helps, from a visual inspection, only patients in the Hi category died)
  2. When analyzing my patient characteristics, I noted that there was a gender difference in the two sensitization categories (p<.05). I ran a Cox Regression model with the sensitization and gender as the covariates and noted that p-value of 0.955 for the sensitization status and 0.5 for the gender. Am I correct in interpreting that neither covariate has any bearing on the time to death?

As requested, here is some of my output

Original Kaplan Meier analysis (PRA_Categ represents the sensitization status):

Original Kaplam Meier Results (PRA_Categ represents the sensitization status)

The Corresponding COX regression with only one covariate (PRA_Categ):

The corresponding COX regression with only one covariate (the PRA_Categ)

COX regression with two covariates (PRA_Categ and Gender):

COX regression with two covariates (PRA_Categ and Gender)

Best Answer

The answer is fairly straightforward in this case, but it is rather unhelpfully hidden in the SPSS output, and it requires that you understand something about how the parameters are estimated and tested 'under the hood'. Because the low sensitivity group has no deaths, there are problems with the way the parameters are estimated (by attempting to maximize the likelihood). You can see this in the comment below the Iteration History box:

b. At least one coefficient
    is tending to infinity after 3
    iterations

This means that the algorithm's best estimate of the true parameter is infinite, which makes the algorithm crash, and it causes problems with the Wald test of the estimate. It does not cause these problems with a score test of the estimate. That is why in the Omnibus test of model coefficients box, under "Overall (score)" you see a p-value of .024, which matches the log rank test in the analysis above. That is a perfectly valid p-value in your case and you should use it / the KM test (which is a score test).

(Note also that you see the same thing in your second analysis. In fact, the second predictor Gender never even ends up being entered into the model.)


Maximum likelihood estimation, and score, Wald and likelihood ratio tests of ML estimates are very technical topics. If you are new to statistics, I wouldn't suggest you pursue them. However, the following do exist for future readers: