Breslow or similar estimators in Cox model

survival

I read that to assess the proportional hazards assumption, you can compare the $\ln(-\ln(S(t)))$ curves. However, the whole point of the Cox proportional hazards model is that you don't have to estimate the survival function or baseline hazard.

That said, it seems like the idea is to estimate the baseline hazard function with the Breslow estimator.

What I don't understand is why you can't just use the Kaplan-Meier estimator for the baseline hazard. And if you can, then does that mean you can use the Breslow method for a nonparametric estimator of the survival function in place of Kaplan-Meier? I'm just really confused about what the connection is between the Breslow estimator and the Cox model basically. I don't see why you need a "special" method in the Cox situation.

Or maybe I'm assuming baseline hazard (Cox) gives the same survival function as what you estimate with Kaplan-Meier in the absence of covariates, but this should be true when, for example, a Cox model has one binary covariate. Estimating the baseline hazard (covariate=0) would be the same as estimating the Kaplan-Meier curve for the covariate = 0 group, would it not?

Best Answer

Estimating the baseline hazard (covariate=0) would be the same as estimating the Kaplan-Meier curve for the covariate = 0 group, would it not?

No, and I think that this is the source of your general confusion.

With a Cox proportional hazards model, you implicitly assume that there is a single baseline cumulative hazard $H_0(t)$ that applies to all cases, with the covariates $z$ and their regression coefficients $\beta$ leading to a covariate-specific cumulative hazard $H(t;z) = H_0(t)e^{\beta z}$. As explained, for example, on this page, the Breslow-Aalen estimate of the baseline cumulative hazard is made only after you have fit the Cox model and obtained the regression coefficients.

Thus a Cox model forces all cases to have cumulative hazard curves of the same shape over time, differing only by a multiplicative factor determined by the covariate values and the associated hazard ratios.
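To make the dependence on the fitted coefficients concrete, here is one common way to write the Breslow estimate. The notation is my own assumption and not from the question: $t_i$ are the distinct event times, $d_i$ is the number of events at $t_i$, $R(t_i)$ is the set of cases still at risk just before $t_i$, and $\hat\beta$ is the coefficient from the fitted Cox model.

$$\hat H_0(t) = \sum_{t_i \le t} \frac{d_i}{\sum_{j \in R(t_i)} e^{\hat\beta z_j}}$$

Each denominator sums over everyone still at risk, whatever their covariate value, so the estimate pools information from all groups. With $\hat\beta = 0$ it reduces to the Nelson-Aalen estimator computed on the whole sample.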

So even with a binary predictor in a Cox model, the baseline survival curve implied by the model will not coincide with the corresponding non-parametric estimator* for the covariate = 0 group. It will be a compromise between the shapes of the two groups' curves, and it will step down at every event time in the data, not just at the event times of the covariate = 0 group.


*Note that the Breslow and Kaplan-Meier non-parametric survival-curve estimates are determined differently. See a survival analysis text like Therneau and Grambsch, or this web page.
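The binary-covariate point above can be checked on a small example. The following is a minimal sketch in plain Python, not a reference implementation: the toy data set is made up, the function names `fit_cox_beta`, `breslow_baseline_survival` and `kaplan_meier` are my own, and the Cox fit assumes a single covariate with no tied event times.

```python
import math

# Hypothetical toy data: (time, event indicator, binary covariate z)
data = [
    (1, 1, 1), (2, 1, 0), (3, 1, 1), (4, 0, 1), (5, 1, 0),
    (7, 1, 1), (9, 0, 0), (10, 1, 1), (12, 1, 0),
]

def fit_cox_beta(data, n_iter=25):
    """Newton-Raphson on the Cox partial likelihood (one covariate, no ties)."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for t_i, d_i, z_i in data:
            if not d_i:
                continue  # censored cases contribute only through risk sets
            # Risk set at t_i: everyone with follow-up time >= t_i
            risk = [(z, math.exp(beta * z)) for t, _, z in data if t >= t_i]
            s0 = sum(w for _, w in risk)
            s1 = sum(z * w for z, w in risk)
            s2 = sum(z * z * w for z, w in risk)
            score += z_i - s1 / s0               # first derivative term
            info += s2 / s0 - (s1 / s0) ** 2     # negative second derivative term
        beta += score / info
    return beta

def breslow_baseline_survival(data, beta):
    """Return [(event time, S0(t))], with S0 = exp(-H0) and H0 the Breslow estimate."""
    h0, curve = 0.0, []
    for t_i in sorted({t for t, d, _ in data if d}):
        d_i = sum(1 for t, d, _ in data if d and t == t_i)
        # Denominator pools ALL cases at risk, weighted by exp(beta * z)
        denom = sum(math.exp(beta * z) for t, _, z in data if t >= t_i)
        h0 += d_i / denom
        curve.append((t_i, math.exp(-h0)))
    return curve

def kaplan_meier(data):
    """Return [(event time, S(t))] from the product-limit estimator."""
    s, curve = 1.0, []
    for t_i in sorted({t for t, d, _ in data if d}):
        d_i = sum(1 for t, d, _ in data if d and t == t_i)
        n_i = sum(1 for t, _, _ in data if t >= t_i)
        s *= 1 - d_i / n_i
        curve.append((t_i, s))
    return curve

beta = fit_cox_beta(data)
baseline = breslow_baseline_survival(data, beta)
km_group0 = kaplan_meier([row for row in data if row[2] == 0])

print("Cox/Breslow baseline steps at:", [t for t, _ in baseline])
print("KM (z = 0 only) steps at:     ", [t for t, _ in km_group0])
```

In this sketch the Cox baseline curve steps at every event time in the data, including those contributed by the covariate = 1 group, while the Kaplan-Meier curve for the covariate = 0 group steps only at that group's own event times.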
