Solved – Trying to understand formula for the Survival Function (survival analysis)

probability, survival

I'm trying to learn the Cox Proportional Hazards Model on my own, and found this link that describes it in clear terms. But when I get to Formula (5) ($S(t) = \exp(-H(t))$) I can't figure out where it comes from. On the author's previous page, he shows that the survival function equals $S(t) = \exp(-H(t))$ if we assume an exponential distribution, but in Cox we don't assume that.

Is $S(t) = \exp(-H(t))$ something that works for any hazard distribution? I can't think of a way to prove or disprove this, and the intuition isn't making sense to me.

Best Answer

All of these terms are standard in actuarial science, and all of them apply to any distribution that has a density (when I have seen these terms in studying for exams, we're almost always talking about distributions that are defined only for nonnegative reals). $H(t)$ is the cumulative hazard function, and for any such distribution it is defined as $$H(t) = \int_0^t h(x) \,dx.$$ The name makes perfect sense with this definition, since we are "adding up" the hazard function up to time $t$.

Now, since $$f(t) = F'(t) = -S'(t)$$ and the hazard function is defined as $h(t) = f(t)/S(t)$, we have $$h(t) = \frac{f(t)}{S(t)} = \frac{-S'(t)}{S(t)} = -\frac{d}{dt} (\ln S(t)).$$

Finally, that means we have $$H(t) = \int_0^t -\frac{d}{dx} (\ln S(x)) \,dx = -\left(\ln S(t) - \ln S(0)\right) = -\ln S(t)$$ since $S(0)$ is usually required to be 1 and thus $\ln S(0) = 0$. Exponentiating both sides gives $S(t) = \exp(-H(t))$. Nothing in this derivation assumes an exponential distribution, so the identity holds for any hazard function $h(t)$.
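If it helps the intuition, the identity can also be checked numerically on a distribution whose hazard is not constant. The sketch below is only an illustration, not part of the linked material: the choice of a Weibull distribution, the parameter values `shape = 1.5` and `scale = 2.0`, and all function names are assumptions made for the example. It builds $h(t) = f(t)/S(t)$ from the density and survival function, integrates it with a simple midpoint rule to get $H(t)$, and compares $\exp(-H(t))$ with $S(t)$ computed directly.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) for a Weibull distribution (assumed example distribution)."""
    return math.exp(-((t / scale) ** shape))


def weibull_density(t, shape, scale):
    """f(t) = -S'(t) for the same Weibull distribution."""
    return (shape / scale) * (t / scale) ** (shape - 1) * weibull_survival(t, shape, scale)


def hazard(t, shape, scale):
    """h(t) = f(t) / S(t), the definition used in the answer."""
    return weibull_density(t, shape, scale) / weibull_survival(t, shape, scale)


def cumulative_hazard(t, shape, scale, steps=10_000):
    """H(t) = integral of h(x) from 0 to t, approximated with the midpoint rule."""
    width = t / steps
    total = sum(hazard((i + 0.5) * width, shape, scale) for i in range(steps))
    return total * width


if __name__ == "__main__":
    shape, scale = 1.5, 2.0  # illustrative values; the hazard increases with t
    for t in (0.5, 1.0, 3.0):
        s_direct = weibull_survival(t, shape, scale)
        s_from_hazard = math.exp(-cumulative_hazard(t, shape, scale))
        print(f"t={t}: S(t)={s_direct:.6f}, exp(-H(t))={s_from_hazard:.6f}")
```

The two columns should agree up to the error of the numerical integration, even though the hazard here grows with $t$ rather than staying constant as it would for an exponential distribution.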