Survival Analysis Exponential Distribution – Why Are Survival Times Assumed to Be Exponentially Distributed?

Tags: assumptions, distributions, exponential distribution, survival

I am learning survival analysis from this post on UCLA IDRE and got tripped up at section 1.2.1. The tutorial says:

… if the survival times were known to be exponentially distributed, then the probability of observing a survival time …

Why are survival times assumed to be exponentially distributed? It seems very unnatural to me.

Why not normally distributed? Suppose we are investigating some creature's life span (in days) under a certain condition. Shouldn't it be centered around some number with some variance (say, 100 days with a variance of 3 days)?

If we want time to be strictly positive, why not use a normal distribution with a high mean and a very small variance, which would have almost no chance of producing a negative number?

Best Answer

Exponential distributions are often used to model survival times because they are the simplest distributions that can characterize survival or reliability data. They are memoryless, so the hazard function is constant with respect to time, which makes the analysis very simple. This assumption may be valid, for example, for some kinds of electronic components such as high-quality integrated circuits. I'm sure you can think of more examples where the effect of time on the hazard can safely be assumed to be negligible.
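To make the memoryless property concrete, here is a minimal sketch in plain Python. The rate of 0.01 per day (a mean lifetime of 100 days) and the function names are illustrative assumptions, not values from the tutorial. It checks numerically that the hazard h(t) = f(t) / S(t) is the same at every age, and that a unit which has already survived 100 days has the same chance of lasting 30 more days as a brand-new unit.

```python
import math

def exp_survival(t, rate):
    """S(t) = P(T > t) for an exponential distribution with the given rate."""
    return math.exp(-rate * t)

def exp_hazard(t, rate):
    """h(t) = f(t) / S(t); for the exponential this reduces to the rate."""
    density = rate * math.exp(-rate * t)
    return density / exp_survival(t, rate)

def conditional_survival(s, t, rate):
    """P(T > s + t | T > s): chance of lasting t more given survival to s."""
    return exp_survival(s + t, rate) / exp_survival(s, rate)

rate = 0.01  # assumed for illustration: mean lifetime of 100 days

# Hazard evaluated at three different ages; all should equal the rate.
hazards = [exp_hazard(t, rate) for t in (1, 50, 200)]

# Chance a new unit lasts 30 days vs. a 100-day-old unit lasting 30 more.
fresh = exp_survival(30, rate)
aged = conditional_survival(100, 30, rate)

print("hazards:", hazards)
print("fresh:", fresh, "aged:", aged)
```

The two survival probabilities agree, which is exactly why the exponential is a poor fit for anything that wears out or ages: under this model, an old unit is as good as new.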

However, you are correct to observe that this would not be an appropriate assumption in many cases. Normal distributions can be all right in some situations, though negative survival times are meaningless. For this reason, lognormal distributions are often considered. Other common choices include the Weibull, smallest extreme value, largest extreme value, and log-logistic distributions. A sensible choice of model would be informed by subject-area experience and probability plotting. You can also, of course, consider non-parametric modeling.
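As one illustration of why these alternatives are attractive, the Weibull hazard in the usual shape/scale parameterization is h(t) = (k / λ) (t / λ)^(k - 1). The sketch below (plain Python; the scale of 100 days and the three shape values are arbitrary choices for illustration) shows that the hazard falls over time when k < 1, is constant when k = 1 (the exponential case), and rises when k > 1, which is closer to what one might expect for an aging creature.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard h(t) = (k / lam) * (t / lam) ** (k - 1) for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)

times = (10.0, 50.0, 100.0, 200.0)  # ages in days (illustrative)
scale = 100.0                       # assumed scale parameter, in days

# Hazard curves for three assumed shape values.
curves = {
    shape: [weibull_hazard(t, shape, scale) for t in times]
    for shape in (0.5, 1.0, 2.0)
}

for shape, values in curves.items():
    print("shape", shape, "->", values)
```

With shape 1.0 every entry equals 1 / scale, so the exponential is the special case in which age has no effect on risk.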

A good reference for classical parametric modeling in survival analysis is: Meeker, W. Q. and Escobar, L. A. (1998). Statistical Methods for Reliability Data. Wiley.
