Why does the proof that the Kaplan-Meier estimate is unbiased not work?

kaplan-meier, probability, survival

I know that the Kaplan-Meier estimator is biased because my textbook says so. However, I don't understand why the following proof doesn't work:

Let $\hat{S}(t)$ be the Kaplan-Meier estimate for the survival function $S(t)\equiv P(T_i > t)$ where $T_i$ are iid failure times. Let $\hat{\Lambda}(u)$ be the Nelson-Aalen estimator for the cumulative hazard function $\Lambda(u)$.

It is known that $\frac{\hat{S}(t)}{S(t)}-1 = -\int\limits_{0}^{t}\frac{\hat{S}(u^-)}{S(u)}d\{\hat{\Lambda}(u)-\Lambda(u)\}$.

Now, $\int\limits_{0}^{t}\frac{\hat{S}(u^-)}{S(u)}d\{\hat{\Lambda}(u)-\Lambda(u)\}$ is a martingale because $\hat{\Lambda}(u)-\Lambda(u)$ is a martingale and because $\frac{\hat{S}(u^-)}{S(u)}$ is a predictable process.

So, $\mathbb{E}[\frac{\hat{S}(t)}{S(t)}-1]=0 \Rightarrow \mathbb{E}[\frac{\hat{S}(t)}{S(t)}]=1 \Rightarrow \mathbb{E}[\hat{S}(t)]={S(t)}$ since $S(t)$ is a non-stochastic function.

Best Answer

The flaw in your argument is that $\hat\Lambda(t)-\Lambda(t)$ is not a martingale for all times $t$. It is only a martingale up to the time $T$ at which the risk set becomes empty, that is, when the last survivor either dies or is censored (i.e. drops out of the study). Here $T$ is the last observed time in the sample, not one of the failure times $T_i$. After that, $\Lambda(t)$ continues to increase but $\hat\Lambda(t)$ does not.
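
To make the failure point concrete, here is a sketch of the standard counting-process decomposition. The symbols $Y(u)$ (the number of subjects still at risk just before time $u$) and $M(t)$ are notation introduced here, not taken from the question, and the sketch assumes independent censoring, a continuous $S$ with $S(t)>0$, and the convention that $\hat S$ stays constant after the last observation. The process that is a martingale for all $t$ is

$$M(t) = \hat\Lambda(t) - \int_0^t 1\{Y(u)>0\}\,d\Lambda(u),$$

so that

$$\hat\Lambda(t)-\Lambda(t) = M(t) - \int_0^t 1\{Y(u)=0\}\,d\Lambda(u).$$

The extra term is zero up to time $T$ and non-increasing afterwards. Substituting into the identity from the question and taking expectations then gives

$$\mathbb{E}\left[\frac{\hat S(t)}{S(t)}\right]-1 = \mathbb{E}\left[\int_0^t \frac{\hat S(u^-)}{S(u)}\,1\{Y(u)=0\}\,d\Lambda(u)\right] \ge 0.$$

The right-hand side is zero only if $\hat S(u^-)=0$ whenever the risk set is empty, so under these assumptions any bias is upward.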

Now, if the last survivor in the sample actually dies, then at this point it doesn't matter that $\hat\Lambda(t)-\Lambda(t)$ stops being a martingale, because $\hat S(t)=0$ and therefore $\hat S(t)/S(t)$ will continue to be a martingale. So without censored data, the K-M estimator is in fact unbiased (it is pretty easy to show this directly without stochastic processes).
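
The direct argument can be sketched as follows, writing $n$ for the sample size (a symbol introduced here) and assuming no ties. Without censoring, the $k$-th death multiplies the estimate by $1-\frac{1}{n-k+1}$, and the product telescopes:

$$\hat S(t) = \prod_{k=1}^{d(t)} \frac{n-k}{n-k+1} = \frac{n-d(t)}{n} = \frac{1}{n}\sum_{i=1}^{n} 1\{T_i>t\},$$

where $d(t)$ is the number of deaths up to time $t$. This is the empirical survival function, so

$$\mathbb{E}[\hat S(t)] = \frac{1}{n}\sum_{i=1}^{n} P(T_i>t) = S(t).$$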

However, if censoring is present, the last remaining survivor may be censored rather than die. In this case $\hat S(t)$ will never drop to zero, and so $\hat S(t)/S(t)$ will cease to be a martingale at time $T$. This possibility -- that the last survivor is censored -- is the source of bias in the K-M estimator.
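
A small Monte Carlo check can illustrate both cases. The sketch below is not from the original answer. It assumes unit-rate exponential failure times, independent exponential censoring, a deliberately tiny sample ($n=3$) so that the last subject is often censored, and the convention that $\hat S$ stays flat after a censored last observation. The function names `km_estimate` and `mean_km` are hypothetical helpers written for this illustration.

```python
import math
import random


def km_estimate(times, events, t):
    """Kaplan-Meier estimate of S(t).

    times[i] is the observed time, events[i] is True for a death
    and False for a censoring.
    """
    # Sort by time; at tied times process deaths before censorings.
    order = sorted(range(len(times)), key=lambda i: (times[i], not events[i]))
    at_risk = len(times)
    surv = 1.0
    for i in order:
        if times[i] > t:
            break
        if events[i]:
            # Each death multiplies the estimate by (1 - 1/at_risk).
            surv *= 1.0 - 1.0 / at_risk
        at_risk -= 1
    return surv


def mean_km(n, t, censor_rate, reps, rng):
    """Average the K-M estimate at time t over simulated samples.

    Failure times are Exp(1); censoring times are Exp(censor_rate),
    or absent when censor_rate is 0 (an assumption of this sketch).
    """
    total = 0.0
    for _ in range(reps):
        deaths = [rng.expovariate(1.0) for _ in range(n)]
        if censor_rate > 0:
            cens = [rng.expovariate(censor_rate) for _ in range(n)]
        else:
            cens = [math.inf] * n
        times = [min(d, c) for d, c in zip(deaths, cens)]
        events = [d <= c for d, c in zip(deaths, cens)]
        total += km_estimate(times, events, t)
    return total / reps


rng = random.Random(0)  # fixed seed so the run is reproducible
n, t, reps = 3, 2.0, 20000
true_s = math.exp(-t)  # S(t) for unit-rate exponential failure times

mean_uncensored = mean_km(n, t, 0.0, reps, rng)
mean_censored = mean_km(n, t, 1.0, reps, rng)

print(f"true S(t):              {true_s:.4f}")
print(f"mean K-M, no censoring: {mean_uncensored:.4f}")
print(f"mean K-M, censoring:    {mean_censored:.4f}")
```

Under these assumptions the uncensored average should sit close to $S(t)$, while the censored average should come out noticeably above it, consistent with the bias arising only when the last survivor is censored.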
