[Math] Elementary proof of geometric / negative binomial distribution in birth-death processes

markov-chains, markov-process, probability, stochastic-processes

The birth-death process concerns a population of $n_0$ individuals, each of which reproduces and dies at a constant rate as time $t$ increases from $t=0$. Each individual splits into two individuals (birth) with rate $\lambda$, and each individual dies at rate $\mu$. One can conceptualize this process as a continuous-time Markov chain with states $0, 1, 2,\ldots$ representing the current size of the population. We are interested in the distribution $P_t(n) = P(N(t) = n)$ of the population size $N(t)$ at time $t$.

It can be shown that for the pure birth process ($\mu = 0$), at fixed time $t$ the probability $P_t(n)$ has the form of a negative binomial distribution:
$$
P_t(n) = {n-1 \choose n-n_0} (e^{-\lambda t})^{n_0} (1 - e^{-\lambda t})^{n-n_0}.
$$

A similar negative binomial formula can be derived for a nonzero death rate. This is typically proved by solving a PDE for the probability generating function of the process; see for example these slides, or Grimmett and Stirzaker, Section 6.11.

It would be interesting to prove the formula in a more elementary fashion, by identifying how exactly the birth-death process corresponds with the process that defines the binomial distribution. This is similar to how combinatorialists want a bijective proof of an enumeration theorem after they find a generating function proof. I have not been able to find the desired elementary proof, even for the special case of a pure birth process starting from 1 individual ($n_0 = 1$ and $\mu = 0$), in which case the negative binomial simplifies to the geometric distribution. Does anyone know where to find it, or how to derive it from first principles?

Here's what I've got so far: recall that the negative binomial distribution is the distribution of "the number of successes in a sequence of Bernoulli trials before a specified number of failures occurs." In the case of the formula above, we can compare with the formula on wikipedia to see that the number of successes is $n - n_0$, the number of failures is $n_0$, the success probability on each Bernoulli trial is $1 - e^{-\lambda t}$, and the failure probability is $e^{-\lambda t}$. The failure probability is recognizable as the probability that an exponentially distributed waiting time (with rate $\lambda$) exceeds $t$. Thus, the formula seems to be telling us that the birth process can be simulated by throwing exponential waiting times onto the real line until $n_0$ of them have exceeded the threshold $t$. But it's unclear what realizations of the process are generated by this simulation method, and how it is able to automagically place the proper probabilistic weight on these realizations.
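As an aside, the claimed geometric form for $n_0 = 1$ is easy to check numerically: simulate the pure birth (Yule) process directly from its exponential holding times and compare the empirical distribution of $N(t)$ with $e^{-\lambda t}(1-e^{-\lambda t})^{n-1}$. A minimal Python sketch (function names are mine, not standard):

```python
import math
import random

def yule_population(lam, t, n0=1):
    """Simulate a pure birth (Yule) process up to time t.

    While the population is n, the waiting time to the next birth is
    Exponential(n * lam); each birth increases the population by one.
    """
    n, clock = n0, 0.0
    while True:
        clock += random.expovariate(n * lam)
        if clock > t:
            return n
        n += 1

def geometric_pmf(n, lam, t):
    """Claimed P(N(t) = n) for n0 = 1: geometric with parameter e^{-lam t}."""
    p = math.exp(-lam * t)
    return p * (1 - p) ** (n - 1)

random.seed(0)
lam, t, trials = 1.0, 1.0, 200_000
counts = {}
for _ in range(trials):
    n = yule_population(lam, t)
    counts[n] = counts.get(n, 0) + 1

for n in range(1, 5):
    print(n, counts.get(n, 0) / trials, geometric_pmf(n, lam, t))
```

The empirical frequencies agree with the geometric pmf to within Monte Carlo error, but of course this gives no insight into *why*; hence the question.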

Best Answer

A little more detail. What is the probability that in the time interval $(0,t)$ exactly one birth occurs? First pick the individual (say P1) who will give birth, call their child P2 and everyone else E, and condition on exactly when the birth happens. Let $p_1(s)$ denote the pdf of the first birth time of P1.

$$P(\text{P1 has 1 birth and P2 and E have no births})$$ $$= \int_0^t P(\text{P1, P2, and E have no births after } s \mid \text{P1 gives birth at time } s)\, p_1(s)\,\mathrm{d}s$$ $$= e^{-\lambda t (n_0 - 1)}\int_{0}^t \left(e^{-\lambda (t-s)}\right)^2 \lambda e^{-\lambda s} \,\mathrm{d}s$$ $$= e^{-\lambda t (n_0 - 1)}\, e^{-2 \lambda t} \left(e^{\lambda t}-1\right)$$ $$= e^{- \lambda n_0 t} \left(1-e^{-\lambda t}\right)$$

Then, since there are $n_0$ choices for the parent, we get

$$P(\text{One birth in time interval (0,t)}) = n_0 e^{- \lambda t n_0} \left(1-e^{-\lambda t}\right)$$
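As a quick sanity check of this closed form (an aside; the helper names are mine), the conditioning integral above can be evaluated numerically with a midpoint rule and compared against $n_0 e^{-\lambda n_0 t}(1 - e^{-\lambda t})$:

```python
import math

def one_birth_integral(lam, t, n0, steps=20_000):
    """Midpoint-rule evaluation of the conditioning integral above."""
    ds = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        # (P1, P2 birth-free on (s, t)) * (pdf of P1's first birth at s)
        total += math.exp(-2 * lam * (t - s)) * lam * math.exp(-lam * s) * ds
    # factor for the other n0 - 1 individuals, times n0 choices of parent
    return n0 * math.exp(-lam * t * (n0 - 1)) * total

def one_birth_closed(lam, t, n0):
    """Closed form n0 * e^{-lam n0 t} * (1 - e^{-lam t})."""
    return n0 * math.exp(-lam * n0 * t) * (1 - math.exp(-lam * t))

print(one_birth_integral(0.7, 1.3, 4), one_birth_closed(0.7, 1.3, 4))
```

The two values agree to well within the quadrature error of the midpoint rule.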

Suppose, as the induction hypothesis, that starting with $n_0$ individuals the probability of $k$ births in the time interval $(0,t)$ is

$$P_t^{(n_0)}(k) = {n_0+k-1 \choose k} (e^{- \lambda t})^{n_0} (1 - e^{-\lambda t})^{k}$$

We wish to compute $P_t^{(n_0)}(k+1)$ by conditioning on when the first birth happens. It is an elementary probability exercise that the pdf of the minimum of $n_0$ i.i.d. exponential RVs with rate $\lambda$ is $$p(s) = \lambda n_0 e^{-\lambda n_0 s}.$$
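A small Monte Carlo check of that fact (an aside; variable names are mine): the survival function of the minimum of $n_0$ rate-$\lambda$ exponentials should be $e^{-\lambda n_0 s}$, i.e. the minimum is itself exponential with rate $n_0 \lambda$.

```python
import math
import random

random.seed(2)
lam, n0, s, trials = 1.0, 5, 0.2, 200_000

# fraction of trials where the minimum of n0 Exp(lam) draws exceeds s
hits = sum(min(random.expovariate(lam) for _ in range(n0)) > s
           for _ in range(trials)) / trials
print(hits, math.exp(-lam * n0 * s))
```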

So, since after a first birth at time $s$ there are $n_0 + 1$ individuals and $t - s$ time remaining, the integrand is $p(s)\,P_{t-s}^{(n_0+1)}(k)$ and we get

$$P_t^{(n_0)}(k+1) = \int_0^t\lambda n_0 \binom{k+n_0}{k} \left(1-e^{-\lambda (t-s)}\right)^k e^{-\lambda n_0 s-\lambda (n_0+1) (t-s)}\mathrm{d}s$$ $$= \frac{n_0 \binom{k+n_0}{k} \left(1-e^{-\lambda t}\right)^{k+1} e^{-\lambda n_0 t}}{k+1}$$ $$= \binom{(k+1)+n_0-1}{k+1} \left(1-e^{-\lambda t}\right)^{k+1} e^{-\lambda n_0 t}$$
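To double-check this induction step (an aside; function names are mine), the integral can be evaluated numerically and compared against the claimed closed form for $k+1$ births:

```python
from math import comb, exp

def induction_integral(lam, t, n0, k, steps=20_000):
    """Midpoint-rule evaluation of the integral above for P_t^{(n0)}(k+1)."""
    ds = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        # p(s) * P_{t-s}^{(n0+1)}(k), with p the pdf of the first birth time
        total += (lam * n0 * comb(k + n0, k)
                  * (1 - exp(-lam * (t - s))) ** k
                  * exp(-lam * n0 * s - lam * (n0 + 1) * (t - s)) * ds)
    return total

def negbin_pmf(lam, t, n0, k):
    """Claimed closed form P_t^{(n0)}(k)."""
    return comb(n0 + k - 1, k) * exp(-lam * n0 * t) * (1 - exp(-lam * t)) ** k

print(induction_integral(0.8, 1.0, 3, 2), negbin_pmf(0.8, 1.0, 3, 3))
```

The numerically evaluated integral for $k+1 = 3$ births matches the closed form, consistent with the algebra above.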

It is easy to see that converting back to your notation, with the total population size $n = n_0 + k$ in place of the number of births $k$, recovers the negative binomial formula above.
