I am trying to figure out the mean for negative binomial distribution but have run into mistakes. I know there are other posts on deriving the mean bu I am attempting to derive it in my own way. I wonder if any of you can point out where my mistake is:
In negative binomial distribution, the probability is:
$$
p(X=x) = \frac{(x-1)!}{(r-1)!(x-r)!}p^r(1-p)^{x-r},
$$
where $X$ is a random variable for the number of trials required, $x$ is the number of trials, p is the probability of success, and r is the number of success until $x$th trial. Therefore, to calculate expectation:
$$
E(x) = \sum_{x=r}^{\infty}xp(x)=x\sum_{x=r}^{\infty}\frac{(x-1)!}{(r-1)!(x-r)!}p^r(1-p)^{x-r}=\sum_{x=r}^{\infty}\frac{x!}{(r-1)!(x-r)!}p^r(1-p)^{x-r}
$$
Let $k=x-r$, then the formula becomes:
$$
E(x)=\sum_{k=0}^{\infty}\frac{(k+r)!}{(r-1)!k!}p^r(1-p)^k=
\sum_{k=0}^{\infty}\frac{(k+r)!}{(r-1)!k!}p^r(1-p)^k=
r\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^r(1-p)^k
$$
By binomial theorem, $\sum_{k=0}^{\infty}\frac{(k+r)!}{r!k!}p^r(1-p)^k$ becomes $[p+(1-p)]^{k+r} = 1$, and thus $E(x) = r$, which is obviously wrong.
I cannot figure out what is wrong with my proof, and thus any help will be appreciated. For reference, someone else has done a similar proof here, but I still have trouble understanding the mistake(s) in my proof:
Best Answer
Here is a purely algebraic approach. We begin by first showing that the PMF for a negative binomial distribution does in fact sum to $1$ over its support. Suppose $X \sim \operatorname{NegBinomial}(r,p)$, with PMF $$\Pr[X = x] = \binom{x-1}{r-1} p^r (1-p)^{x-r}, \quad x = r, r+1, r+2, \ldots.$$ This is the parametrization you chose. Consider the function $$f_m(z) = \sum_{k=0}^\infty \binom{k+m}{m} z^k.$$ We recall the identity $$\binom{k+m}{m} = \binom{k+m-1}{m-1} + \binom{k+m-1}{m},$$ from which it follows that $$\begin{align*} f_m(z) &= \sum_{k=0}^\infty \binom{k+m-1}{m-1}z^k + \binom{k-1+m}{m} z^k \\ &= f_{m-1}(z) + z \sum_{k=1}^\infty \binom{k-1+m}{m} z^{k-1} \\ &= f_{m-1}(z) + z f_m(z). \end{align*}$$ Consequently, $$f_m(z) = \frac{f_{m-1}(z)}{1-z}.$$ But because $$f_0(z) = \sum_{k=0}^\infty \binom{k}{0} z^k = \frac{1}{1-z},$$ it immediately follows that $$f_m(z) = (1-z)^{-(m+1)}.$$ Now letting $m = r-1$, $z = 1-p$, and $k = x-r$, we obtain $$\sum_{x=r}^\infty \Pr[X = x] = p^r (1 - (1-p))^{-(r-1+1)} = p^r p^{-r} = 1, \quad 0 < p < 1.$$ This proves that $\Pr[X = x]$ does define a valid PMF.
Next, we use this property to calculate $\operatorname{E}[X]$. By definition, $$\operatorname{E}[X] = \sum_{x=r}^\infty x \Pr[X = x].$$ But since $$x \binom{x-1}{r-1} = \frac{x!}{(r-1)!(x-r)!} = r \frac{x!}{r! (x-r)!} = r \binom{x}{r},$$ we find $$\operatorname{E}[X] = \sum_{x=r}^\infty r \binom{x}{r} p^r (1-p)^{x-r} = \frac{r}{p} \sum_{x=r+1}^\infty \binom{x-1}{(r+1)-1} p^{r+1} (1-p)^{x-(r+1)},$$ where we obtained this last expression by incrementing the lower index of summation by $1$, and decrementing the index in the summand by $1$. But you will notice that we have also rewritten the summand so that it is now apparent that it is the sum of the PMF of a negative binomial distribution with parameters $r+1$ and $p$. Thus this sum equals $1$, and we conclude $\operatorname{E}[X] = r/p$.
It is worth noting that for this purely algebraic approach, we have spent most of our effort to show that this parametrization is a valid PMF. The calculation of the expectation is quite straightforward by contrast. Also, if the variance is desired, it is best to consider $\operatorname{E}[X(X-1)],$ rather than $\operatorname{E}[X^2]$, since the former expression more readily yields to the same type of binomial coefficient manipulation that we used for $\operatorname{E}[X]$. I leave this computation as an exercise for the reader.
A final word: perhaps the most elegant computation is to exploit the fact that the negative binomial distribution is a generalization (i.e., a sum of IID) geometric random variables. But the purpose of this answer is to show how the computation can be done purely as an algebraic manipulation with very few prerequisites.