Confusion on the proof and meaning of negative binomial random variables

binomial distributiondiscrete mathematicsnegative binomialprobabilitystatistics

Going through the book Introduction to Probability, Statistics and Random Processes, I stumbled across an interpretation of negative binomial random variables and its proof.

Suppose I have a coin with $P(H) = p$. I toss the coin n times until I see m heads. We define X to be the total number of coin tosses in this experiment. Then X is a negative random variable with parameters m and p.

$$X \sim Negative\ Binomial(m,p)$$

Deriving the PMF of a Negative Binomial Random Variable.

Let A be the event we get m heads. Notice that $A = B\cap C$ where B is the event we get a $m-1$ heads in the first k-1 trials and C is the event we get a heads on the $k^{th}$ trial. Since each coin toss is independent,

$$P(A) = P(B\cap C) = P(B)P(C)$$
$$P(C) = p$$

Then the proof finds P(B) by using the binomial distribution formula.
$$P(B) = {k-1\choose m-1}p^{m-1}(1-p)^{(k-1)-(m-1)} = {k-1\choose m-1}p^{m-1}(1-p)^{k-m}$$

Then P(A) is
$$P(A) = P(B)P(C) = {k-1\choose m-1}p^{m}(1-p)^{k-m}$$

I'm not sure how the binomial coefficient ${k-1\choose m-1}$ makes sense. Instead of choosing $m-1$ heads from $k-1$ trials, can't we just choose $m$ heads from $k$ trials such that
$$P(A) = {k\choose m}p^m(1-p)^{k-m}$$

What makes this different from the regular binomial distribution? Looking online, people said that the negative binomial distribution counts the number of failures fixed and the number of trials is not fixed. In the formula for the Negative Binomial distribution I don't see how it's counting failures?

Best Answer

There are at least four formulations of the negative binomial distribution if you consider in your word question:

  • Are you counting tosses or tails until you get $m$ heads? I.e. is the minimum possible value $m$ or $0$?
  • Does heads count as a failure (you go on until $m$ failures) or is tails a failure (you go on until $n$ successes? Which has probability $p$?

If you only count tails, then in a sense you are counting failures rather than attempts. Either way, there is no limit to the number of tails you may see before the $m$th head. So if tails are a failure then there is no limit to the possible number of failures.

Suppose, as you seem to be, you are counting tosses and the probability of a head is $p$. Then for the $k$th toss to be the $m$th head you need $m-1$ of the first $k-1$ tosses to be heads and the $k$th toss to be a head. This has probability ${k-1 \choose m-1}p^{m-1}(1-p)^{(k-1)-(m-1)} \times p$ which is $${k-1 \choose m-1}p^{m}(1-p)^{k - m}$$ and if you sum this from $k=m$ to $\infty$ you should get $1$

Wikipedia has a slightly different formulation but essentially the same result