Probability – Difference of Two Binomial Random Variables

probability, random variables

Could anyone guide me to a document where the distribution of the difference between two binomial random variables is derived? That is, if $X \sim \mathrm{Bin}(n_1, p_1)$ and $Y \sim \mathrm{Bin}(n_2, p_2)$, what is the distribution of $|X-Y|$?

Thank you.

(Also, $X$ and $Y$ are independent.)

Best Answer

I can give you an answer for the pmf of $X - Y$. From there $|X - Y|$ is straightforward: for $k > 0$, $P(|X-Y| = k) = P(X-Y = k) + P(X-Y = -k)$, and $P(|X-Y| = 0) = P(X-Y = 0)$.

So we start with

$X \sim \mathrm{Bin}(n_1, p_1)$

$Y \sim \mathrm{Bin}(n_2, p_2)$

We are looking for the probability mass function of $Z = X - Y$.

First note that the support of $Z$ lies in $\{-n_2, \dots, n_1\}$: the extremes correspond to ($X=0$ and $Y=n_2$) and ($X=n_1$ and $Y=0$).

Then we need a modification of the binomial pmf so that it can cope with values outside of its support.

$m(k, n, p) = \binom {n} {k} p^k (1-p)^{n-k}$ when $0 \leq k \leq n$, and $m(k, n, p) = 0$ otherwise.
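For concreteness, here is a minimal R sketch of this modified pmf; the name `m_pmf` is mine, not necessarily what the gist linked below uses.

```r
# Binomial pmf extended to return 0 for k outside {0, ..., n}
m_pmf <- function(k, n, p) {
  ifelse(k >= 0 & k <= n, dbinom(k, n, p), 0)
}
```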

Then we need to consider two cases:

  1. $Z \geq 0$
  2. $Z \lt 0$

In the first case

$p(z) = \sum_{i=0}^{n_1} m(i+z, n_1, p_1) m(i, n_2, p_2)$

since this covers all the ways in which $X - Y$ could equal $z$. For example, $z=1$ is reached when $X=1$ and $Y=0$, when $X=2$ and $Y=1$, when $X=3$ and $Y=2$, and so on. It also deals with combinations that cannot happen because of the values of $n_1$ and $n_2$. For example, if $n_1 = 4$ then we cannot get $Z=1$ from $X=5$ and $Y=4$; in that case, thanks to our modified binomial pmf, the probability is zero.
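As a quick illustration, here is a sketch of the first case in R, building on the `m_pmf` function above (the name `p_nonneg` and the example parameters are mine):

```r
# P(Z = z) for z >= 0: sum over the possible values i of Y, pairing each with X = i + z
p_nonneg <- function(z, n1, p1, n2, p2) {
  i <- 0:n1
  sum(m_pmf(i + z, n1, p1) * m_pmf(i, n2, p2))
}

p_nonneg(1, n1 = 4, p1 = 0.5, n2 = 3, p2 = 0.5)  # P(X - Y = 1) for illustrative parameters
```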

For the second case we just reverse the roles, pairing each value of $X$ with $Y = X - z$ (note $-z = |z|$ since $z < 0$). For example, $z=-1$ is reached when $X=0$ and $Y=1$, when $X=1$ and $Y=2$, and so on.

$p(z) = \sum_{i=0}^{n_2} m(i, n_1, p_1)\, m(i-z, n_2, p_2)$

Put them together and that's your pmf.

$ f(z)= \begin{cases} \sum_{i=0}^{n_1} m(i+z, n_1, p_1)\, m(i, n_2, p_2),& \text{if } z\geq 0\\ \sum_{i=0}^{n_2} m(i, n_1, p_1)\, m(i-z, n_2, p_2), & \text{otherwise} \end{cases} $
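Putting that into code, here is a sketch of the full pmf in R, again with my own function name and building on `m_pmf` above (this is not claimed to be the code in the gist):

```r
# pmf of Z = X - Y for independent X ~ Bin(n1, p1) and Y ~ Bin(n2, p2)
diff_binom_pmf <- function(z, n1, p1, n2, p2) {
  if (z >= 0) {
    i <- 0:n1
    sum(m_pmf(i + z, n1, p1) * m_pmf(i, n2, p2))
  } else {
    i <- 0:n2
    sum(m_pmf(i, n1, p1) * m_pmf(i - z, n2, p2))
  }
}
```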

Here's the function in R and a simulation to check that it's right (and it does work).

https://gist.github.com/coppeliaMLA/9681819
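If you prefer a self-contained check, something along these lines (a sketch assuming the functions defined above, with arbitrary parameters of my choosing; not the gist's code) compares the exact pmf with a Monte Carlo estimate:

```r
set.seed(1)
n1 <- 5; p1 <- 0.3; n2 <- 7; p2 <- 0.6   # arbitrary example parameters
z_vals <- (-n2):n1                        # support of Z = X - Y

# Monte Carlo estimate of the pmf of Z
z_sim   <- rbinom(1e5, n1, p1) - rbinom(1e5, n2, p2)
sim_pmf <- as.numeric(table(factor(z_sim, levels = z_vals))) / length(z_sim)

# Exact pmf from the formula above
exact_pmf <- sapply(z_vals, diff_binom_pmf, n1 = n1, p1 = p1, n2 = n2, p2 = p2)

round(cbind(z = z_vals, simulated = sim_pmf, exact = exact_pmf), 4)
```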
