[Math] Probability that a given page of a 1000-page book with 1000 misprints has at least 3 misprints

binomial-coefficients, probability, probability-distributions

Setting

A book of 1000 pages contains 1000 misprints. Estimate the chances that a given page contains at least three misprints.

Solution

My solution is

$$\binom{1000}{1}\left(\frac{1}{1000}\right)^3\left(\frac{999}{1000}\right)^{1000 - 3}$$

Is this correct?

Best Answer

Assuming that the misprints are independently distributed across the book, and equally likely to be on any page, the number of misprints $X$ on a given page is binomially distributed with parameters $n = 1000$ ($=$ number of misprints) and $p = 1/1000$ ($=$ probability of a misprint ending up on the given page).

Thus, the probability of there being at least $k$ misprints on the page can be calculated using the cumulative distribution function $F_b$ of the binomial distribution:

$$\begin{aligned} {\rm Pr}(X \ge k)\ &= 1 - {\rm Pr}(X \le k-1) \\ &= 1 - F_b(k-1; n, p) \\ &= 1 - \sum_{i=0}^{k-1} {n \choose i}\, p^i (1-p)^{n-i}, \\ \end{aligned}$$

and thus:

$$\begin{aligned} {\rm Pr}(X \ge 3)\ &= 1 - F_b(2; 1000, \tfrac1{1000}) \\ &= 1 - \sum_{i=0}^{2} {\textstyle {1000 \choose i}}\, \left(\tfrac{1}{1000}\right)^i \left(\tfrac{999}{1000}\right)^{1000-i} \\ &= 1 - \left(\tfrac{999}{1000}\right)^{1000} - \left(\tfrac{999}{1000}\right)^{999} - \tfrac12 \left(\tfrac{999}{1000}\right)^{999}. \end{aligned}$$
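
As a numerical check, the exact binomial sum can be evaluated directly. The following is a minimal sketch using only Python's standard library; the function name `binomial_tail` is an illustrative choice, and the model assumes misprints land on pages independently and uniformly, as stated above.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return Pr(X >= k) for X ~ Binomial(n, p), as 1 minus the CDF at k - 1."""
    cdf = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - cdf


# 1000 misprints, each landing on the given page with probability 1/1000.
exact = binomial_tail(3, 1000, 1 / 1000)
print(f"Pr(X >= 3) = {exact:.4f}")
```

Under these assumptions the exact value comes out to roughly 0.080.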

We can approximate this probability by noting that, for $x \approx 0$, $(1-x)^n \approx e^{-nx}$, and thus:

$$\begin{aligned} {\rm Pr}(X \ge 3)\ &\approx 1 - e^{-1} - e^{-\frac{999}{1000}} - \tfrac12 e^{-\frac{999}{1000}} \\ &\approx 1 - \tfrac52 e^{-1} \approx 0.08. \end{aligned}$$
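
To get a feel for how accurate $(1-x)^n \approx e^{-nx}$ is at $n = 1000$, $x = 1/1000$, both sides can be compared numerically. This is a small sketch with illustrative variable names, not part of the original derivation.

```python
from math import exp

n, x = 1000, 1 / 1000

exact_power = (1 - x) ** n   # (999/1000)^1000
approx_power = exp(-n * x)   # e^{-1}

# Relative error of the exponential approximation for this n and x.
rel_error = abs(exact_power - approx_power) / exact_power

# The closed-form approximation from the text: 1 - (5/2) e^{-1}.
approx_tail = 1 - 2.5 * exp(-1)

print(f"(1-x)^n = {exact_power:.6f}, e^(-nx) = {approx_power:.6f}")
print(f"relative error = {rel_error:.2e}")
print(f"1 - (5/2) e^-1 = {approx_tail:.4f}")
```

The relative error of the exponential approximation is well below one percent here, which is why the final figure of about 0.08 is unaffected.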

Another way to obtain the same approximation is to note that, for large $n$ and small $p$, the binomial distribution is well approximated by the Poisson distribution with rate parameter $\lambda = np$. The Poisson distribution has the CDF:

$$F_P(k; \lambda) = e^{-\lambda}\sum_{i=0}^k \frac{\lambda^i}{i!},$$

and thus we get:

$$\begin{aligned} {\rm Pr}(X \ge 3)\ &= 1 - F_b(2; 1000, \tfrac1{1000}) \\ &\approx 1 - F_P(2; 1) \\ &= 1 - e^{-1}\sum_{i=0}^2 \frac{1}{i!} \\ &= 1 - \tfrac52e^{-1}. \\ \end{aligned}$$
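
The Poisson version can be checked the same way. The sketch below evaluates the Poisson CDF formula given above; `poisson_tail` is a hypothetical helper name chosen for illustration.

```python
from math import exp, factorial


def poisson_tail(k, lam):
    """Return Pr(X >= k) for X ~ Poisson(lam), as 1 minus the CDF at k - 1."""
    cdf = exp(-lam) * sum(lam**i / factorial(i) for i in range(k))
    return 1.0 - cdf


# Rate parameter lambda = n * p = 1000 * (1/1000) = 1.
approx = poisson_tail(3, 1.0)
print(f"Pr(X >= 3) ~ {approx:.4f}")
```

With $\lambda = 1$ this reproduces $1 - \tfrac52 e^{-1}$, in line with the rough figure of 0.08 quoted earlier.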
