Solved – Continuous approximation to binomial distribution

approximation, binomial distribution, normal distribution

Consider an integer variable $k$ that follows a binomial distribution,

$$\binom{N}{k}p^{k}\left(1-p\right)^{N-k}$$

with total draws $N$ and probability of success $p$. I am interested in the fraction of successes, $f = k/N$. The distribution of $f$ can be approximated as a normal distribution with mean $p$ and variance $p(1-p)/N$. This approximation is good if both $Np$ and $N(1-p)$ are sufficiently large, but it has the problem that there is a finite (though small) probability that $f<0$ or $f>1$, because the normal distribution has infinite support.
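To get a feel for how much probability the normal approximation places outside $[0,1]$, here is a minimal Python sketch using only the standard library. The function name `mass_outside_unit_interval` and the example values of $N$ and $p$ are illustrative choices, not part of the question.

```python
import math

def mass_outside_unit_interval(n, p):
    """Mass that Normal(mean=p, var=p(1-p)/n) assigns to f < 0 or f > 1."""
    sd = math.sqrt(p * (1.0 - p) / n)
    # Use erfc rather than 1 - cdf so that very small tails are not rounded to zero.
    below = 0.5 * math.erfc((p - 0.0) / (sd * math.sqrt(2.0)))  # P(f < 0)
    above = 0.5 * math.erfc((1.0 - p) / (sd * math.sqrt(2.0)))  # P(f > 1)
    return below + above

# Illustrative values: the leakage is noticeable when Np is small
# and negligible when Np and N(1-p) are both large.
leak_small = mass_outside_unit_interval(20, 0.1)
leak_large = mass_outside_unit_interval(2000, 0.1)
print(leak_small, leak_large)
```

With $N = 20$ and $p = 0.1$ (so $Np = 2$) several percent of the mass falls below zero, while for $N = 2000$ the leakage is vanishingly small but still nonzero.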

Is there an approximation to the distribution of $f$ that has support $f\in[0,1]$? We can assume that $Np$ and $N(1-p)$ are both sufficiently large.

Best Answer

An obvious candidate would be the beta distribution, since it is the conjugate prior for the binomial distribution and it has the appropriate support. To allow for a continuity correction and avoid poor approximation at the edges, it is desirable to approximate each discrete value $x = 0, 1, \ldots, N$ by an equal-sized continuous interval. This means that we approximate the binomial mass function by an integral of the beta density over one of $N+1$ equal-sized intervals on its support:

$$\text{Bin}(x|N,p) \approx \int \limits_{B(x)}^{B(x+1)} \text{Beta}(\theta|\alpha, \beta)d\theta \quad \quad \quad B(x) \equiv \frac{x}{N+1}.$$

The appropriate parameters $\alpha$ and $\beta$ can be found using the method of moments (MOM), which requires us to solve the following two moment equations:

$$\frac{\alpha}{\alpha+\beta} = p \quad \quad \quad \frac{\alpha \beta}{(\alpha+\beta)^2(\alpha+\beta+1)} = \frac{p(1-p)}{N+1}.$$

Solving for the required parameters yields the values $\alpha = pN$ and $\beta = (1-p)N$ so our approximation to the binomial is:
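For completeness, here is a short sketch of that algebra. It writes $S = \alpha+\beta$ (a symbol introduced here for convenience) and uses the standard beta variance $\alpha\beta / \left[(\alpha+\beta)^2(\alpha+\beta+1)\right]$. The mean equation gives $\alpha = pS$ and $\beta = (1-p)S$, and substituting these into the variance equation gives:

$$\frac{\alpha\beta}{S^2(S+1)} = \frac{p(1-p)S^2}{S^2(S+1)} = \frac{p(1-p)}{S+1} = \frac{p(1-p)}{N+1} \quad \Longrightarrow \quad S = N.$$

Hence $\alpha = pN$ and $\beta = (1-p)N$.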

$$\text{Bin}(x|N,p) \approx \frac{\Gamma(N)}{\Gamma(pN)\Gamma((1-p)N)} \int \limits_{B(x)}^{B(x+1)} \theta^{pN-1} (1-\theta)^{(1-p)N-1} d\theta.$$
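As a rough numerical check of this approximation, the following Python sketch compares the exact binomial mass with the beta mass on each interval $[B(x), B(x+1)]$. It uses only the standard library and integrates the beta density with a simple midpoint rule, which is a convenience chosen here rather than anything prescribed by the answer. The function names, the step count, and the example values $N = 50$, $p = 0.3$ are illustrative assumptions, and the code assumes $pN > 1$ and $(1-p)N > 1$ so that the density is bounded.

```python
import math

def binom_pmf(x, n, p):
    """Exact binomial probability of x successes in n draws."""
    return math.comb(n, x) * p**x * (1.0 - p) ** (n - x)

def beta_pdf(theta, a, b):
    """Beta(a, b) density at 0 < theta < 1, computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(
        log_norm + (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)
    )

def beta_interval_mass(x, n, p, steps=200):
    """Midpoint-rule integral of Beta(pN, (1-p)N) over [x/(N+1), (x+1)/(N+1)]."""
    a, b = p * n, (1.0 - p) * n
    lo, hi = x / (n + 1), (x + 1) / (n + 1)
    h = (hi - lo) / steps
    # Midpoints never touch 0 or 1, so the logarithms above are always defined.
    return h * sum(beta_pdf(lo + (i + 0.5) * h, a, b) for i in range(steps))

n, p = 50, 0.3  # illustrative values with Np = 15 and N(1-p) = 35
exact = [binom_pmf(x, n, p) for x in range(n + 1)]
approx = [beta_interval_mass(x, n, p) for x in range(n + 1)]

total = sum(approx)  # should be close to 1: the intervals tile [0, 1]
max_abs_err = max(abs(u - v) for u, v in zip(approx, exact))
print(f"total beta mass: {total:.6f}, max abs error: {max_abs_err:.4f}")
```

Because the $N+1$ intervals tile $[0,1]$ exactly, the approximate probabilities sum to one (up to integration error) and no mass is assigned outside the unit interval.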