[Math] Kullback-Leibler divergence of binomial distributions

binomial distribution, statistics

Suppose $P \sim \mathrm{Bin}(n,p)$ and $Q \sim \mathrm{Bin}(n,q)$.
Their Kullback-Leibler divergence is defined by
$$D_{KL}(P||Q)=\mathbb{E}_{P}\left[\log\left(\frac{p(x)}{q(x)}\right)\right],$$
where $p(x)$ and $q(x)$ are the probability mass functions of $P$ and $Q$, respectively, and the expectation is taken with respect to $P$. For the case above, I can write out the expectation and find that it equals:
$$\log\left(\left(\frac{p}{q}\right)^{np}\right)+\log\left(\left(\frac{1-p}{1-q}\right)^{n-np}\right).$$
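For reference, here is a quick numerical sanity check of this closed form (a sketch using scipy, with arbitrarily chosen example values $n=20$, $p=0.3$, $q=0.5$):

```python
import numpy as np
from scipy.stats import binom

# Arbitrary example values
n, p, q = 20, 0.3, 0.5

# Direct summation of D_KL(P || Q) over the common support {0, ..., n}
k = np.arange(n + 1)
pk = binom.pmf(k, n, p)
qk = binom.pmf(k, n, q)
kl_direct = np.sum(pk * np.log(pk / qk))

# Closed form: n*p*log(p/q) + n*(1-p)*log((1-p)/(1-q))
kl_closed = n * p * np.log(p / q) + n * (1 - p) * np.log((1 - p) / (1 - q))

print(kl_direct, kl_closed)  # the two values agree up to floating-point error
```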

But now I have to compute it for $P \sim \mathrm{Bin}(n,p)$ and $Q \sim \mathrm{Bin}(n+1,q)$, and for $P \sim \mathrm{Bin}(n,p)$ and $Q \sim \mathrm{Bin}(n-1,q)$.

But then I run into problems with the formulas and the expectations. Can someone help me get the correct answer? Thank you.

Best Answer

Your second question can be answered precisely: $$\mathbf{D}(\mathrm{Bin}(n,p)\,\|\,\mathrm{Bin}(n-1,q)) = \infty$$ as the second (reference) distribution places 0 mass on $n$, whereas the first distribution places nonzero mass on $n$.
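To see the blow-up concretely, here is a minimal sketch (reusing the example values from above) showing that the reference pmf vanishes at the outcome $n$ while $P$ gives it positive mass:

```python
from scipy.stats import binom

n, p, q = 20, 0.3, 0.5  # arbitrary example values

print(binom.pmf(n, n - 1, q))  # 0.0: Bin(n-1, q) cannot produce the outcome n
print(binom.pmf(n, n, p))      # p**n > 0: Bin(n, p) gives n positive mass

# The KL sum therefore contains the term P(n) * log(P(n) / 0) = +infinity.
```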

As for your first question (after relabeling $n+1$ as $n$, so that we compare $\mathrm{Bin}(n-1,p)$ against $\mathrm{Bin}(n,q)$): the exact expression is complicated, but it can be approximated extremely well by $$\mathbf{D}(\mathrm{Bin}(n-1,p)\,\|\,\mathrm{Bin}(n,q))\approx n\mathbf{D}\left(\left(1-\frac1n\right)p\,\middle\|\,q\right),$$ where $\mathbf{D}(a\,\|\,b)$ denotes the binary (Bernoulli) KL divergence.

To see this, consider the following probabilistic experiment. The random bits $X_1,X_2,\ldots,X_n$ are chosen as follows. First pick a uniform random coordinate $J\in[n]$ and set $X_J = 0$. Pick the rest of the coordinates independently to be 1 with probability $p$ and 0 with probability $1-p$.

The random bits $Y_1,Y_2,\ldots,Y_n$ are chosen as follows: pick each to be 1 with probability $q$ and 0 with probability $1-q$, independently. Observe that $$\mathbf{D}(\mathrm{Bin}(n-1,p)\,\|\,\mathrm{Bin}(n,q)) = \mathbf{D}(X\,\|\,Y)$$ (conditioned on the number of ones, both $X$ and $Y$ are uniform over the bit strings with that many ones, so no divergence is lost by passing to the sums).

The right-hand side is approximated, up to an additive $\log n$ term, by $n\mathbf{D}((1-1/n)p\,\|\,q)$, since each coordinate $X_i$ is marginally Bernoulli with parameter $(1-1/n)p$.
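Here is a sketch of a numerical comparison of the exact divergence with this approximation (again with arbitrary example values; `kl_bernoulli` is a helper defined here, not a library function):

```python
import numpy as np
from scipy.stats import binom

def kl_bernoulli(a, b):
    """Binary KL divergence D(a || b) between Bernoulli(a) and Bernoulli(b)."""
    return a * np.log(a / b) + (1 - a) * np.log((1 - a) / (1 - b))

n, p, q = 50, 0.3, 0.5  # arbitrary example values

# Exact D(Bin(n-1, p) || Bin(n, q)), summed over the support {0, ..., n-1}
k = np.arange(n)
pk = binom.pmf(k, n - 1, p)
qk = binom.pmf(k, n, q)
exact = np.sum(pk * np.log(pk / qk))

# The approximation n * D((1 - 1/n) p || q)
approx = n * kl_bernoulli((1 - 1 / n) * p, q)

print(exact, approx)  # per the argument above, these differ by at most about log n
```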

By the way, the equality written in the question can also be obtained this way with no calculation: $$\mathbf{D}(\mathrm{Bin}(n,p)\,\|\,\mathrm{Bin}(n,q)) = n\mathbf{D}(p\,\|\,q).$$ Bit string $X$ is sampled by setting each of the $n$ bits to 1 with probability $p$ independently, and bit string $Y$ is sampled the same way, except with probability $q$. Now $\mathbf{D}(\mathrm{Bin}(n,p)\,\|\,\mathrm{Bin}(n,q)) =\mathbf{D}(X\,\|\,Y)$, and since both strings consist of independent coordinates, the divergence is additive: $\mathbf{D}(X\,\|\,Y)=\sum_{i=1}^{n}\mathbf{D}(p\,\|\,q)=n\mathbf{D}(p\,\|\,q)$.