[Math] Kullback-Leibler divergence when the $Q$ distribution has zero values

probability-distributions, stochastic-analysis, stochastic-approximation, stochastic-processes

For discrete probability distributions $P,Q$, the Kullback-Leibler divergence of $Q$ from $P$ is defined to be $$D_{\mathrm{KL}} ( P \mathop{\|} Q ) = \sum_i P(i) \ln \left( \frac{P(i)}{Q(i)} \right).$$
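A minimal sketch of this definition in Python (the function name and example distributions are illustrative, not from any particular library):

```python
import math

def kl_divergence(P, Q):
    """KL divergence of Q from P for discrete distributions given as
    lists of probabilities over the same index set."""
    total = 0.0
    for p, q in zip(P, Q):
        if p == 0:
            continue  # the p*ln(p/q) term is interpreted as 0 when p = 0
        total += p * math.log(p / q)
    return total

P = [0.5, 0.5]
Q = [0.9, 0.1]
print(kl_divergence(P, Q))
```

Note that this sketch would raise a `ZeroDivisionError` (or return infinity, depending on how you handle it) whenever some $Q(i)=0$ while $P(i)>0$, which is exactly the situation the question below asks about.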

Wikipedia's article on Kullback–Leibler divergence states

The Kullback–Leibler divergence is defined only if $Q(i)=0$ implies $P(i)=0$, for all $i$ (absolute continuity). Whenever $P(i)$ is zero the contribution of the $i$-th term is interpreted as zero because $\lim_{x \to 0} x \ln(x) = 0$.

What if $P(i)$ is a very small number and $Q(i) = 0$, simply because I don't have enough samples? If I understand it correctly, the contribution of the $i$-th term should then be $0$?

Best Answer

The sentence "The Kullback–Leibler divergence is defined only if $Q(i)=0$ implies $P(i)=0$ for all $i$" implies that $D_{KL}(P\|Q)$ is not defined if there is some $i$ such that $Q(i)=0$ but $P(i)\not=0.$

One could try to finagle a definition for $D_{KL}(P\|Q)$ in these cases using limits, as is done when $P(i)=0$ for some $i$. The relevant limit (using $p$ and $q$ in place of $P(i)$ and $Q(i)$) is $$\lim_{q\to 0^+}p\ln \frac{p}{q}.$$ But as $q\to0^+$ with $p>0$ fixed, $\frac{p}{q}\to\infty$, so the limit is infinite.
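A quick numerical check of this limit (with $p$ fixed at an arbitrary value, here $0.5$) shows the term growing without bound as $q$ shrinks:

```python
import math

# As q -> 0+ with p > 0 fixed, the term p*ln(p/q) grows without bound.
p = 0.5
for q in [1e-1, 1e-3, 1e-6, 1e-9]:
    print(f"q = {q:.0e}  ->  p*ln(p/q) = {p * math.log(p / q):.4f}")
```

This is why, contrary to the $P(i)=0$ case, no finite value can be assigned to a term with $Q(i)=0$ and $P(i)>0$.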

Some more intuitive explanations for why it's not defined (or why it's infinite) and what to do in those cases are given here.
