Solved – Is it possible to apply KL divergence between discrete and continuous distribution

Tags: distributions, kullback-leibler, mathematical-statistics

I am not a mathematician. I have searched the internet about KL divergence. What I learned is that the KL divergence measures the information lost when we approximate the input distribution with a model's distribution. I have only seen it applied between two continuous distributions or between two discrete distributions. Can we compute it between a continuous and a discrete distribution, or vice versa?

Best Answer

No: KL divergence is only defined on distributions over a common space. It asks about the probability (or density) of the same point $x$ under two different distributions, $p(x)$ and $q(x)$. If $p$ is a distribution on $\mathbb{R}^3$ and $q$ a distribution on $\mathbb{Z}$, then $q(x)$ doesn't make sense for points $x \in \mathbb{R}^3$ and $p(z)$ doesn't make sense for points $z \in \mathbb{Z}$. In fact, we can't even do it for two continuous distributions over different-dimensional spaces (or two discrete ones, or any case where the underlying probability spaces don't match).
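For concreteness, here is a minimal sketch (in Python with NumPy/SciPy; the specific distributions are made up for illustration) of the standard computation when both distributions do live on the same space, which is exactly the requirement that fails in your setting: $p$ and $q$ must be evaluable at the same points $x$.

```python
import numpy as np
from scipy.stats import entropy

# Two distributions over the SAME discrete space {0, 1, 2, 3}
p = np.array([0.1, 0.4, 0.4, 0.1])
q = np.array([0.25, 0.25, 0.25, 0.25])

# KL(p || q) = sum_x p(x) * log(p(x) / q(x))
kl_pq = np.sum(p * np.log(p / q))

# scipy.stats.entropy(p, q) computes the same relative entropy (in nats)
assert np.isclose(kl_pq, entropy(p, q))
print(kl_pq)
```

Every term in the sum needs both $p(x)$ and $q(x)$ at the same $x$, which is why a common space is non-negotiable.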

If you have a particular case in mind, it may be possible to come up with some similar-spirited measure of dissimilarity between distributions. For example, it might make sense to encode a continuous distribution under a code for a discrete one (at the cost of some information), e.g. by rounding to the nearest point in the discrete support; a sketch of this idea follows below.
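As a rough illustration of that idea (not part of the original answer; the standard-normal example, the integer support, and the unit-width binning are all assumptions chosen for the sketch), one could bin the continuous distribution's mass onto the discrete support and then compute an ordinary KL divergence between the two resulting discrete distributions:

```python
import numpy as np
from scipy.stats import norm, entropy

# Hypothetical example: compare a continuous N(0, 1) with a discrete
# distribution on the integers -3, ..., 3 by "rounding" the continuous
# mass to the nearest integer, i.e. binning it into unit-width intervals.
support = np.arange(-3, 4)

# Mass the normal assigns to each interval [k - 0.5, k + 0.5)
p_binned = norm.cdf(support + 0.5) - norm.cdf(support - 0.5)
p_binned /= p_binned.sum()          # renormalize the truncated tails

# Some discrete distribution on the same integer support
q = np.array([0.02, 0.08, 0.2, 0.4, 0.2, 0.08, 0.02])

# Ordinary discrete KL divergence between the two binned distributions
print(entropy(p_binned, q))
```

The result depends entirely on the binning you choose, so it is a pragmatic dissimilarity measure rather than "the" KL divergence between the original pair.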
