Solved – Relation Between Wasserstein Distance and KL-Divergence (Relative Entropy)

distance, distributions, entropy, kullback-leibler, wasserstein

Consider the Wasserstein metric of order one, $W_1$ (a.k.a. the Earth Mover's Distance). I would like to know whether it is possible to link $W_1$ and the Kullback–Leibler divergence (a.k.a. relative entropy), and what such a link would mean intuitively. I can't find the reference anymore, but if I am not mistaken, the following holds for some constant $C$:
$$
W_1(\mu, \nu)\le \sqrt{C\cdot \text{KL}(\nu ||\mu)},
$$

where $\text{KL}$ is the KL divergence. My first question would be: is the above inequality true? Secondly, how should one interpret this estimate?

Best Answer

Pinsker's inequality bounds the total variation distance by the KL divergence, $$\frac{1}{2}d_{TV}(\nu,\mu)\le \sqrt{\text{KL}(\nu||\mu)},$$ and the Wasserstein distance is in turn bounded by the total variation distance, $$2W_1(\nu,\mu)\leq C\,d_{TV}(\nu,\mu),$$ whenever the underlying metric is bounded by $C$. Chaining the two gives $W_1(\nu,\mu)\le C\sqrt{\text{KL}(\nu||\mu)}$ on a space of diameter at most $C$, which is the inequality in the question (with the constant renamed), so the answer to the first question is yes for bounded metrics. Intuitively, on a bounded space, closeness in KL forces closeness in $W_1$, but not the other way around, as the example below shows.
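As a quick numerical sanity check of this chain of bounds, here is a minimal sketch using SciPy's `wasserstein_distance` and `rel_entr`; the two discrete distributions are arbitrary choices for illustration, not from the original post:

```python
# Check the chain: W_1 <= (C/2) d_TV and (1/2) d_TV <= sqrt(KL),
# hence W_1 <= C sqrt(KL), for two discrete distributions.
import numpy as np
from scipy.stats import wasserstein_distance
from scipy.special import rel_entr

points = np.array([0.0, 1.0, 2.0, 3.0])  # support; diameter C = 3
mu = np.array([0.4, 0.3, 0.2, 0.1])      # reference distribution
nu = np.array([0.1, 0.2, 0.3, 0.4])      # comparison distribution

W1 = wasserstein_distance(points, points, u_weights=nu, v_weights=mu)
d_tv = np.abs(nu - mu).sum()             # total variation as the L1 distance
kl = rel_entr(nu, mu).sum()              # KL(nu || mu)
C = points.max() - points.min()

print(f"W1           = {W1:.4f}")
print(f"(C/2) * d_TV = {C / 2 * d_tv:.4f}")    # upper-bounds W1
print(f"C * sqrt(KL) = {C * np.sqrt(kl):.4f}") # upper-bounds W1 via Pinsker
```

Here $d_{TV}$ is computed as the $L^1$ distance between the probability vectors, matching the normalization in the displayed inequalities.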

There isn't a simple bound in the other direction, since you can make the KL divergence infinite by moving the probability mass off an arbitrarily small set onto a neighbouring one, and this can be done at arbitrarily small $W_1$ cost. For example, take two standard normals. For one of them, set the density to zero on $[0,\epsilon]$ and to twice the existing value on $[-\epsilon,0]$; do the opposite for the other one. The Wasserstein distance is of order $\epsilon^2$ (a mass of order $\epsilon$ is moved a distance of order $\epsilon$), but the KL divergence is infinite in both directions, since each distribution puts positive mass where the other has none.
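A small numerical illustration of this counterexample, discretized on a grid (a sketch: the grid, the $\epsilon$ values, and the `modified_normal` helper are my own choices, not from the original answer). The printed $W_1$ shrinks roughly like $\epsilon^2$, while the KL divergence is already infinite, because one histogram has mass on cells where the other assigns zero:

```python
# Discretized counterexample: W_1 shrinks with eps while KL stays infinite.
import numpy as np
from scipy.stats import norm, wasserstein_distance
from scipy.special import rel_entr

def modified_normal(x, eps, zero_side):
    """Standard normal histogram on grid x, with the density set to 0 on
    one side of the origin (within eps) and doubled on the other side."""
    p = norm.pdf(x)
    if zero_side == "right":
        p = np.where((x > 0) & (x < eps), 0.0, p)     # remove mass on (0, eps)
        p = np.where((x > -eps) & (x < 0), 2 * p, p)  # double it on (-eps, 0)
    else:
        p = np.where((x > -eps) & (x < 0), 0.0, p)
        p = np.where((x > 0) & (x < eps), 2 * p, p)
    return p / p.sum()  # renormalize the discrete histogram

x = np.linspace(-6, 6, 20001)
for eps in (0.5, 0.1, 0.02):
    mu = modified_normal(x, eps, zero_side="right")
    nu = modified_normal(x, eps, zero_side="left")
    W1 = wasserstein_distance(x, x, u_weights=nu, v_weights=mu)
    kl = rel_entr(nu, mu).sum()  # inf: nu puts mass where mu has none
    print(f"eps = {eps:4.2f}:  W1 ~ {W1:.2e},  KL(nu||mu) = {kl}")
```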
