Solved – Kullback–Leibler Divergence of two 2-dimensional probability distributions

distributions, kullback-leibler

I need to calculate the KL divergence (and other measures) for the following question, but I have a hard time understanding the meaning of this syntax. Isn't $P$ a function of $(x, y)$? I wasn't able to find any source for an answer. Help would be much appreciated.

[image: the problem statement]

The measurements are the KL divergence, the JS divergence, and the Wasserstein distance.

Best Answer

The notation is wrong, and the wording is confusing, but we can try to infer what the problem is asking from context. The point seems to be to compare three ways of measuring distance between probability distributions: the KL divergence, the JS divergence, and the Wasserstein metric (the three "measurements" in part A).

A known property of the KL divergence is that $D_{KL}(P \parallel Q)$ is infinite if $P$ is nonzero anywhere that $Q$ is zero (i.e., the support of $P$ is not contained within the support of $Q$). This is not true of the Wasserstein metric. So I think the problem is asking you to compare the distances for choices of $P$ and $Q$ whose supports do and do not overlap.
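To see this numerically, here is a minimal sketch using SciPy; the ten-point uniform grid and the integer shift $\theta$ are assumptions of mine for illustration, not anything specified by the problem:

```python
import numpy as np
from scipy.special import rel_entr
from scipy.stats import wasserstein_distance

# Illustrative setup: P is uniform on the integer points 0..9,
# and Q is the same distribution shifted right by theta.
# (Integer support points sidestep floating-point equality issues.)
points = np.arange(10)
weights = np.full(10, 0.1)

for theta in (0, 5):
    q_points = points + theta
    # Put both distributions on a common grid so KL can be evaluated pointwise.
    common = np.union1d(points, q_points)
    p = np.where(np.isin(common, points), 0.1, 0.0)
    q = np.where(np.isin(common, q_points), 0.1, 0.0)
    # rel_entr(p, q) = p * log(p / q) elementwise; it is inf wherever p > 0 but q = 0.
    kl = rel_entr(p, q).sum()
    w = wasserstein_distance(points, q_points, weights, weights)
    print(f"theta={theta}: KL = {kl}, Wasserstein = {w}")
```

For $\theta = 0$ both quantities are $0$; for $\theta = 5$ the KL divergence is `inf` (even though the supports partially overlap, $P$ still has mass where $Q$ has none), while the Wasserstein distance is simply $5$, the size of the shift.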

The way the problem describes the distributions is wrong, but the intended meaning seems to be that $P$ is the uniform distribution on the segment where $x = 0$ and $0 \le y \le 1$, and $Q$ is the uniform distribution on the segment where $x = \theta$ (some constant) and $0 \le y \le 1$. Notice that the supports of $P$ and $Q$ are disjoint unless $\theta = 0$.
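Under that reading, $Q$ is just $P$ translated by $\theta$ along the $x$-axis, and the standard values of the three measurements (using natural logarithms) come out very differently:

$$
D_{KL}(P \parallel Q) = \begin{cases} 0, & \theta = 0 \\ \infty, & \theta \neq 0, \end{cases}
\qquad
D_{JS}(P \parallel Q) = \begin{cases} 0, & \theta = 0 \\ \log 2, & \theta \neq 0, \end{cases}
\qquad
W(P, Q) = |\theta|.
$$

The KL divergence jumps straight to $\infty$ and the JS divergence saturates at $\log 2$ the moment the supports become disjoint, while the Wasserstein distance varies smoothly with $\theta$; that contrast is presumably what part A wants you to notice.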