Measure Theory – Wasserstein Distances Metrize Weak Convergence

measure-theory, metric-spaces, optimal-transport, reference-request, weak-convergence

Let $(X,d)$ be a metric space, and let $P(X)$ denote the space of Borel probability measures on $X$. The (first) Wasserstein distance is the distance on $P(X)$ given by
$$
d_W(p,q):= \inf_{m\in\Gamma(p,q)} \int_{X\times X} d(x,y)\, dm(x,y)\;,
$$
where $\Gamma(p,q)$ is the set of couplings of $p$ and $q$, i.e. the probability measures on $X\times X$ whose marginals on the first and second components are $p$ and $q$ respectively.
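
For a concrete feel for the definition, here is a minimal Python sketch for finitely supported measures in the special case where $d$ is the discrete metric. In that case any coupling $m$ satisfies $m(\{(x,x)\}) \le \min(p(x), q(x))$, so the infimum is attained by a coupling that keeps as much mass as possible on the diagonal, and $d_W(p,q) = 1 - \sum_x \min(p(x), q(x))$, the total variation distance. The function names and the example measures are illustrative assumptions, not taken from any library.

```python
from fractions import Fraction


def discrete_metric(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1


def maximal_coupling(p, q):
    """Return a coupling of p and q with as much mass as possible on the diagonal.

    p and q are dicts mapping points to probabilities (each summing to 1).
    The result is a dict mapping pairs (x, y) to mass.
    """
    points = set(p) | set(q)
    # Mass that can stay in place at each point.
    common = {x: min(p.get(x, 0), q.get(x, 0)) for x in points}
    leftover = 1 - sum(common.values())
    m = {(x, x): c for x, c in common.items() if c > 0}
    if leftover > 0:
        # Spread the remaining mass of p over the remaining mass of q
        # as a (normalized) product measure; it never hits the diagonal.
        for x in points:
            for y in points:
                mass = (p.get(x, 0) - common[x]) * (q.get(y, 0) - common[y]) / leftover
                if mass > 0:
                    m[(x, y)] = m.get((x, y), 0) + mass
    return m


def transport_cost(m, d):
    """Integral of d(x, y) against the coupling m."""
    return sum(mass * d(x, y) for (x, y), mass in m.items())


def marginals(m):
    """First and second marginals of a coupling m."""
    first, second = {}, {}
    for (x, y), mass in m.items():
        first[x] = first.get(x, 0) + mass
        second[y] = second.get(y, 0) + mass
    return first, second


# Hypothetical example measures on the three-point set {a, b, c}.
p = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
q = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
m = maximal_coupling(p, q)
print(transport_cost(m, discrete_metric))  # prints 1/2
```

For a general metric $d$ the optimal coupling has no such closed form, and computing $d_W$ between finitely supported measures amounts to solving a linear program over $\Gamma(p,q)$.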

In the literature (for example, Villani, "Topics in Optimal Transportation"), I have always read that if $X$ is Polish, then $d_W$ metrizes weak convergence. However, is separability of $X$ really necessary?

Best Answer

If $X$ is not separable then the topology of weak convergence on $P(X)$ may not be metrizable at all.

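I don't know of a simple ZFC example, but suppose, for instance, that $X$ is a measurable cardinal and $d$ is the discrete metric. Let $\mu$ be the corresponding $\{0,1\}$-valued probability measure (countably additive, defined on all subsets of $X$, and vanishing on singletons), and let $C$ be the set of "atomic" probability measures, i.e. those with countable support. It's not hard to show that $C$ is weakly dense in $P(X)$ (this is true for any metric space), but in this space $C$ is also weakly sequentially closed. Yet $\mu \notin C$, since $\mu$ gives measure zero to every countable set. So $C$ is sequentially closed but not closed, which cannot happen in a metrizable topology.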
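
To spell out the sequential closedness claim (a sketch, with $\nu_n$, $\nu$, $S_n$ and $S$ being names introduced only for this argument): suppose $\nu_n \in C$ converges weakly to $\nu$, and let $S_n$ be a countable set with $\nu_n(S_n) = 1$. Put $S := \bigcup_n S_n$, which is again countable. Since $d$ is discrete, every bounded function on $X$ is continuous, so the indicator $1_S$ is a valid test function and
$$
\nu(S) = \int_X 1_S \, d\nu = \lim_{n\to\infty} \int_X 1_S \, d\nu_n = \lim_{n\to\infty} \nu_n(S) = 1\;.
$$
Hence $\nu \in C$. On the other hand, $\mu$ is countably additive and vanishes on singletons, so $\mu(S) = 0$ for every countable $S$, and no sequence in $C$ can converge weakly to $\mu$.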
