Kantorovich-Rubinstein theorem

calculus-of-variationsoptimal-transport

This thread is meant to record a question that I feel interesting during my self-study. I'm very happy to receive your suggestion and comments.


Let $X,Y$ be topological spaces. Let $\mathcal P(X), \mathcal P(Y)$ be the spaces of all Borel probability measures on $X,Y$ respectively. Let $c: X \times Y \to [0, +\infty]$. Fix $\mu \in \mathcal P(X)$ and $\nu \in \mathcal P(Y)$.

  • $\Pi(\mu, \nu)$ is the set of $\pi \in \mathcal P(X \times Y)$ such that for all measurable subsets $A \subset X$ and $B \subset Y$,
    $$
    \pi (A \times Y) = \mu (A) \quad \text{and} \quad \pi (X \times B) = \nu (B).
    $$

  • $\Phi_{c}$ is the set of all $(\varphi, \psi) \in L_1 (\mu) \times L_1 (\nu)$ satisfying
    $$
    \varphi(x)+\psi(y) \leq c(x, y)
    $$

    for $\mu$-a.e. $x \in X$ and $\nu$-a.e. $y \in Y$.

  • For $\pi \in \mathcal P(X \times Y)$ and $(\varphi, \psi) \in L_1 (\mu) \times L_1 (\nu)$, let
    $$
    \mathbb K (\pi) := \int_{X \times Y} c d \pi \quad \text{and} \quad \mathbb J(\varphi, \psi) := \int_{X} \varphi d \mu+\int_{Y} \psi d \nu .
    $$

Kantorovich-Rubinstein: Let $X = Y$ be Polish spaces and and $c$ a lower semi-continuous metric. Define
$$
\|f\|_{\mathrm{Lib}} = \sup_{x\neq y} \frac{|f(x) – f(y)|}{c(x, y)}.
$$

Assume that there exists $(a,a) \in X \times Y$ such that $c( \cdot, a) \in L_1(\mu)$ and $c(a, \cdot) \in L_1(\nu)$. Then
$$
\min_{\pi \in \Pi (\mu, \nu)} \mathbb K (\pi) = \sup \left \{ \int_X f d (\mu – \nu) \,\middle\vert\, f \in L_1 (|\mu- \nu|) , \|f\|_{\mathrm{Lib}} \le 1 \right \}.
$$

Best Answer

  1. $c$ is bounded, i.e., $\sup_{x, y\in X} c(x, y) < C$ for some $C \in \mathbb R$.

Then $\|f\|_{\mathrm{Lib}} \le 1$ implies $f$ is bounded, and thus $f \in L_1 (|\mu- \nu|)$ and $(-f, f) \in \Phi_{c}$. By Kantorovich duality, it suffices to prove $$ \sup _{\Phi_{c}} \mathbb J(\varphi, \psi) = \sup \left \{ \int_X f d (\mu - \nu) \,\middle\vert\, f \in L_1 (|\mu- \nu|) , \|f\|_{\mathrm{Lib}} \le 1 \right \}. $$

If $f \in L_1 (\mu)$, then $$ f^c:Y \to \mathbb R \cup \{\pm \infty\}, y \mapsto \inf_{x \in X} [c(x, y) - f(x)] $$ is such that $\|f^c\|_{\mathrm{Lib}} \le 1$ and $f^{cc} = -f^c$. By this lemma, we have $$ \sup _{\Phi_{c}} \mathbb J(\varphi, \psi) = \sup _{f \in L_1 (\mu)} \mathbb J(f^{cc}, f^c) = \sup _{f \in L_1 (\mu)} \mathbb J(-f^{c}, f^c) \le \sup _{\|f\|_{\mathrm{Lib}} \le 1} \mathbb J(-f, f) \le \sup _{\Phi_{c}} \mathbb J(\varphi, \psi). $$

  1. $c$ is unbounded.

We have $$ \sup \left \{ \int_X f d (\mu - \nu) \,\middle\vert\, f \in L_1 (|\mu- \nu|) , \|f\|_{\mathrm{Lib}} \le 1 \right \} \le \sup _{\Phi_{c}} \mathbb J(\varphi, \psi). $$

It remains to prove $$ \sup \left \{ \int_X f d (\mu - \nu) \,\middle\vert\, f \in L_1 (|\mu- \nu|) , \|f\|_{\mathrm{Lib}} \le 1 \right \} \ge \sup _{\Phi_{c}} \mathbb J(\varphi, \psi). $$

Let $c_n := \min\{n, c\}$. Then $c$ is a bounded metric on $X$ such that $c_n \nearrow c$. By this result, we have $$ \lim_n \inf_{\pi \in \Pi(\mu, \nu)} \int c_n d\pi = \inf_{\pi \in \Pi(\mu, \nu)} \int c d\pi = \sup _{\Phi_{c}} \mathbb J(\varphi, \psi). $$

On the other hand, $$ \begin{align} \inf_{\pi \in \Pi (\mu, \nu)} \int_{X \times Y} c_n d \pi &= \sup \left \{ \int_X f d (\mu - \nu) \,\middle\vert\, f \in L_1 (|\mu- \nu|) , \|f\|_{\mathrm{Lib}, c_n} \le 1 \right \} \\ &\le \sup \left \{ \int_X f d (\mu - \nu) \,\middle\vert\, f \in L_1 (|\mu- \nu|) , \|f\|_{\mathrm{Lib}} \le 1 \right \}. \end{align} $$

This completes the proof.