Constructing the $c$-Transform – Optimal Transport Theory

optimal-transport

Disclaimer

This thread is meant to record. See: SE blog: Answer own Question and MSE meta: Answer own Question.

Anyway, it is written as problem. Have fun! 🙂


Let $X,Y$ be Polish spaces and $c:X \times Y \to \mathbb [0, +\infty)$ lower semi-continuous. We fix Borel probability measures (b.p.m.) $\mu \in \mathcal P(X)$ and $\nu \in \mathcal P(Y)$.

  • Let $\Phi_c$ the collection of $(\varphi: X \to \mathbb R, \psi: Y \to \mathbb R) \in L_1(\mu) \times L_1(\nu)$ such that $\varphi(x)+\psi(y) \le c(x,y)$ for $\mu$-a.e. $x\in X$ and $\nu$-a.e. $y\in Y$.

  • Let
    $$
    J (\varphi, \psi) := \int_X \varphi d \mu + \int \psi d \nu \quad \forall (\varphi, \psi) \in \Phi_c.
    $$

  • Now assume a reglarity condition that there are $c_X: X \to \mathbb R$ and $c_Y: Y \to \mathbb R$ such that $c_X \in L_1(\mu), c_Y \in L_1 (\nu)$, and $c (x, y) \le c_X(x)+c_Y(y)$ for $\mu$-a.e. $x\in X$ and $\nu$-a.e. $y\in Y$.

Then for each $(\varphi, \psi) \in \Phi_c$, there is an improved pair $(\varphi', \psi') \in \Phi_c$ such that

  • $J (\varphi, \psi) = J (\varphi', \psi')$.
  • $\varphi' (x) \le c_X (x)$ and $\psi' (y) \le c_Y (y)$ for $\mu$-a.e. $x\in X$ and $\nu$-a.e. $y\in Y$.

Such pair is constructed by the method of $c$-transform and is used to construct an optimal solution of the dual problem in Kantorovich duality.

Best Answer

Fix $(\varphi, \psi) \in \Phi_c$.

  • There are a $\mu$-null set $N_x$ and $\nu$-null net $N_y$ such that $\varphi(x)+\psi(y) \le c(x,y)$ for all $(x,y) \in N_x^c \times N_y^c$.
  • We re-define $\varphi, \psi$ by letting them take value $-\infty$ on $N_x, N_y$ respectively. In this way, $\varphi(x)+\psi(y) \le c(x,y)$ for all $(x,y) \in X \times Y$.

We define $\varphi^c$ by $$ \varphi^c (y) := \inf_{x\in X} [c(x,y) - \varphi(x)]. $$

  • The infimum of a collection of extended real-valued measurable functions is again a measurable function, so $\varphi^c$ is measurable.
  • There is $x_0 \in N^c_x$, so $\varphi^c (y) \le c(x_0,y)-\varphi(x_0) <+\infty$ for all $y \in Y$.
  • Given $y\in Y$, $\varphi(x)+\psi(y) \le c(x,y)$ for all $x \in X$, so $\varphi^c (y) \ge \psi(y)$ for all $y\in Y$.

We define $\varphi^{cc}$ by $$ \varphi^{cc} (x) := \inf_{y\in Y} [c(x,y) - \varphi^c (y)]. $$

For all $x\in X$, $$ \begin{align} \varphi^{cc} (x) &= \inf_{y\in Y} \left [c(x,y) - \inf_{z\in X} [c(z,y) - \varphi(z)] \right] \\ &= \inf_{y\in Y} \left [c(x,y) + \sup_{z\in X} [-c(z,y) + \varphi(z)] \right] \\ &= \inf_{y\in Y} \sup_{z\in X} [c(x,y) -c(z,y) + \varphi(z)] \\ &\ge \inf_{y\in Y} \varphi(x) =\varphi(x) \quad \text{by picking} \quad z=x. \end{align} $$

There is $y_0 \in N_y^c$. Then $$ \varphi^{cc} (x) \le c(x, y_0) - \varphi^c (y_0) \le c(x, y_0) - \psi(y_0) <+\infty \quad \forall x\in X. $$ For all $(x,y) \in X \times Y$, $$ \varphi^{cc} (x) + \varphi^c (y) = \inf_{z\in Y} [c(x,z) - \varphi^c(z)] + \varphi^c (y) \le [c(x,y) - \varphi^c (y)] + \varphi^c (y)= c(x,y). $$


  • There are a $\mu$-null set $M_x$ and $\nu$-null set $M_y$ such that $c (x, y) \le c_X(x)+c_Y(y)$ for all $(x, y) \in M_x^c \times M_y^c$.
  • We re-define $c_X, c_Y$ such that $c_X (x) = c_Y(y) := +\infty$ for all $(x, y) \in M_x \times M_y$. In this way, $c(x,y) \le c_X(x)+c_Y(y)$ for all $(x,y) \in X \times Y$.

Let $$ a := \inf_{y\in Y} [c_Y(y) - \varphi^c (y)]. $$

First, $$ a \le c_Y(y_0) - \varphi^c (y_0) \le c_Y(y_0) - \psi (y_0) < +\infty. $$

There is $x_1 \in (N_x \cup M_x)^c$, so $$ \begin{align} c_Y(y) - \varphi^c (y) &= c_Y(y) - \inf_{x\in X} [c(x,y) - \varphi(x)] \\ &= \sup_{x\in X} [c_Y(y)-c(x,y) + \varphi(x)] \\ &\ge \sup_{x\in X} [\varphi(x)-c_X(x)] \\ &\ge \varphi(x_1)-c_X(x_1) \\ &> -\infty. \end{align} $$

It follows that $a \in \mathbb R$. Clearly, $a \le c_Y(y) - \varphi^c (y)$ for all $y\in Y$, so $\varphi^c+a \le c_Y$. We have $$ \begin{align} (\varphi^{cc} (x)-a) - c_X(x) &= \inf_{y\in Y} [c(x,y) - \varphi^c (y)]-a - c_X(x) \\ &= \inf_{y\in Y} [(c(x,y)-c_X(x)) - \varphi^c (y)] -a \\ &\le \inf_{y\in Y} [c_Y(y) - \varphi^c (y)] -a \\&=0. \end{align} $$

Clearly, $(\varphi', \psi') := (\varphi^{cc}-a, \varphi^c+a)$ satisfies the requirement.

Related Question