[Math] Is an ambiguity set with Wasserstein distance of order 1 is convex

convex-geometryconvexityoptimal-transportation

I have a question about the convexity of an Wasserstein ambiguity set.

Let $W_1(\mu, \nu)$ be the Wasserstein distance of order 1 between $\mu$ and $\nu$, defined as
$$W_1(\mu, \nu) := \min\limits_{\gamma \in \Gamma(\mu, \nu)} \bigg \{ \int_{\Xi \times \Xi} d(\xi, \zeta) \gamma(d\xi, d\zeta) \bigg \}, $$ where $\Gamma(\mu, \nu)$ denotes the set of all probability measures on $\Xi \times \Xi$ with marginals $\mu$ and $\nu$.

Let $\nu$ be the empirical distribution. The Wasserstein ambiguity set $\mathcal{M}$ is defined by $$\mathcal{M} := \{ \mu \in \mathcal{P}(\Xi) : W_1(\mu, \nu) \leq \theta \},$$ where $\theta$ is a given radius.

I am curious about whether the set $\mathcal{M}$ is convex. I notice that Wasserstein distance satisfies the triangle inequality, but I'm not sure that the set $\mathcal{M}$ is convex.

Is the Wasserstein ambiguity set of order 1 always convex?

Best Answer

Edit: (This question may be better suited for math.stackexchange.)

This is true. It suffices to show the map $\mu\rightarrow W_{1}(\mu,\nu)$ is convex. Let $\mu_{i}\in\mathcal{P}(\Xi)$ and $\psi^{*}_i\in\Gamma(\mu_{i},\nu)$ be optimal transport plans between $\mu_{i}$ and $\nu$ for each $i=1,2$. Then $\lambda\psi_{1}^{*}+(1-\lambda)\psi_{2}^{*}\in\Gamma(\lambda\mu_{1}+(1-\lambda)\mu_{2},\nu)$ and so

$W_{1}(\lambda\mu_{1}+(1-\lambda)\mu_{2},\nu)=\inf\limits_{\gamma\in\Gamma(\lambda\mu_{1}+(1-\lambda)\mu_{2},\nu)} \int d(x,y) d\gamma(x,y)\leq \\ \int d(x,y) d(\lambda\psi_{1}^{*}+(1-\lambda)\psi_{2}^{*})(x,y) = \lambda W_{1}(\mu_{1},\nu)+(1-\lambda)W_{1}(\mu_{2},\nu)$

which gives convexity. Thus, $$W_{1}(\lambda\mu_{1}+(1-\lambda)\mu_{2},\nu)\leq \lambda W_{1}(\mu_{1},\nu)+(1-\lambda)W_{1}(\mu_{2},\nu)\leq \theta.$$

When restrictions are made such as replacing $\mathcal{P}(\Xi)$ with some other set of probability measures, as is done in the paper here, the answer can be no:

[For $\mathcal{P}(\Xi)$ replaced with $\mathcal{N}(\Xi)$, Gaussian distributions]...the Wasserstein ambiguity set $P$ is nonconvex (as mixtures of normal distributions are generically not normal).

Related Question