[Math] 1-Wasserstein distance vs. total variation distance

Tags: pr.probability, probability-distributions, st.statistics

Suppose that $\mu_1$ and $\mu_2$ are two probability distributions on $\mathbb{R}^n$ and that $\gamma$ is a distribution on $\mathbb{R}^n$ that is symmetric around $0$ and has compact support. Let $\gamma_x$ denote the distribution obtained by translating the centre of $\gamma$ from $0$ to $x$, let $d_{TV}(\cdot,\cdot)$ denote the total variation distance, and let $d_W(\cdot,\cdot)$ denote the 1-Wasserstein distance.

Question: Does it hold that
$$
d_{TV}(\mu_1\ast\gamma,\mu_2\ast\gamma) \leq \left(\sup_{x\neq y} \frac{d_{TV}(\gamma_x,\gamma_y)}{\|x-y\|} \right)\cdot d_W(\mu_1,\mu_2)?
$$
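For intuition, the constant in parentheses is easy to compute for one concrete choice (an illustration, not part of the question as asked): take $n=1$ and $\gamma$ uniform on $[-a,a]$. Then $\gamma_x$ and $\gamma_y$ are uniform on overlapping intervals of length $2a$, so
$$
d_{TV}(\gamma_x,\gamma_y)=\frac{\min(|x-y|,\,2a)}{2a},
\qquad
\sup_{x\neq y}\frac{d_{TV}(\gamma_x,\gamma_y)}{|x-y|}=\frac{1}{2a}.
$$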

Best Answer

Yes.

I presume that your "1-Wasserstein" distance is what is otherwise called the transportation metric.

Let $M$ be any coupling of $\mu_1$ and $\mu_2$, i.e., a measure on $\mathbb{R}^n\times\mathbb{R}^n$ with marginals $\mu_1$ and $\mu_2$, so that $\mu_1=\int \delta_x\,dM(x,y)$ and $\mu_2=\int\delta_y\,dM(x,y)$, whence
$$
\mu_1-\mu_2=\int (\delta_x-\delta_y)\,dM(x,y).
$$
Taking the convolution with $\gamma$ and passing to the total variation norm yields
$$
\| \mu_1\ast\gamma-\mu_2\ast\gamma \| \le \int \| \gamma_x-\gamma_y \|\,dM(x,y) \le K \int \|x-y\|\,dM(x,y) \;,
$$
where $K$ is the $\sup$ from your question (none of the additional conditions on $\gamma$ are required). Taking for $M$ a coupling which realizes the transportation distance between $\mu_1$ and $\mu_2$, one gets the claim.
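As a numerical sanity check (not part of the original answer), the inequality can be verified in one dimension with $\gamma$ uniform on $[-a,a]$, for which the constant is $K=1/(2a)$. All concrete distributions and grid parameters below are illustrative choices.

```python
import numpy as np

# One-dimensional sanity check of
#   d_TV(mu1 * gamma, mu2 * gamma) <= K * d_W(mu1, mu2),
# with gamma uniform on [-a, a], for which K = 1/(2a).
# The grid spacing, atom locations, and weights are illustrative.

dx = 0.001
x = np.arange(0.0, 4.0, dx)

# mu1 and mu2: two-point mixtures placed on the grid
mu1 = np.zeros_like(x)
mu2 = np.zeros_like(x)
mu1[round(0.5 / dx)] = 0.7
mu1[round(2.0 / dx)] = 0.3
mu2[round(1.0 / dx)] = 0.4
mu2[round(2.5 / dx)] = 0.6

# gamma: uniform on [-a, a] (symmetric around 0, compact support)
a = 0.3
half = round(a / dx)
g = np.full(2 * half + 1, 1.0 / (2 * half + 1))

# Convolutions mu_i * gamma (mode='full' keeps all the mass)
c1 = np.convolve(mu1, g)
c2 = np.convolve(mu2, g)

# Total variation distance = half the l1 distance between the mass vectors
tv = 0.5 * np.abs(c1 - c2).sum()

# 1-Wasserstein distance on the line = integral of |F1 - F2| (CDF formula)
w1 = np.abs(np.cumsum(mu1) - np.cumsum(mu2)).sum() * dx

# Lipschitz constant sup_{x != y} d_TV(gamma_x, gamma_y) / |x - y| for this gamma
K = 1.0 / (2 * a)

print(f"TV of convolutions: {tv:.4f}, bound K * W1: {K * w1:.4f}")
```

Up to discretization error, the left-hand side here is about $0.88$ while the bound $K\,d_W(\mu_1,\mu_2)$ is about $1.58$, so the inequality holds with room to spare.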
