Lipschitz inequality with Rademacher variables

Tags: inequality, probability, probability distributions, statistics

Suppose $(\epsilon_i)_{i = 1}^n$ are i.i.d. Rademacher variables, and let $(X_i)_{i = 1}^n$ be another sequence of random variables, independent of $(\epsilon_i)_{i = 1}^n$. Suppose also that $f$ is Lipschitz in its second argument: $|f(x, z_1) - f(x, z_2)| \leq L(x) |z_1 - z_2|$. What I want to do is bound the expectation
$$
\mathrm{E} \bigg| \sum_{i = 1}^n \epsilon_i \big( f(X_i, z_1) - f(X_i, z_2) \big) \bigg|.
$$

First, I simply used the fact that $\big| f(X_i, z_1) - f(X_i, z_2) \big| \leq L(X_i) |z_1 - z_2|$ to get the upper bound $|z_1 - z_2| \cdot \mathrm{E} \bigg| \sum_{i = 1}^n \epsilon_i L(X_i)\bigg|$. But I later realized that this argument may not be valid, since $\epsilon_i$ can be either positive or negative, so I cannot simply replace each term inside the absolute value by its bound.

So I wonder if there exists some constant $C$ such that
$$
\mathrm{E} \bigg| \sum_{i = 1}^n \epsilon_i \big( f(X_i, z_1) - f(X_i, z_2) \big) \bigg| \leq C \cdot |z_1 - z_2| \cdot \mathrm{E} \bigg| \sum_{i = 1}^n \epsilon_i L(X_i)\bigg|.
$$

Best Answer

I claim that in general, if $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are any real numbers such that $|a_i|\le |b_i|$ for all $i$, then $$ \mathsf E\left|\sum_{i=1}^n \varepsilon_i a_i\right|\le \mathsf E\left|\sum_{i=1}^n \varepsilon_i b_i\right|.\tag{1} $$
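Before the proof, here is a minimal numerical sanity check of (1), assuming NumPy; the vectors $a$ and $b$ are random illustrative choices, and both expectations are computed exactly by enumerating all $2^n$ sign vectors:

```python
import itertools

import numpy as np

# Exact evaluation of E|sum_i eps_i c_i| by enumerating all 2^n sign vectors.
def exact_mean_abs(c):
    return np.mean([abs(np.dot(signs, c))
                    for signs in itertools.product((-1.0, 1.0), repeat=len(c))])

rng = np.random.default_rng(0)
b = rng.uniform(-2.0, 2.0, size=8)
a = b * rng.uniform(-1.0, 1.0, size=8)  # guarantees |a_i| <= |b_i| for each i

print(exact_mean_abs(a), "<=", exact_mean_abs(b))  # inequality (1)
```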


To prove (1), first note that for all real numbers $x,y,z$ such that $|y|\le |z|$: $$ |x+y|+|x-y|\le |x+z|+|x-z|.\tag{2} $$ (Indeed, you can check that both sides are unchanged if you replace $x,y,z$ by their absolute values, and the inequality is quite easy to prove when $x,y,z\ge 0$.)
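To spell out the nonnegative case that the parenthetical leaves to the reader: if $x, y, z \ge 0$ and $y \le z$, then
$$
|x+y|+|x-y| = (x+y)+|x-y| = 2\max(x,y) \le 2\max(x,z) = |x+z|+|x-z|,
$$
since $\max(x,\cdot)$ is nondecreasing.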

Now we use (2) to bound, averaging over $\varepsilon_n = \pm 1$, $$ \mathsf E\left|\sum_{i=1}^n \varepsilon_i a_i\right|=\frac{1}{2}\,\mathsf E\left[\left|\sum_{i=1}^{n-1} \varepsilon_i a_i+a_n\right|+\left|\sum_{i=1}^{n-1} \varepsilon_i a_i-a_n\right|\right]\le \frac{1}{2}\,\mathsf E\left[\left|\sum_{i=1}^{n-1} \varepsilon_i a_i+b_n\right|+\left|\sum_{i=1}^{n-1} \varepsilon_i a_i-b_n\right|\right], $$ where the inequality is (2) applied pointwise with $x=\sum_{i=1}^{n-1} \varepsilon_i a_i$, $y=a_n$, $z=b_n$, and the claim follows by induction, as spelled out below.
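To make the induction step explicit: by the symmetry of $\varepsilon_n$, the right-hand side above equals
$$
\mathsf E\left|\sum_{i=1}^{n-1} \varepsilon_i a_i + \varepsilon_n b_n\right|,
$$
so the last coordinate has been upgraded from $a_n$ to $b_n$; repeating the same argument in coordinates $n-1, \ldots, 1$ yields (1).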


EDIT: In fact, I just noticed the following slightly more elegant argument. Observe that $$ (\alpha_1,\ldots,\alpha_n)\in[-1,1]^n\mapsto \mathsf E\left|\sum_{i=1}^n\varepsilon_i \alpha_i b_i\right| $$ is a convex function (it is an average of absolute values of functions linear in $(\alpha_1,\ldots,\alpha_n)$), so it attains its maximum at an extreme point $(\alpha_1,\ldots,\alpha_n)\in\{-1,1\}^n$ of the cube. But since $(\varepsilon_1, \ldots, \varepsilon_n)$ and $(\alpha_1\varepsilon_1,\ldots,\alpha_n\varepsilon_n)$ have the same distribution for any fixed signs $\alpha_i$, that maximum equals $\mathsf E\left|\sum_{i=1}^n\varepsilon_i b_i\right|$; applying this with $\alpha_i = a_i/b_i$ (and $\alpha_i = 0$ when $b_i = 0$, in which case $a_i = 0$ too) gives (1). Note that the proof can be readily generalized if $b_1, \ldots, b_n$ live in some Banach space (see the Ledoux-Talagrand book, *Probability in Banach Spaces*).


You can readily check that (1), applied conditionally on $X_1, \ldots, X_n$ with $a_i = f(X_i,z_1)-f(X_i,z_2)$ and $b_i = L(X_i)\,|z_1-z_2|$, implies your last inequality with $C=1$: $$ \mathsf E\left|\sum_{i=1}^n \varepsilon_i (f(X_i,z_1)-f(X_i,z_2))\right|\le |z_1-z_2|\,\mathsf E\left|\sum_{i=1}^n \varepsilon_i L(X_i)\right|. $$
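If helpful, here is a minimal Monte Carlo sanity check of this final bound, assuming NumPy; the choice $f(x,z)=\sin(xz)$ with $L(x)=|x|$ and the Gaussian law for the $X_i$ are illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 50, 100_000
z1, z2 = 0.3, 1.7

# f(x, z) = sin(x z) is Lipschitz in z with L(x) = |x| (mean value theorem).
X = rng.normal(size=(trials, n))                  # the X_i (any law works)
eps = rng.choice([-1.0, 1.0], size=(trials, n))   # independent Rademacher signs

lhs = np.abs((eps * (np.sin(X * z1) - np.sin(X * z2))).sum(axis=1)).mean()
rhs = abs(z1 - z2) * np.abs((eps * np.abs(X)).sum(axis=1)).mean()

print(f"LHS ~ {lhs:.3f}  <=  RHS ~ {rhs:.3f}")   # C = 1 suffices
```

On typical runs the left-hand estimate comes out well below the right-hand one, consistent with $C = 1$.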