[Math] Concentration bounds on weighted sum of i.i.d. Bernoulli random variables

measure-concentration, pr.probability

Let $X_1,\dots, X_n\sim\operatorname{Bern}(\frac{1}{2})$ be independent, identically distributed random variables, and $\alpha=(\alpha_1,\dots,\alpha_n)\in[0,1]^n$ a vector of non-negative weights satisfying $\sum_{k=1}^n \alpha_k = 1$.

Define the random variable $X\stackrel{\rm def}{=} \sum_{k=1}^n \alpha_k X_k$. Given a parameter $\delta > 0$, what is the best concentration bound I can get on $X$ in terms of the $p$-'norms' (allowing $p\in(0,1)$) of $\alpha$, i.e.
$$
\mathbb{P}\{ X \leq \delta\} \leq \varphi(\delta, (\lVert\alpha\rVert_p)_{p>0})
$$
?

I was thinking of using Berry–Esseen-like approximations to reduce this to studying a Gaussian random variable (leading to a bound in terms of $\frac{1}{\lVert \alpha\rVert_2^2}$), but this falls short of what I want (as, if I am not mistaken, this requires the vector $\alpha$ to be more or less uniform). Using Chernoff/Hoeffding bounds will get me a bound expressed in terms of $\frac{1}{\lVert \alpha\rVert_\infty}$, but it feels very loose.


Edit: using the Bernoulli/Rademacher conversion and the standard sub-Gaussian bounds for Rademacher sums, I get an upper bound of $e^{-\frac{(1-2\delta)^2}{2\lVert \alpha\rVert_2^2}}$, similar to what the Gaussian approximation would give. But I still don't really know if this is tight, that is, whether the "right" parameter here is indeed $\lVert \alpha\rVert_2$, or something more "exotic" that just happens to coincide with $\lVert \alpha\rVert_2$ in the "uniform-weights regime."
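As a quick sanity check (nothing rigorous), here is a small Monte Carlo experiment in Python comparing the empirical tail $\mathbb{P}\{X \leq \delta\}$ with the bound above; the weights, seed, and sample size are arbitrary illustrative choices:

```python
import numpy as np

# Illustrative weights (arbitrary choice): proportional to 1/k, normalized to sum to 1.
rng = np.random.default_rng(0)
n = 50
alpha = 1.0 / np.arange(1, n + 1)
alpha /= alpha.sum()

n_samples = 200_000
# Each row is one draw of (X_1, ..., X_n) with X_k ~ Bern(1/2); X is the weighted sum.
X = rng.integers(0, 2, size=(n_samples, n)) @ alpha

for delta in (0.2, 0.3, 0.4):
    empirical = np.mean(X <= delta)
    bound = np.exp(-(1 - 2 * delta) ** 2 / (2 * np.sum(alpha ** 2)))
    print(f"delta={delta:.1f}  empirical P(X <= delta)={empirical:.3e}  sub-Gaussian bound={bound:.3e}")
```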

Best Answer

An almost complete answer. First of all, the $\lVert \alpha\rVert_2$-based bound mentioned in the question can indeed be shown to be tight for many "simple" $\alpha$'s, such as balanced weights, or weights uniform on a subset of $m$ coordinates. However, it is not tight in general, and the right answer appears to be captured by a different quantity: the $K$-functional between $\ell_1$ and $\ell_2$.

Converting the problem to that of a weighted sum $Y\stackrel{\rm def}{=}\sum_{k=1}^n \alpha_k Y_k$ of Rademacher random variables $Y_1,\dots, Y_n$ (where $Y_k = 1-2X_k$, so that $Y = 1-2X$ since $\sum_{k=1}^n \alpha_k = 1$), we get $$ \mathbb{P}\{ X \leq \delta \} = \mathbb{P}\{ Y \geq 1-2\delta \}. $$ In Montgomery-Smith [1], the following is shown:

Theorem. For $c\stackrel{\rm def}{=} \frac{1}{4\ln 12}\simeq \frac{1}{10}$, we have $$\begin{align} \mathbb{P}\{ Y > K_{1,2}(\alpha,t) \} &\leq e^{-\frac{t^2}{2}}, \tag{1} \\ \mathbb{P}\{ Y > cK_{1,2}(\alpha,t) \} &\geq ce^{-\frac{t^2}{c}} \tag{2} \end{align}$$ where $K_{1,2} (\alpha,t) \stackrel{\rm def}{=} \inf\{\lVert \alpha'\rVert_1+t\lVert \alpha''\rVert_2 \ \colon\ \alpha'\in\ell_1,\ \alpha'' \in \ell_2,\ \alpha'+\alpha'' = \alpha\}$ is the $K$-functional between $\ell_1$ and $\ell_2$.

Moreover, the following estimate holds for $K_{1,2}(\alpha,t)$ ([1], citing Theorem 4.2 of [2]): assuming w.l.o.g. that $\alpha_1 \geq \alpha_2 \geq \dots\geq \alpha_n$, $$ \gamma\left( \sum_{k=1}^{\lfloor t^2\rfloor} \alpha_k + t\Bigl(\sum_{k=\lfloor t^2\rfloor+1}^n \alpha_k^2\Bigr)^{\frac{1}{2}} \right) \leq K_{1,2}(\alpha,t) \leq \sum_{k=1}^{\lfloor t^2\rfloor} \alpha_k + t\left(\sum_{k=\lfloor t^2\rfloor+1}^n \alpha_k^2\right)^{\frac{1}{2}} \tag{3} $$ for some absolute constant $\gamma \in (0,1)$ (the right-hand inequality is immediate, by splitting $\alpha$ along its $\lfloor t^2\rfloor$ largest coordinates).
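Since the middle expression in (3) dominates $K_{1,2}(\alpha,t)$, plugging it into (1) already gives a fully explicit tail bound. Here is a small Python sketch computing it; the function name and the example weights are mine, purely for illustration:

```python
import numpy as np

def holmstedt_proxy(alpha, t):
    """Explicit expression from (3): sum of the floor(t^2) largest weights,
    plus t times the l2 norm of the remaining ones.
    It lies between K_{1,2}(alpha, t) and K_{1,2}(alpha, t) / gamma."""
    a = np.sort(np.asarray(alpha, dtype=float))[::-1]   # decreasing rearrangement
    m = min(int(np.floor(t ** 2)), len(a))
    return a[:m].sum() + t * np.sqrt(np.sum(a[m:] ** 2))

# Example: weights uniform on the first 10 of 100 coordinates (illustrative only).
alpha = np.zeros(100)
alpha[:10] = 1.0 / 10
for t in (1.0, 2.0, 3.0):
    # By (1) and (3), P{ Y > holmstedt_proxy(alpha, t) } <= exp(-t^2/2).
    print(f"t={t}: threshold={holmstedt_proxy(alpha, t):.3f}, tail bound exp(-t^2/2)={np.exp(-t**2 / 2):.3e}")
```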

This gives the following corollary, which can be found in [1]:

Corollary. With $c$ as above, we have that for all $\alpha\in\ell_2$ (a condition trivially satisfied in our case, as $\alpha$ has finite support) and $0 < t \leq c\frac{\lVert \alpha\rVert_2^2}{\lVert \alpha\rVert_\infty}$, $$\begin{align} ce^{-\frac{t^2}{c^3\lVert \alpha\rVert_2^2}} \leq \mathbb{P}\{ Y >t \} \leq e^{-\frac{t^2}{2\lVert \alpha\rVert_2^2}}, \tag{4} \end{align}$$ where the upper bound holds for all $t>0$.

Since the range of interest is $t\in(0,1]$, the above shows that the $e^{-\Theta\left(\frac{t^2}{\lVert \alpha\rVert_2^2}\right)}$ dependence is tight as long as $\lVert \alpha\rVert_2^2$ and $\lVert \alpha\rVert_\infty$ are within constant factors of each other, which happens to be the case, for instance, when $\alpha$ is roughly balanced or uniform on a subset of coordinates.
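As a tiny (purely illustrative) numerical check of that last claim, the ratio governing the range of validity of the lower bound in (4) equals exactly $1$ for weights uniform on a subset of coordinates; the helper name below is mine:

```python
import numpy as np

def norm_ratio(alpha):
    """||alpha||_2^2 / ||alpha||_inf, which governs the range of t where the lower bound in (4) applies."""
    a = np.asarray(alpha, dtype=float)
    return np.sum(a ** 2) / np.max(a)

# Weights uniform on m out of n coordinates: the ratio is exactly 1, whatever m is.
m, n = 10, 1000
alpha = np.r_[np.full(m, 1.0 / m), np.zeros(n - m)]
print(norm_ratio(alpha))   # 1.0
```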

Now, for a counterexample, one can consider (a variant of) the "truncated random harmonic series," where we set, for $1\leq k\leq n$, $$ \alpha_k \stackrel{\rm def}{=} \frac{1}{k H_n} \operatorname*{\sim}_{n\to \infty} \frac{1}{k\ln n}. $$

Then, using techniques similar to those of [1], one can show that $$ \mathbb{P}\{ Y > t \} \leq e^{-\Theta(n^t)} $$ for $t\in(0,1)$. However, $ \lVert \alpha\rVert_2^2 {\displaystyle\operatorname*{\sim}_{n\to\infty}} \frac{\pi^2}{6\ln^2 n} $, so (4) would only yield $$ \mathbb{P}\{ Y > t \} \leq e^{-\Theta( t^2 \ln^2 n )}. $$

Note that (sanity check) in this case $\lVert \alpha\rVert_\infty = \frac{1}{H_n}$, so the lower bound of (4) does not apply: $\frac{\lVert \alpha\rVert_2^2}{\lVert \alpha\rVert_\infty} {\displaystyle\operatorname*{\sim}_{n\to\infty}} \frac{\pi^2}{6\ln n} = o(1)$.
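For a purely numerical sanity check of these asymptotics (with an arbitrary illustrative choice of $n$):

```python
import numpy as np

# Truncated harmonic weights alpha_k = 1/(k H_n), which sum to 1.
n = 10_000
k = np.arange(1, n + 1)
H_n = np.sum(1.0 / k)
alpha = 1.0 / (k * H_n)

l2_sq = np.sum(alpha ** 2)         # should be close to pi^2 / (6 ln^2 n)
linf = alpha.max()                 # equals 1 / H_n
print("||alpha||_2^2               =", l2_sq, " vs pi^2/(6 ln^2 n) =", np.pi ** 2 / (6 * np.log(n) ** 2))
print("||alpha||_2^2/||alpha||_inf =", l2_sq / linf, " vs pi^2/(6 ln n)  =", np.pi ** 2 / (6 * np.log(n)))
```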


Edit: based on results from [3] (adapting and generalizing the proof of Theorem 2.2), one can get the following refinement of (2) above:

Theorem. Fix any $c>0$. Then, for any $\alpha\in\ell_1+\ell_2$, we have that for all $t\geq 1$, $$ \mathbb{P}\left\{ Y \geq \frac{1}{1+c}K_{1,2}(\alpha,t) \right\} \geq e^{-\left( \frac{2}{c}\ln\frac{\sqrt{6}(1+c)}{c}\right) (t+c)^2 }. $$ In particular, taking $c=1$ and using that $(t+1)^2 \leq 4t^2$ for $t\geq 1$, $$ \mathbb{P}\left\{ Y \geq \frac{1}{2}K_{1,2}(\alpha,t) \right\} \geq e^{-\left( \ln 24 \right) (t+1)^2 } \geq e^{-\left( 4\ln 24 \right)t^2 }. $$


[1] Montgomery-Smith, S.J. (1990). The distribution of Rademacher sums. Proc. Amer. Math. Soc. 109:517–522.

[2] Holmstedt, T. (1970). Interpolation of quasi-normed spaces. Math. Scand. 26:177–199.

[3] Astashkin, S.V. (2010). Rademacher functions in symmetric spaces. Journal of Mathematical Sciences 169(6):725–886.
