Quantitative law of large numbers for non-identically distributed random variables

law-of-large-numbersprobabilityprobability distributionsprobability theoryprobability-limit-theorems

I don't know much about probability so this may be an easy consequence of some well known theorem.

Suppose we have a sequence of independent, $\mathbb{C}$-valued random variables $(\xi_n)_{n\in\mathbb{N}}$ supported in the disk $\overline{D(0,2)}$ (so $P\left(|\xi_n|\leq2\right)=1$ for all $n$) and such that $\xi_n$ has mean $0$ for all $n$.

Then, one can directly check that the mean of $\left|\frac{1}{N}\sum_{n=1}^N\xi_n\right|^2$ is $\frac{1}{N}$. This implies that for each fixed $\varepsilon>0$ and $N\in\mathbb{N}$, $$P\left(\left|\frac{1}{N}\sum_{n=1}^N\xi_n\right|>\sqrt{\varepsilon}\right)\leq f(N):=\frac{1}{\varepsilon N}.$$

Which proves a weak law of large numbers for this sequence of random variables. However, the function $f(N)=\frac{1}{\varepsilon N}$ decreases very slowly. Is it possible to obtain an upper bound function $f(N)$ which decreases much faster? Maybe exponentially?

Best Answer

Yes, an exponential upper bound is possible by an elementary application of Hoeffding's inequality : indeed, if $(\xi_n)_{n\ge1} $ is a family of centered, independent random variables with modulus bounded by $2$, it follows that both $(\Re\xi_n)_{n\ge 1}$ and $(\Im\xi_n)_{n\ge 1}$ are respectively centered, independent random variables with absolute value bounded by 2.

So now, if we denote $S_N := \sum_{n=1}^N\xi_n$, we have $$\begin{align}\mathbb P[|S_N|> N\sqrt\varepsilon]&\le \mathbb P[|\Re S_N| + |\Im S_N|> N\sqrt\varepsilon]\tag 1\\ &\le\mathbb P[|\Re S_N| > N\sqrt\varepsilon/2] + \mathbb P[|\Im S_N|> N\sqrt\varepsilon/2]\tag 2\\ &\le 4\exp\left(-\frac{2N^2\varepsilon}{4\sum_{n=1}^N(4)^2}\right)\tag3\\ &= 4\exp\left(-\frac{1}{32}N\varepsilon\right)\tag4\end{align} $$ Where

  • in $(1)$ we used the inequality $\sqrt{a+b}\le\sqrt a + \sqrt b $ with $a := (\Re S_N)^2$, $b:=(\Im S_N)^2$ and the fact that if $X\le Y$, then we have the inclusion of events $\{X>\kappa\}\subseteq\{Y>\kappa\}$,
  • in $(2)$ we used the inclusion $\{X+Y \ge \kappa\}\subseteq \{X \ge \kappa/2\}\cup\{Y \ge \kappa/2\}$ and the union bound,
  • in $(3)$ we used Hoeffding's inequality.

Note that the bound $(4)$ we obtained in this way is not exactly sharp. For sharper bounds, a trick you can use is representing your complex numbers as $2\times2$ matrices and applying some kind of matrix concentration inequalities (see also this paper by Joel Tropp for a review).

While googling, I also found this generalization of Hoeffding's inequality to complex-valued random variables due to Isaev and McKay (2016), which could possibly improve bound $(4)$ as well :

Theorem : If $Z$ is a complex random variable with $|Z-\mathbb EZ|\le\alpha $ a.s., then $$\left|\mathbb E e^{Z-\mathbb E Z} - 1\right|\le e^{\alpha^2/2}-1 $$

Related Question