Intuition needed for this question
Some intuition on the difference between a-priori and a-posteriori:
Let $\{Y_1, ..., Y_4\}$ be i.i.d. uniform random variables over $[0,1]$. So for each $m \in \{1, ..., 4\}$ we have
$$P[Y_m>15/16]=1/16 = 0.0625$$
A-priori:
Let's a-priori pick an index $K \in \{1, 2, 3, 4\}$, independent of the $Y_i$ variables and before observing these variables. We can use any mass function $P[K=m]$ for $m \in \{1, 2, 3, 4\}$. What is $P[Y_K>15/16]$? It is still $1/16$ (regardless of the mass function we use) because, by the law of total probability:
\begin{align}
P[Y_K>15/16] &= \sum_{m=1}^4P[Y_K>15/16|K=m]P[K=m]\\
&=\sum_{m=1}^4\underbrace{P[Y_m>15/16|K=m]}_{P[Y_m>15/16]}P[K=m] \\
&=\sum_{m=1}^4 (1/16)P[K=m]\\
&=1/16
\end{align}
where the equality $P[Y_m>15/16|K=m]=P[Y_m>15/16]$ holds because $Y_m$ is independent of $K$.
A-posteriori:
Now let's a-posteriori pick the index $K$ associated with the largest $Y_m$ value, so $Y_K = \max[Y_1, Y_2, Y_3, Y_4]$. This choice of index $K$ depends on the observed data. Then:
\begin{align}
P[Y_K>15/16] &= 1-P[Y_1\leq 15/16, Y_2\leq 15/16, Y_3\leq 15/16,Y_4\leq 15/16]\\
&=1-(15/16)^4 \\
&\approx 0.2275
\end{align}
and so this probability is much larger than 1/16, even though $Y_K$ is just one of the $Y_m$ values and we know $P[Y_m>15/16]=1/16$ for all $m \in \{1, ..., 4\}$.
On the other hand, we know by the union bound that
$$\{Y_K>15/16\} \subseteq \cup_{m=1}^4 \{Y_m>15/16\} \implies P[Y_K>15/16]\leq \sum_{m=1}^4P[Y_m>15/16]=1/4 $$
and indeed $0.2275 \leq 1/4$.
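The gap between the a-priori and a-posteriori probabilities is easy to verify numerically. Here is a minimal Monte Carlo sketch in Python (the trial count and seed are arbitrary choices, not part of the argument):

```python
import random

random.seed(0)
TRIALS = 200_000
apriori_hits = 0       # index K chosen before observing the data
aposteriori_hits = 0   # index K = argmax chosen after observing the data

for _ in range(TRIALS):
    y = [random.random() for _ in range(4)]  # Y_1, ..., Y_4 i.i.d. Uniform[0,1]
    k = random.randrange(4)                  # a-priori index, independent of y
    if y[k] > 15 / 16:
        apriori_hits += 1
    if max(y) > 15 / 16:                     # a-posteriori index: Y_K = max Y_m
        aposteriori_hits += 1

print(apriori_hits / TRIALS)      # close to 1/16 = 0.0625
print(aposteriori_hits / TRIALS)  # close to 1 - (15/16)^4 ~ 0.2275
```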
Your specific setup
As in my above comment, I believe we need the following extra assumptions: We have a finite set $\mathcal{X}$ and an unknown function $f:\mathcal{X}\rightarrow\mathbb{R}$. We are certain that $f$ is one of the $M$ functions in the known set $\{h_1, ..., h_M\}$. We can evaluate $f$ exactly, one point at a time. However, the set $\mathcal{X}$ is too large to evaluate exhaustively, so we resort to a probabilistic estimate. At every time step $i$ we independently choose $X_i \in \mathcal{X}$, uniformly over all points in $\mathcal{X}$, and then observe the value $f(X_i)$ (with no noise).
So for a given function $h_m$ we define $\phi_m$ by:
$$ \phi_m = P[f(X_i) \neq h_m(X_i)] = \frac{\mbox{number of points $x \in \mathcal{X}$ such that $f(x)\neq h_m(x)$}}{|\mathcal{X}|}$$
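As a toy illustration of this definition (the set $\mathcal{X}$, the "true" $f$, and the single candidate $h$ below are all hypothetical, chosen only for this example), $\phi_m$ is just a disagreement fraction over $\mathcal{X}$:

```python
# Hypothetical tiny example: X = {0, ..., 9}.
X = range(10)

def f(x):   # the unknown "true" function, assumed here for illustration
    return x % 2

def h(x):   # one candidate hypothesis h_m, assumed here for illustration
    return 0

# phi_m = (number of x in X with f(x) != h(x)) / |X|
phi = sum(f(x) != h(x) for x in X) / len(X)
print(phi)  # 0.5: h disagrees with f on the 5 odd points of X
```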
For each $m \in \{1, ..., M\}$ define the sequence of indicator functions $\{I^{(m)}_i\}_{i=1}^{\infty}$ by
$$ I^{(m)}_i = \left\{ \begin{array}{ll}
1 &\mbox{ if $h_m(X_i) \neq f(X_i)$} \\
0 & \mbox{ otherwise}
\end{array}
\right.$$
For any fixed $m \in \{1, ..., M\}$ the $\{I^{(m)}_i\}_{i=1}^{\infty}$ indicators are i.i.d. so we can apply Hoeffding. Note that for each individual $m$, Hoeffding is a bound on the following a-priori probability:
$$P\left[\left|\frac{1}{N}\sum_{i=1}^NI_i^{(m)} - \phi_m\right|>\epsilon\right] \leq 2e^{-2\epsilon^2 N} \quad (Eq. 1)$$
and it is nice that the bound on the right-hand-side does not depend on the index $m$.
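One can check (Eq. 1) empirically for a single fixed $m$ by treating the indicators as i.i.d. Bernoulli($\phi_m$) draws. In the sketch below the parameters ($\phi_m = 0.3$, $N = 100$, $\epsilon = 0.1$) are arbitrary choices for illustration:

```python
import math
import random

random.seed(1)
# Hypothetical parameters: phi is the true disagreement probability for one
# fixed hypothesis h_m; N samples per experiment; eps is the tolerance.
phi, N, eps = 0.3, 100, 0.1
TRIALS = 100_000

deviations = 0
for _ in range(TRIALS):
    errors = sum(random.random() < phi for _ in range(N))  # sum of I_i^(m)
    if abs(errors / N - phi) > eps:
        deviations += 1

empirical = deviations / TRIALS
bound = 2 * math.exp(-2 * eps**2 * N)  # the Hoeffding bound in (Eq. 1)
print(empirical, bound)  # empirical frequency stays below the bound
```

The Hoeffding bound is typically far from tight (here it is roughly an order of magnitude above the empirical frequency), but it holds uniformly for every fixed $m$.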
Your specific questions
1) Your first question asks why Hoeffding requires a fixed function $h$ rather than a random function $H$. It is because Hoeffding applies to i.i.d. random variables. If we fix an index $m \in \{1, ..., M\}$ then the indicators $\{I^{(m)}_1, I^{(m)}_2, I^{(m)}_3, ...\}$ are i.i.d. over the indices $i \in \{1, 2, 3, ...\}$. If instead we have a random index $K$ then the indicators $\{I^{(K)}_1, I^{(K)}_2, I^{(K)}_3, ...\}$ are not necessarily independent, because they share the common random index $K$.
2-4) Your remaining questions boil down to the difference between a-priori probability and a-posteriori probability. The Hoeffding bounds in (Eq. 1) are a-priori bounds that hold for all $m \in \{1, ..., M\}$. They are bounds on the probability the data behaves in a certain way. That probability (and its bound) is determined without observing the actual data outcome (in the same way that the probability of a fair coin flip being heads is 1/2, and this probability is determined without looking at the outcome of the flip).
If we a-priori pick a random index $K \in \{1, ..., M\}$ (before observing the data and independent of the data), then we do not need the union bound:
\begin{align}
P\left[\left|\frac{1}{N}\sum_{i=1}^NI_i^{(K)} - \phi_K\right|>\epsilon\right] &= \sum_{m=1}^M P\left[\left|\frac{1}{N}\sum_{i=1}^NI_i^{(K)} - \phi_K\right|>\epsilon | K=m\right]P[K=m] \\
&= \sum_{m=1}^M P\left[\left|\frac{1}{N}\sum_{i=1}^NI_i^{(m)} - \phi_m\right|>\epsilon | K=m \right]P[K=m] \\
&\overset{(a)}{=} \sum_{m=1}^M P\left[\left|\frac{1}{N}\sum_{i=1}^NI_i^{(m)} - \phi_m\right|>\epsilon \right]P[K=m] \\
&\leq \sum_{m=1}^M \left[2e^{-2\epsilon^2 N}\right]P[K=m]\\
&= 2e^{-2\epsilon^2 N}
\end{align}
where equality (a) holds because the random index $K$ is independent of the data $\{I^{(m)}_i\}_{i=1}^{\infty}$.
So, if we were to a-priori pick a guess function $g$, independent of the data, by just picking a random index, the bound would indeed be $2e^{-2\epsilon^2 N}$ rather than $2M e^{-2\epsilon^2 N}$.
However, if we observe the results of the data and then a-posteriori pick a random index $K \in \{1, ..., M\}$, the way we choose might lead us to pick an index that exhibits "atypical" a-posteriori sample paths. So the equality (a) in the above chain of equalities does not necessarily hold for such picks. We need to be more careful by using the union bound.
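This inflation from a-posteriori selection is also easy to see in simulation. In the sketch below (all parameters hypothetical), each of the $M$ hypotheses has the same true error $\phi = 0.5$ with mutually independent indicators, and we compare a fixed index against the a-posteriori index that minimizes the empirical error:

```python
import math
import random

random.seed(2)
# Hypothetical setup: M hypotheses whose indicators are independent
# Bernoulli(phi); we a-posteriori pick the one with the smallest
# empirical error and check how often its sample mean deviates by eps.
M, N, phi, eps = 50, 50, 0.5, 0.15
TRIALS = 5_000

fixed_dev = 0      # deviation events for a fixed index m = 0
selected_dev = 0   # deviation events for the a-posteriori argmin index

for _ in range(TRIALS):
    means = [sum(random.random() < phi for _ in range(N)) / N
             for _ in range(M)]
    if abs(means[0] - phi) > eps:
        fixed_dev += 1
    if abs(min(means) - phi) > eps:  # picking the smallest empirical error
        selected_dev += 1

print(fixed_dev / TRIALS, selected_dev / TRIALS)
```

The fixed-index frequency respects the single-hypothesis Hoeffding bound, while the a-posteriori pick deviates an order of magnitude more often; that excess is exactly what the union bound is there to control.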
Best Answer
Hoeffding's inequalities for absolute values are derived by first bounding the value itself and then doubling the bound to cover the absolute value. But the question here asks for a bound on the maximum of the absolute values, not on the absolute value of the maximum, so a direct examination of the absolute value is needed.
To compact the notation, we define $S_k=\Big | X_1+\cdots+ X_k \Big |$. Then we are examining
$$\mathbb P\Big( \max_{1\le k\le n} \Big | \frac{S_k}{\sqrt{k}} \Big | \ge t\Big) = \mathbb P\Big( \bigcup_k \Big\{ \Big|\frac{S_k}{\sqrt{k}} \Big | \ge t \Big\} \Big)$$
Denote by $I_Z$ the indicator function of the event $Z= \Big( \bigcup_k \Big\{ \Big|\frac{S_k}{\sqrt{k}} \Big |\ge t \Big\} \Big)$, and by $I_1,..., I_n$ the indicator functions of the events $\Big\{ \Big|\frac{S_1}{\sqrt{1}} \Big |\ge t \Big\},...,\Big\{ \Big|\frac{S_n}{\sqrt{n}} \Big |\ge t \Big\} $, respectively. Then by the properties of indicator functions $$I_Z=\max\Big \{I_1,...,I_n\Big \}$$ If $I_Z =0$, then all $I_i,\; i=1,...,n$, are zero. If $I_Z=1$, then at least one of these individual indicator functions equals unity; denote one such indicator by $I_m$, so that $$I_m =1 \Rightarrow \Big|\frac{S_m}{\sqrt{m}} \Big |\ge t $$ Let $v = \text{argmax}_k \Big | \frac{S_k}{\sqrt{k}} \Big |$ and let $\Big | \frac{S_v}{\sqrt{v}} \Big |$ be the corresponding variable. Then in the case where $I_Z =1$ we have
$$\Big | \frac{S_v}{\sqrt{v}} \Big |\ge \Big|\frac{S_m}{\sqrt{m}} \Big |\ge t$$
Then, in all cases, whether $I_Z=0$ or $I_Z=1$, we have, for any $h>0$,
$$I_Z \le \exp \left \{h\left (\Big | \frac{S_v}{\sqrt{v}} \Big |-t\right) \right \}$$
Therefore,
$$\mathbb P\Big( \max_{1\le k\le n} \Big | \frac{S_k}{\sqrt{k}} \Big | \ge t\Big) = \mathbb P\Big( \bigcup_k \Big\{ \Big|\frac{S_k}{\sqrt{k}} \Big | \ge t \Big\} \Big) = EI_Z \le E\exp \left \{h\left (\Big | \frac{S_v}{\sqrt{v}} \Big |-t\right) \right \}$$
$$\Rightarrow \mathbb P\Big( \max_{1\le k\le n} \Big | \frac{S_k}{\sqrt{k}} \Big | \ge t\Big) \le e^{-ht} E\exp \left \{h \Big | \frac{S_v}{\sqrt{v}} \Big | \right \} \qquad [1] $$
By Hoeffding's lemma, for a random variable $Y$, with $EY=0,\;a\le Y \le b$ we have, for any real $\lambda$ $$ E\left (e^{\lambda Y} \right) \le \exp \left(\frac{\lambda ^2 (b-a)^2}{8} \right) \qquad [2]$$ In our case, we have $$\Big | \frac{S_v}{\sqrt{v}} \Big | = \frac{1}{\sqrt{v}}\Big |X_1 +...+ X_v\Big | \le \frac{1}{\sqrt{v}}\Big (\Big |X_1 \Big | +...+ \Big | X_v\Big | \Big) \le \frac{1}{\sqrt{v}}v = \sqrt{v}$$
the last inequality following from the initial assumptions. Since $v\le n$ we have $$0\le \Big | \frac{S_v}{\sqrt{v}} \Big | \le \sqrt{n} \Rightarrow -E\Big(\Big | \frac{S_v}{\sqrt{v}} \Big |\Big) \le \Big | \frac{S_v}{\sqrt{v}} \Big | - E\Big(\Big | \frac{S_v}{\sqrt{v}} \Big |\Big) \le \sqrt{n} -E\Big(\Big | \frac{S_v}{\sqrt{v}} \Big |\Big) $$ We now have a bounded variable with zero expected value. The length of the interval is $b-a = \sqrt{n} -E\Big(\Big | \frac{S_v}{\sqrt{v}} \Big |\Big) + E\Big(\Big | \frac{S_v}{\sqrt{v}} \Big |\Big) = \sqrt{n} $, and $\lambda = h $. Inserting these values into Hoeffding's lemma and simplifying, we get
$$ E\exp \Big \{h\Big | \frac{S_v}{\sqrt{v}} \Big | - hE\Big(\Big | \frac{S_v}{\sqrt{v}} \Big |\Big)\Big \} \le \exp \left(\frac{nh^2}{8} \right) $$ $$\Rightarrow E\exp \Big \{h\Big | \frac{S_v}{\sqrt{v}} \Big | \Big \} \le \exp \left(\frac{nh^2}{8} \right)\exp\Big \{hE\Big(\Big | \frac{S_v}{\sqrt{v}} \Big |\Big)\Big \} \qquad [3]$$ Note that the exponential that moved to the right-hand side does not contain any random quantities, which is why the expectation has disappeared. From before we have $$ \Big | \frac{S_v}{\sqrt{v}} \Big | \le \sqrt{n} \Rightarrow E\Big | \frac{S_v}{\sqrt{v}} \Big | \le \sqrt{n} \Rightarrow \exp\Big \{hE\Big(\Big | \frac{S_v}{\sqrt{v}} \Big |\Big)\Big \} \le \exp\Big \{h\sqrt{n}\Big \} \le \exp\Big \{h^2n\Big \} \; [4] $$ (note that the last step uses $h\sqrt{n} \le h^2 n$, which requires $h \ge 1/\sqrt{n}$).
Inserting the RHS of $[4]$ into $[3]$ and back in $[1]$ we obtain
$$ \mathbb P\Big( \max_{1\le k\le n} \Big | \frac{S_k}{\sqrt{k}} \Big | \ge t\Big )= \mathbb P\Big( \Big | \frac{S_v}{\sqrt{v}} \Big | \ge t\Big) \le \exp \left(-ht +\frac{9nh^2}{8} \right) \qquad [5]$$ Minimizing the RHS over $h$ we obtain $h^* = \frac {4}{9n}t$, and inserting into $[5]$ we finally obtain
$$ \mathbb P\Big( \max_{1\le k\le n} \Big | \frac{S_k}{\sqrt{k}} \Big | \ge t\Big )=\mathbb P\Big( \Big | \frac{S_v}{\sqrt{v}} \Big | \ge t\Big) \le \exp\Big \{-\frac{2}{9n}t^2\Big \} \qquad [6] $$
...which is a bound that depends on the number $n$ of random variables involved.
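A quick numerical sanity check of $[6]$, under the assumption $|X_i| \le 1$, $EX_i = 0$ (taking $X_i \sim \text{Uniform}[-1,1]$; the values of $n$ and $t$ are arbitrary choices for illustration):

```python
import math
import random

random.seed(3)
# Sanity check of bound [6]: X_i ~ Uniform[-1, 1], so |X_i| <= 1, E[X_i] = 0.
n, t = 100, 3.0
TRIALS = 20_000

exceed = 0
for _ in range(TRIALS):
    s, biggest = 0.0, 0.0
    for k in range(1, n + 1):
        s += random.uniform(-1, 1)              # running sum X_1 + ... + X_k
        biggest = max(biggest, abs(s) / math.sqrt(k))
    if biggest >= t:                            # max_k |S_k| / sqrt(k) >= t
        exceed += 1

bound = math.exp(-2 * t**2 / (9 * n))
print(exceed / TRIALS, bound)  # empirical frequency vs the exp(-2t^2/(9n)) bound
```

At this scale the bound is very loose (it only becomes informative for $t$ on the order of $\sqrt{n}$), but the simulation confirms the empirical frequency sits below it.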