Let $X_1, X_2, \dots, X_n \sim \operatorname{Bernoulli}(p)$ and $\bar{X}=\frac{1}{n}\sum\limits_{i=1}^nX_i$. Find an upper bound for $\mathbb{P}(|\bar{X}-p| > \epsilon) \forall \epsilon > 0$. Note that random variables are not independent.
$\textbf{What I've tried:}$
We know that $\mathbb{E}[\bar{X}] = \mathbb{E}[X_1] = p$. So we can use Chebyshev's inequality because $\mathbb{P}(|\bar{X}-p| > \epsilon) = \mathbb{P}(|\bar{X}-\mathbb{E}[\bar{X}]| > \epsilon) \le \mathbb{P}(|\bar{X}-\mathbb{E}[\bar{X}]| \ge \epsilon) \stackrel{Chebyshev}{\le} \frac{\operatorname{Var}(\bar{X})}{\epsilon^2} \le \frac{\operatorname{Var}(\sum\limits_{i=1}^n X_i)}{n^2\epsilon^2} \le \frac{n \sum\limits_{i=1}^n \operatorname{Var}(X_i)}{n^2\epsilon^2} \le \frac{n^2 \operatorname{Var}(X_1)}{n^2\epsilon^2} = \frac{p(1-p)}{\epsilon^2}$
So, $\frac{p(1-p)}{\epsilon^2}$ is an upper bound.
Is my proof true? Thanks!
Best Answer
Your calculated upper bound is an upper bound and your argument looks reasonable, but it is not the tightest possible.
If the $X_i$ can have any dependency, with large enough $n$ you can construct $\bar X$ to approximate any distribution on $[0,1]$ which has expectation $p$. There are three ranges of values of $\epsilon$ to consider:
$0< \epsilon < \min(p, 1-p)$: you can concentrate the distribution of $\bar X$ outside the interval $(p - \epsilon, p+\epsilon)$ so you can only say (as with any probability) $$\mathbb{P}(|\bar{X}-p| > \epsilon) \le 1$$
$\max(p, 1-p) < \epsilon$: in which case $p - \epsilon < 0$ and $p + \epsilon > 1$ and so you can say $$\mathbb{P}(|\bar{X}-p| > \epsilon) =0$$
$\min(p, 1-p) < \epsilon < \max(p, 1-p)$: you are only interested in one tail and if $0 < p< \frac12$ then $\mathbb{P}(|\bar{X}-p| > \epsilon) = \mathbb{P}(\bar{X}>p +\epsilon) \le \frac{\mathbb E[\bar X]}{p+\epsilon}= \frac{p}{p+\epsilon}$ using Markov's inequality, and similarly by symmetry if $\frac12 < p < 1$ then $\mathbb{P}(|\bar{X}-p| > \epsilon) = \mathbb{P}(\bar{X}< 1-p -\epsilon) \le \frac{1-p}{1-p+\epsilon}$, which you can combine to say $$\mathbb{P}(|\bar{X}-p| > \epsilon) \le \frac{1}{1+\frac{\epsilon}{\min(p,1-p)}}$$
As an illustration, here are the two upper bounds shown for $p=\frac13$, with yours in red and mine in blue.
Jessie asked for the code to draw this. Using R: