Concentration result from Delta method

Tags: asymptotics, concentration-of-measure, inequality, probability-distributions, probability-theory

Let $X_1, X_2, \dots$ be i.i.d. random variables with finite second moment.
Let $\bar{X}_n = (1/n)\sum_{i=1}^{n} X_i$ and $\overline{X^2}_n = (1/n)\sum_{i=1}^{n} X_i^2$.

Define $S_n = \overline{X^2}_n - (\bar{X}_n)^2$. From the statistics literature (an application of the WLLN and the continuous mapping theorem), we know that $$ S_n \stackrel{P}{\rightarrow} E X_1^2 - (E X_1)^2 = Var(X_1). $$
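For a quick numerical sanity check of this convergence, one can simulate it; the sketch below is an illustration only, using Exponential(1) samples as an arbitrary choice (so $Var(X_1) = 1$):

```python
# Simulation sketch: S_n = mean(X_i^2) - (mean(X_i))^2 should approach
# Var(X_1) as n grows; Exponential(1) is an arbitrary choice with Var = 1.
import numpy as np

rng = np.random.default_rng(0)
for n in (10**2, 10**4, 10**6):
    x = rng.exponential(1.0, size=n)
    s_n = np.mean(x**2) - np.mean(x) ** 2
    print(n, s_n)  # drifts toward Var(X_1) = 1
```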

Question: Can we get a concentration bound? That is, something like, for some constant $c>0$, $$ Pr(|S_n - Var(X_1)| \geq \epsilon) \leq e^{-c \epsilon^2} $$ or $$ Pr\left(|S_n - Var(X_1)| \geq \frac{\sqrt{\log(n)}}{\sqrt{c}} \right) \leq \frac{1}{n}? $$

From this question thread, I am confident that the asymptotic rate is in fact $\sqrt{n}$, via the delta method. But I am not sure how to proceed to get a concentration bound.

I'll be really grateful for any kind of help to address this!

Best Answer

You can get a bound from Markov's inequality, but that's about it. In particular, there is no hope of getting such a strong bound (which is called sub-Gaussian concentration) without much stronger assumptions than just two moments. To see why, suppose that $X$ is a random variable for which $$\mathbb{P}(|X| > t) \leq e^{-ct^2}$$ holds for all $t>0$. Then, using the usual trick for computing moment estimates from tail estimates, \begin{align*} \mathbb{E}[e^{\frac{c}{2}X^2}] &= 1 + \int_0^\infty c t e^{\frac{c}{2}t^2}\mathbb{P}(|X|>t)\,dt \\ &\leq 1+ \int_0^\infty c t e^{\frac{c}{2}t^2}e^{-ct^2}\, dt = 1+ \int_0^\infty c te^{-\frac{c}{2}t^2}\, dt <\infty. \end{align*} So a sub-Gaussian tail bound for $S_n - Var(X_1)$ would force $\mathbb{E}[e^{\frac{c}{2}(S_n - Var(X_1))^2}] < \infty$.
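As a sanity check on this tail-to-moment identity, here is a numerical sketch (an illustration, not part of the argument) with $X \sim N(0,1)$, for which $\mathbb{E}[e^{\frac{c}{2}X^2}] = (1-c)^{-1/2}$ when $c < 1$; it assumes scipy is available:

```python
# Check E[exp(c X^2 / 2)] = 1 + integral of c t exp(c t^2 / 2) P(|X| > t) dt
# for X ~ N(0,1); the closed form of the left side is (1 - c)^(-1/2) for c < 1.
import numpy as np
from scipy import integrate, stats

c = 0.5
integrand = lambda t: c * t * np.exp(0.5 * c * t**2) * 2 * stats.norm.sf(t)  # P(|X|>t) = 2*sf(t)
tail_side, _ = integrate.quad(integrand, 0, np.inf)
print(1 + tail_side, (1 - c) ** -0.5)  # both approximately 1.41421
```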

Now, let $c>0$ be any number and suppose that the $X_i$ are i.i.d. Exponential(1) random variables (so that $Var(X_1) = 1$). Note that these random variables are much nicer than typical random variables with two finite moments: they satisfy $\mathbb{E}[e^{\lambda X_1}] < \infty$ for all $\lambda < 1$. Even for these nice random variables, the exponential moment is already infinite when $n=2$: \begin{align*} \mathbb{E}[e^{\frac{c}{2} (S_2 - 1)^2}] &= \int_{0}^\infty \int_0^\infty e^{\frac{c}{2}\left[\frac{1}{2}(x^2+y^2) - \left(\frac{1}{2}(x+y)\right)^2 -1 \right]^2} e^{-(x+y)}\,dx\,dy = \infty, \end{align*} since the exponent grows like $\frac{c}{32}(x-y)^4 - (x+y)$. The same computation works for all $n \geq 2$.
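For intuition, here is a rough Monte Carlo sketch of this (an illustration only, not a proof): running estimates of the exponential moment keep jumping as rare large draws arrive, consistent with the moment being infinite.

```python
# Monte Carlo illustration: for i.i.d. Exponential(1) pairs, running
# estimates of E[exp(c (S_2 - 1)^2 / 2)] never stabilize, because the
# integrand has tails far too heavy for a law of large numbers.
import numpy as np

rng = np.random.default_rng(1)
c = 0.1
x = rng.exponential(1.0, size=(10**6, 2))
s2 = np.mean(x**2, axis=1) - np.mean(x, axis=1) ** 2  # S_2 for each pair
vals = np.exp(0.5 * c * (s2 - 1.0) ** 2)              # Var(X_1) = 1 here
for m in (10**3, 10**4, 10**5, 10**6):
    print(m, vals[:m].mean())  # running means jump wildly, no convergence
```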

In general, concentration bounds are basically the same thing as moment bounds. You can't hope for much better concentration than you put in with your hypotheses. So what does the hypothesis of having two finite moments buy you in this case? By Markov's inequality, there is a bound of the form

$$ \mathbb{P}(|S_n - Var(X_1)|>t) \leq \frac{\mathbb{E}[|S_n - Var(X_1)|]}{t}. $$
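This bound is easy to check empirically; the sketch below (an illustration only) compares the tail of $|S_n - Var(X_1)|$ with a plug-in estimate of the Markov bound, for Exponential(1) samples and $n = 100$:

```python
# Empirical tail of |S_n - Var(X_1)| versus the Markov bound
# E|S_n - Var(X_1)| / t, using Exponential(1) samples (Var(X_1) = 1).
import numpy as np

rng = np.random.default_rng(2)
n, reps = 100, 10**4
x = rng.exponential(1.0, size=(reps, n))
s_n = np.mean(x**2, axis=1) - np.mean(x, axis=1) ** 2
dev = np.abs(s_n - 1.0)
markov_numerator = dev.mean()  # plug-in estimate of E|S_n - Var(X_1)|
for t in (0.25, 0.5, 1.0):
    print(t, (dev > t).mean(), markov_numerator / t)  # tail <= Markov bound
```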

I'll sketch how to show that you cannot significantly improve on this. The same argument as above shows that if there exist constants $c,C,\alpha>0$ such that $$ \mathbb{P}(|X|>t) \leq C t^{-\alpha} \quad \text{for all } t>c, $$ then $\mathbb{E}[|X|^\beta] <\infty$ whenever $\beta < \alpha$. So now consider, for $\epsilon>0$, i.i.d. random variables with density \begin{align*} f_{X_i}(t) = \begin{cases} (2+\epsilon)\,t^{-(3+\epsilon)} & t>1 \\ 0 & t \leq 1, \end{cases} \end{align*} which have finite moments of all orders $\beta < 2+\epsilon$ (in particular, two finite moments). Call $\sigma^2 = Var(X_1)$. We can compute that \begin{align*} \mathbb{E}[|S_2-\sigma^2|^{1+\epsilon}] &= \int_1^\infty \int_1^\infty \left|\frac{1}{2}(x^2+y^2) - \left(\frac{x}{2}+\frac{y}{2}\right)^2-\sigma^2\right|^{1+\epsilon}\frac{(2+\epsilon)^2}{x^{3+\epsilon}y^{3+\epsilon}} \,dx\,dy \\ &= \int_1^\infty \int_1^\infty \left|\frac{(x-y)^2}{4} -\sigma^2\right|^{1+\epsilon}\frac{(2+\epsilon)^2}{x^{3+\epsilon}y^{3+\epsilon}} \,dx\,dy =\infty. \end{align*} To see that this is infinite, note that the inner integral (in $x$, say) diverges by limit comparison with $$ \frac{x^{2(1+\epsilon)}}{x^{3+\epsilon}} = x^{-(1-\epsilon)}. $$ For $\alpha>1$, this rules out any bound of the form \begin{align*} \mathbb{P}(|S_n-Var(X_1)| > t) \leq \frac{C}{t^{\alpha}} \end{align*} holding for all sufficiently large $t$ under these hypotheses: such a bound would make $\mathbb{E}[|S_2-\sigma^2|^{1+\epsilon}]$ finite whenever $1+\epsilon < \alpha$. More complicated examples can be constructed to rule out weaker bounds as well.
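To get a feel for this divergence numerically, here is a simulation sketch (an illustration only; I take $\epsilon = 1$, a member of this family for which the divergence is fast enough to see at modest sample sizes, while smaller $\epsilon$ diverges much more slowly):

```python
# Draw X_i from the Pareto-type density f(t) = (2 + eps) t^(-(3 + eps)),
# t > 1, by inverse-CDF sampling; the running mean of |S_2 - sigma^2|^(1+eps)
# keeps growing, consistent with the (1 + eps)-moment being infinite.
import numpy as np

rng = np.random.default_rng(3)
eps = 1.0  # eps = 1 makes the divergence visible at these sample sizes
u = rng.random(size=(10**6, 2))
x = (1.0 - u) ** (-1.0 / (2.0 + eps))  # inverse of F(t) = 1 - t^(-(2 + eps))
m1 = (2 + eps) / (1 + eps)             # E[X]
sigma2 = (2 + eps) / eps - m1**2       # Var(X), using E[X^2] = (2 + eps)/eps
s2 = np.mean(x**2, axis=1) - np.mean(x, axis=1) ** 2
vals = np.abs(s2 - sigma2) ** (1 + eps)
for m in (10**3, 10**4, 10**5, 10**6):
    print(m, vals[:m].mean())  # running means keep growing, no convergence
```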

Let me close by commenting on a common point of confusion. Convergence in distribution tells you essentially nothing about the regularity of the pre-limit objects. For example, let $X$ and $Y$ be independent $N(0,1)$ and standard Cauchy random variables respectively, and set $Z_n = (1-e^{-n})X + e^{-n} Y$. Then $Z_n$ converges exponentially fast to a $N(0,1)$ random variable surely (never mind in distribution), but no $Z_n$ has a finite mean, because of the Cauchy component.
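A simulation sketch makes the missing mean concrete (an illustration only): the sample mean of $m$ i.i.d. copies of $Z_2$ is distributed as $(1-e^{-2})\,N(0,1/m) + e^{-2}\,\mathrm{Cauchy}(0,1)$, since the sample mean of i.i.d. standard Cauchy variables is again standard Cauchy, so its fluctuations never shrink below order $e^{-2}$ no matter how large $m$ is.

```python
# The running sample mean of i.i.d. copies of Z_2 = (1 - e^{-2}) X + e^{-2} Y
# never settles: the Cauchy part contributes a Cauchy(0, e^{-2}) term at
# every sample size, so there is no law of large numbers.
import numpy as np

rng = np.random.default_rng(4)
w = np.exp(-2)
z = (1 - w) * rng.standard_normal(10**6) + w * rng.standard_cauchy(10**6)
running = np.cumsum(z) / np.arange(1, 10**6 + 1)
for m in (10**3, 10**4, 10**5, 10**6):
    print(m, running[m - 1])  # fluctuations of order e^{-2} persist
```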
