Weak LLN for U-Statistics

convergence-divergence, probability, probability-theory, real-analysis

Question

Let $(X_n)_{n\geq 1}$ be an i.i.d. sequence of random variables such that $EX_1=\mu$ and $\sigma^2=\text{Var}(X_1)<\infty$. Then
$$
U_n=\binom{n}{2}^{-1}\sum_{1\leq i<j\leq n} X_iX_j\to\mu^2
$$

in probability as $n\to \infty$.

My attempt

I was able to prove this claim in the case $\mu=0$. Indeed, in that case $EU_n=0$ and we can use Chebyshev's inequality to deduce that
$$
P(|U_n|>\varepsilon)\leq \frac{n(n-1)\sigma^4}{2\varepsilon^2\binom{n}{2}^2}\to 0
$$

as $n\to \infty$ for each $\varepsilon>0$.

Problem

I am having difficulty proving the claim when the random variables are not centered. I tried to apply Chebyshev's inequality in general, since
$$
EU_n=\frac{2}{n(n-1)}\frac{n(n-1)}{2}\mu^2=2\mu^2
$$

but I am unable to compute the variance of $U_n$ fully. I know that
$$
\begin{align}
\text{Var}(U_n)&=\binom{n}{2}^{-2}\text{Cov}\left(\sum_{1\leq i<j\leq n }X_i X_j, \sum_{1\leq i<j\leq n} X_i X_j\right)\\
&=\binom{n}{2}^{-2}\left(\binom{n}{2}(2\mu^{2}\sigma^{2}+\sigma^4)+\dotsb\right).
\end{align}
$$

The term shown in the sum collects the covariances of the form $\text{Cov}(X_iX_j, X_iX_j)$. All covariances of the form $\text{Cov}(X_iX_j, X_kX_l)$ with $i,j,k,l$ distinct contribute zero. I am having trouble counting and computing the covariances $\text{Cov}(X_iX_j, X_kX_l)$ in which the pairs $\{i,j\}$ and $\{k,l\}$ share exactly one index. Any help is appreciated.

Best Answer

For the expectation, it should be $$ EU_n=\frac{2}{n(n-1)}\frac{n(n-1)}{2}\mu^2=\mu^2. $$

We first compute

\begin{align} Cov(X_1X_2,X_1X_2) &=E(X_1^2X_2^2)-E(X_1X_2)^2\\ &=(\sigma^2+\mu^2)^2 - \mu^4\\ &= \sigma^4+2\sigma^2\mu^2 \end{align}

We also have

\begin{align} Cov(X_1X_2,X_2X_3) &=E(X_1X_2^2X_3)-E(X_1X_2)E(X_2X_3)\\ &=\mu^2(\sigma^2+\mu^2) - \mu^4\\ &= \sigma^2\mu^2 \end{align}

and, since $X_1X_2$ and $X_3X_4$ are independent,

\begin{align} Cov(X_1X_2,X_3X_4) &=E(X_1X_2X_3X_4)-E(X_1X_2)E(X_3X_4)\\ &=0 \end{align}

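Hence, we only need the terms whose index pairs overlap. Writing $\text{Var}(U_n)$ as a double sum over ordered pairs of index pairs $(\{i,j\},\{k,l\})$, there are $\binom{n}{2}$ terms of the first kind (identical pairs) and $2n \cdot \binom{n-1}2 = n(n-1)(n-2)$ terms of the second kind (exactly one shared index: choose the shared index in $n$ ways, then the two remaining distinct indices in order). By Chebyshev's inequality,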

$$Pr(|U_n-\mu^2|> \epsilon) \le \frac{\binom{n}2(\sigma^4+2\sigma^2\mu^2) + 2n \cdot \binom{n-1}2(\sigma^2\mu^2)}{\binom{n}{2}^2\epsilon^2}$$
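The two counts in the numerator above can be sanity-checked by brute force for small $n$. The following is a minimal Python sketch; the helper name `overlap_counts` is made up for this illustration and is not part of the answer.

```python
from itertools import combinations
from math import comb

def overlap_counts(n):
    """Count ordered pairs (p, q) of index pairs {i<j} by how many indices they share."""
    pairs = list(combinations(range(n), 2))
    counts = {0: 0, 1: 0, 2: 0}
    for p in pairs:
        for q in pairs:
            counts[len(set(p) & set(q))] += 1
    return counts

for n in (4, 5, 8):
    c = overlap_counts(n)
    # Identical pairs: binom(n, 2); exactly one shared index: 2n * binom(n-1, 2).
    assert c[2] == comb(n, 2)
    assert c[1] == 2 * n * comb(n - 1, 2)
    print(n, c)
```

The disjoint pairs (key `0`) are counted too, but they contribute nothing to the variance because their covariance is zero.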

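Notice that the denominator is of order $n^4$ while the numerator is of order $n^3$, so the bound is $O(1/n)$. Indeed, the right-hand side simplifies to $\frac{1}{\epsilon^2}\left(\frac{4\sigma^2\mu^2}{n}+\frac{2\sigma^4}{n(n-1)}\right)$, which tends to $0$ for every fixed $\epsilon>0$.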
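To see the $1/n$ decay numerically, one can evaluate the right-hand side of the Chebyshev bound for growing $n$. This is only an illustrative sketch: the function name `chebyshev_bound` and the parameter values $\mu=2$, $\sigma^2=1$, $\epsilon=0.5$ are arbitrary choices, not taken from the question.

```python
from math import comb

def chebyshev_bound(n, mu, sigma2, eps):
    """Right-hand side of the Chebyshev bound for P(|U_n - mu^2| > eps)."""
    pairs = comb(n, 2)
    # Terms with identical index pairs: Cov = sigma^4 + 2 sigma^2 mu^2.
    same_pair = pairs * (sigma2**2 + 2 * sigma2 * mu**2)
    # Terms sharing exactly one index: Cov = sigma^2 mu^2.
    one_shared = 2 * n * comb(n - 1, 2) * sigma2 * mu**2
    return (same_pair + one_shared) / (pairs**2 * eps**2)

for n in (10, 100, 1000, 10000):
    b = chebyshev_bound(n, mu=2.0, sigma2=1.0, eps=0.5)
    # n * b should level off if the bound really decays like 1/n.
    print(n, b, n * b)
```

For small $n$ the bound exceeds $1$ and carries no information; only its behavior as $n\to\infty$ matters for the argument.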

Hence $U_n$ converges in probability to $\mu^2$.
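The conclusion can also be illustrated by simulation. The sketch below assumes Gaussian samples with $\mu=2$, $\sigma=1$ (an arbitrary choice; any distribution with finite variance would do) and uses the identity $\sum_{i<j}x_ix_j=\frac12\left[\left(\sum_i x_i\right)^2-\sum_i x_i^2\right]$ to compute $U_n$ in linear time. The function names are hypothetical.

```python
import random
from itertools import combinations

def u_stat(xs):
    """U_n via sum_{i<j} x_i x_j = ((sum x)^2 - sum x^2) / 2."""
    n = len(xs)
    s = sum(xs)
    sq = sum(x * x for x in xs)
    return (s * s - sq) / (n * (n - 1))

def u_stat_bruteforce(xs):
    """Direct O(n^2) average of x_i * x_j over all pairs i < j."""
    prods = [a * b for a, b in combinations(xs, 2)]
    return sum(prods) / len(prods)

def simulate(n, mu=2.0, sigma=1.0, seed=0):
    """Draw n i.i.d. N(mu, sigma^2) samples and return U_n."""
    rng = random.Random(seed)
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    return u_stat(xs)

# U_n should settle near mu^2 = 4 as n grows.
for n in (10, 100, 1000, 10000, 100000):
    print(n, simulate(n))
```

A single run per $n$ only illustrates the trend; the proof above is what establishes convergence in probability.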
