A wrong law of large numbers for dependent variables

law-of-large-numbers, probability theory, random variables

Suppose we are given $Y, X_1, X_2,\ldots$ i.i.d. standard normal random variables and define
$$Z_i = \sqrt{\rho}Y + \sqrt{1-\rho}X_i$$
for some given $\rho\in[0, 1)$. The random variables $Z_i$ are not independent if $\rho > 0$. Fix some threshold $T\in\mathbb{R}$ and let $L_i$ take value $1$ if $Z_i < T$ and $0$ otherwise.
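
For concreteness, here is a small NumPy sketch of this one-factor construction (the parameter values, seed, and variable names are mine, chosen only for illustration). It checks numerically that each $Z_i$ is standard normal while the $Z_i$ are correlated through the shared factor $Y$:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
rho, T, m = 0.5, 1.0, 500_000        # illustrative values, not from the question

# m independent replications of (Y, X_1, X_2), to inspect the joint law of (Z_1, Z_2)
Y = rng.standard_normal(m)
X1 = rng.standard_normal(m)
X2 = rng.standard_normal(m)
Z1 = np.sqrt(rho) * Y + np.sqrt(1 - rho) * X1
Z2 = np.sqrt(rho) * Y + np.sqrt(1 - rho) * X2
L1, L2 = (Z1 < T).astype(int), (Z2 < T).astype(int)

print(Z1.std(), Z2.std())            # ~1: each Z_i is standard normal
print(np.corrcoef(Z1, Z2)[0, 1])     # ~rho: the Z_i share the factor Y, hence are dependent
print(L1.mean(), L2.mean())          # ~Phi(T): each L_i is Bernoulli(Phi(T))
```

Indeed each $Z_i$ has mean $0$ and variance $\rho + (1-\rho) = 1$, while $\mathrm{Cov}(Z_i, Z_j) = \rho$ for $i\ne j$, so for $\rho>0$ the $L_i$ are identically distributed but not independent.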

Below I will give a proof of the fact that a.s.
$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^nL_i = \Phi(T)$$
where $\Phi$ is the CDF of the standard normal distribution. However, I know that this result cannot be true: for example, if $\rho$ is very close to $1$ (full correlation) then intuitively we expect all $L_i$ to take value $1$ with probability $\Phi(T)$ and all of them to be $0$ with probability $1-\Phi(T)$.

Question: What am I doing wrong in my proof? What step or statement doesn't hold?

Proof

I will follow quite closely the easy proof in Section 7.2 of Probability with Martingales by D. Williams, since the $L_i$ have finite moments. Write $p=\Phi(T)$; since each $Z_i$ is standard normal (its mean is $0$ and its variance is $\rho+(1-\rho)=1$), $L_i$ is Bernoulli with parameter $p$, so $E[L_i] = p$.

Now we look at the variables $Z_i$ conditioned on $Y$:
\begin{align}
P[Z_i\le z\mid Y] ={}&P[\sqrt{\rho}Y + \sqrt{1-\rho}X_i\le z\mid Y]\\
={}&P\left[X_i\le\frac{z-\sqrt{\rho}Y}{\sqrt{1-\rho}}\mid Y\right]\\
={}&\Phi\left(\frac{z-\sqrt{\rho}Y}{\sqrt{1-\rho}}\right).
\end{align}

In particular, the variables $Z_i$ conditioned on $Y$ are i.i.d. (since the $X_i$ are). It follows that, conditioned on $Y$, the variables $L_i$ are also i.i.d. Bernoulli, with
$$P[L_i=1\mid Y] = \Phi\left(\frac{T-\sqrt{\rho}Y}{\sqrt{1-\rho}}\right).$$
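
A quick Monte Carlo sanity check of this conditional law (a sketch; the values of $\rho$ and $y$, the seed, and the grid of thresholds $z$ are arbitrary choices of mine):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=1)
rho, y, n = 0.5, -0.7, 1_000_000     # illustrative values; y stands for a fixed realisation of Y

X = rng.standard_normal(n)
Z = np.sqrt(rho) * y + np.sqrt(1 - rho) * X   # the Z_i given Y = y
for z in (-1.0, 0.0, 1.0):
    empirical = (Z <= z).mean()
    predicted = norm.cdf((z - np.sqrt(rho) * y) / np.sqrt(1 - rho))
    print(f"z = {z:+.1f}: empirical {empirical:.4f}, predicted {predicted:.4f}")
```

The empirical conditional frequencies agree with $\Phi\bigl((z-\sqrt{\rho}\,y)/\sqrt{1-\rho}\bigr)$ up to Monte Carlo error.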
Now write $A_i = L_i - p$ and $T_n = \sum_{i=1}^nA_i$. Then
\begin{align}
E[T_n^4] ={}&E[E[T_n^4\mid Y]]\\
={}&E[nE[A_1^4\mid Y] + 3n(n-1)E[A_1^2A_2^2\mid Y]]\\
={}&nE[A_1^4] + 3n(n-1)E[A_1^2A_2^2]\\
\le{}&Kn^2
\end{align}

where the first line is the law of total expectation; in the second line we used the fact that, conditioned on $Y$, the $A_i$ are independent and proceeded as in the book; the third line is the law of total expectation again; and the last line holds for some constant $K$ since all moments are finite.

Now we can just follow the rest of the proof. We have that
$$E\left[\sum_{n=1}^\infty\left(\frac{T_n}{n}\right)^4\right]\le\sum_{n=1}^\infty K\frac{1}{n^2}<\infty$$
which implies
$$\sum_{n=1}^\infty\left(\frac{T_n}{n}\right)^4<\infty$$
a.s., which in turn implies that $T_n/n\to 0$ a.s. as $n\to\infty$, i.e. $\frac{1}{n}\sum_{i=1}^n L_i\to p=\Phi(T)$ a.s., concluding the proof.

Best Answer

(Migrated from a comment.)

For the computation of $E[T_n^4\mid Y]$ to make sense, the $A_i$ must have zero conditional mean given $Y$, which forces the ($Y$-dependent) centering

$$p=\Phi\left(\frac{T-\sqrt{\rho}Y}{\sqrt{1-\rho}}\right).$$
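
To spell out the step that fails (an elaboration of the remark above): expanding $T_n^4=(A_1+\cdots+A_n)^4$ and taking conditional expectations, conditional independence gives
$$E[T_n^4\mid Y]=nE[A_1^4\mid Y]+3n(n-1)E[A_1^2A_2^2\mid Y]+\text{mixed terms that factor as } E[A_i^3\mid Y]\,E[A_j\mid Y],\ E[A_i^2\mid Y]\,E[A_j\mid Y]\,E[A_k\mid Y],\ldots$$
with distinct indices. In Williams' unconditional argument the analogous terms are zero because $E[A_j]=0$; conditionally on $Y$ one needs $E[A_j\mid Y]=0$ for them to drop out. With the question's centering $A_i=L_i-\Phi(T)$ we have
$$E[A_i\mid Y]=\Phi\left(\frac{T-\sqrt{\rho}Y}{\sqrt{1-\rho}}\right)-\Phi(T),$$
which is almost surely nonzero when $\rho>0$, so the second line of the moment computation in the question is not justified. Centering with the conditional probability above removes these terms.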

Using this choice, all the other computations now make sense, proving

$$ \lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}_{\{Z_i \leq T\}} = P(Z_1 \leq T \mid Y) = \Phi\left(\frac{T-\sqrt{\rho}Y}{\sqrt{1-\rho}}\right). $$

(Although overkill, this may also be viewed as a consequence of the Birkhoff-Khinchin theorem.)
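
As a numerical illustration of this corrected limit, here is a simulation sketch (the values of $\rho$, $T$, $n$ and the seed are mine, chosen for illustration, with $\rho$ close to $1$):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=2)
rho, T, n = 0.9, 1.0, 1_000_000      # illustrative values, with rho close to 1

for _ in range(3):                   # three independent realisations of Y
    Y = rng.standard_normal()
    X = rng.standard_normal(n)
    Z = np.sqrt(rho) * Y + np.sqrt(1 - rho) * X
    avg = (Z <= T).mean()                                    # (1/n) * sum_i L_i
    limit = norm.cdf((T - np.sqrt(rho) * Y) / np.sqrt(1 - rho))
    print(f"empirical {avg:.4f}   conditional limit {limit:.4f}   Phi(T) {norm.cdf(T):.4f}")
```

The empirical frequency tracks the $Y$-dependent limit $\Phi\bigl((T-\sqrt{\rho}Y)/\sqrt{1-\rho}\bigr)$, which varies from realisation to realisation and in general differs from $\Phi(T)$; only its expectation over $Y$ equals $\Phi(T)$, by the tower property.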
