Solved – Convergence of identically distributed normal random variables

convergence, normal distribution, random variable

I had this example in my machine learning lecture.

Let $X_2, X_3, \ldots$ be identically distributed (but not independent) copies of $X_1 \sim \mathcal N(0,1)$. Then $X_n$ converges to $Y = -X_1$ as $n \rightarrow \infty$.

There isn't any explanation for why this sequence of random variables converges to $Y$.

Convergence here is meant in the sense that $\lim_{n \rightarrow \infty} F_{X_n}(t) = F_Y(t)$ for every $t$, where $F_{X_n}$ is the CDF of $X_n$ and $F_Y$ that of $Y$.

Can anyone help me out?

Best Answer

The example in your lecture is making a reference to convergence in distribution. Below, I try to go through some of the details of what this means.

A general definition

A sequence of random variables $X_1,X_2,\ldots$ converges in distribution to a limiting random variable $X_\infty$ if their associated distribution functions $F_n(x) = \mathbb P(X_n \leq x)$ converge pointwise to $F_\infty(x) = \mathbb P(X_\infty \leq x)$ for every point $x$ at which $F_\infty$ is continuous.

Note that this statement actually says nothing about the random variables $X_n$ themselves or even the measure space that they live on. It is only making a statement about the behavior of their distribution functions $F_n$. In particular, no reference or appeal to any independence structure is made.
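
To make the pointwise check in the definition concrete, here is a minimal numerical sketch (assuming NumPy and SciPy are available; the helper name `check_pointwise_convergence` is mine). It evaluates a sequence of distribution functions on a grid and reports the largest gap to a candidate limit $F_\infty$, which is all the definition asks about; applied to the constant sequence from the example, where every $F_n$ equals the standard normal CDF $\Phi$, the gap is of course zero.

```python
import numpy as np
from scipy.stats import norm

def check_pointwise_convergence(cdfs, limit_cdf, grid):
    """Largest gap |F_n(x) - F_inf(x)| over the grid, for each n.

    Only the distribution functions enter, mirroring the definition:
    no independence or joint structure is involved anywhere.
    """
    return [np.max(np.abs(F(grid) - limit_cdf(grid))) for F in cdfs]

# The example's sequence: every X_n ~ N(0, 1), so every F_n is Phi.
grid = np.linspace(-4, 4, 201)      # Phi is continuous everywhere, so any grid will do
cdfs = [norm.cdf] * 50              # F_1, F_2, ..., F_50, all equal to Phi
gaps = check_pointwise_convergence(cdfs, norm.cdf, grid)
print(gaps[-1])                     # 0.0 -- the constant sequence trivially converges
```

Swapping in a genuinely varying sequence of CDFs (for instance, from the partial-sum construction further down) would show the gaps shrinking with $n$.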

The case at hand

In this particular case, the problem statement itself assumes that each of the $X_n$ has the same distribution function $F_n = F$. This is analogous to a constant sequence of numbers $(y_n)$. Certainly if $y_n = y$ for all $n$ then $y_n \to y$. In fact, we can "map" our convergence-in-distribution problem down to such a situation in the following way.

If we fix an $x$ and consider the sequence of numbers $y_n = F_n(x) = F(x)$, we see that $y_1,y_2,\ldots$ is a constant sequence and so, obviously, converges (to $F(x)$, of course). This holds for any $x$ we choose, and so the functions $F_n$ converge pointwise to $F$ (here, at every $x$, not just at the continuity points of the limit).

To finish things off, we note that $F(x) = \mathbb P(X_1 \leq x) = \mathbb P(-X_1 \leq x)$ by the symmetry of the normal distribution, so $F$ is also the distribution of $-X_1$. Hence $X_n \to -X_1$ in distribution.
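
Written out, the symmetry step is the chain of equalities

$$F_{-X_1}(x) = \mathbb P(-X_1 \leq x) = \mathbb P(X_1 \geq -x) = 1 - \Phi(-x) = \Phi(x) = F(x),$$

where $\Phi$ is the standard normal CDF; the third equality uses the continuity of $\Phi$, and the last uses the symmetry $\Phi(-x) = 1 - \Phi(x)$.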

Some equivalent and related statements for this example

To perhaps clarify the meaning of this notion further, consider the following (true!) statements about convergence in distribution, all of which use the same sequence you've defined.

  • $X_1, X_2,\ldots$ converges in distribution to $X_1$.
  • Fix any $k$. $X_1, X_2,\ldots$ converges in distribution to $X_k$.
  • Fix any $k$. $X_1, X_2,\ldots$ converges in distribution to $-X_k$.
  • Define $Y_n = (-1)^n X_n$. Then, $Y_n \to X_1$ in distribution.
  • Slightly trickier. Let $\epsilon_n$ be random variables such that $\epsilon_n$ is independent of $X_n$ (but not necessarily of the other $\epsilon_k$ or $X_k$) and takes the values $+1$ and $-1$ with probability $1/2$ each. Define $Z_n = \epsilon_n X_n$. Then the sequence $Z_1, Z_2,\ldots$ converges in distribution to $X_1$. This sequence also converges in distribution to $-X_1$ and to $\pm X_k$ for any fixed $k$. (A small simulation of this construction is sketched below.)
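
To make that last bullet concrete, here is a minimal Monte Carlo sketch (assuming NumPy and SciPy are available; the specific choice $X_n = X_1$ for every $n$ is just one admissible way of producing identically distributed, fully dependent copies and is not dictated by the example). It draws many replications of $Z_n = \epsilon_n X_n$ and uses a Kolmogorov–Smirnov test to check that each $Z_n$ is consistent with $\mathcal N(0,1)$, the common distribution of $X_1$ and $-X_1$.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(0)
reps = 100_000                       # Monte Carlo replications of the whole sequence

# One admissible dependent choice: X_n = X_1 for every n (perfectly dependent copies).
x1 = rng.standard_normal(reps)

for n in (1, 2, 5, 50):
    eps_n = rng.choice([-1.0, 1.0], size=reps)   # Rademacher sign, independent of X_n
    z_n = eps_n * x1                             # Z_n = eps_n * X_n
    # If Z_n ~ N(0, 1), the KS statistic stays small.
    stat, pval = kstest(z_n, "norm")
    print(f"n={n:3d}  KS statistic={stat:.4f}  p-value={pval:.3f}")
```

Because every $Z_n$ has the same $\mathcal N(0,1)$ law, the sequence converges in distribution to anything with that law, which is exactly what the bullet asserts.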

Explicit examples incorporating dependence

The easiest way to construct examples in which the $X_i$ are dependent is to use functions of a latent sequence of iid standard normals. The central limit theorem provides a canonical example. Let $Z_1,Z_2,\ldots$ be an iid sequence of standard normal random variables and take $$X_n = n^{-1/2} \sum_{i=1}^n Z_i \>.$$ Then each $X_n$ is standard normal, so $X_n \to -X_1$ in distribution, but the sequence is obviously dependent.
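
A quick simulation sketch of this construction (assuming NumPy; the variable names are mine) checks both halves of the claim: each normalized partial sum has a standard normal marginal, while, for example, $X_{100}$ and $X_{200}$ have correlation $\sqrt{100/200} \approx 0.71$, so the sequence is clearly dependent.

```python
import numpy as np

rng = np.random.default_rng(1)
reps, n_max = 50_000, 200

z = rng.standard_normal((reps, n_max))        # latent iid N(0, 1) sequence Z_1, ..., Z_{n_max}
n = np.arange(1, n_max + 1)
x = np.cumsum(z, axis=1) / np.sqrt(n)         # X_n = (Z_1 + ... + Z_n) / sqrt(n)

# Sample std of X_10 and X_200, both ~= 1 (each is exactly N(0, 1) by construction).
print(x[:, 9].std(), x[:, 199].std())
# Correlation of X_100 and X_200, ~= sqrt(100/200) ~= 0.71: clearly dependent.
print(np.corrcoef(x[:, 99], x[:, 199])[0, 1])
```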

Xi'an provided another nice (related) example in a comment (now deleted) to this answer. Let $X_n = (1-2\,\mathbb I_{(Z_1+\cdots+Z_{n-1}\geq 0)}) Z_n$, where $\mathbb I_{(\cdot)}$ denotes the indicator function. The necessary conditions are, again, satisfied.
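
Here is a short sketch of that construction in the same spirit (again assuming NumPy and SciPy; taking the empty sum to be $0$ for $n=1$ is a convention I have added, under which $X_1 = -Z_1$, still standard normal). The sign factor depends only on $Z_1,\ldots,Z_{n-1}$, which are independent of $Z_n$, so each $X_n$ keeps the $\mathcal N(0,1)$ marginal while depending on the entire past.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(2)
reps, n_max = 100_000, 50

z = rng.standard_normal((reps, n_max))       # latent iid N(0, 1) sequence
prev_sums = np.cumsum(z, axis=1) - z         # S_{n-1} = Z_1 + ... + Z_{n-1} (0 for n = 1)
signs = 1.0 - 2.0 * (prev_sums >= 0.0)       # 1 - 2 * I(S_{n-1} >= 0)
x = signs * z                                # X_n = (1 - 2 * I(S_{n-1} >= 0)) * Z_n

print(kstest(x[:, 49], "norm"))              # X_50 is still consistent with N(0, 1)
```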

Other such sequences can be constructed in a similar way.

An aside on the relationship to other modes of convergence

There are three other standard notions of convergence of random variables: almost-sure convergence, convergence in probability and $L_p$ convergence. Each of these is (a) "stronger" than convergence in distribution, in the sense that convergence in any of the three implies convergence in distribution, and (b) in contrast to convergence in distribution, requires that the random variables at least be defined on a common measure space.

To achieve almost-sure convergence, convergence in probability or $L_p$ convergence, we often have to assume some additional structure on the sequence. However, something slightly peculiar happens in the case of a sequence of normally distributed random variables.

An interesting property of sequences of normal random variables

Lemma: Let $X_1,X_2,\ldots$ be a sequence of zero-mean normal random variables, defined on the same space, with variances $\sigma_n^2$. Then $X_n \xrightarrow{\,p\,} X_\infty$ (i.e., in probability) if and only if $X_n \xrightarrow{\,L_2\,} X_\infty$, in which case $X_\infty \sim \mathcal N(0,\sigma^2)$ with $\sigma^2 = \lim_{n\to\infty} \sigma_n^2$.

The point of the lemma is three-fold. First, in the case of a sequence of normal random variables, convergence in probability and in $L_2$ are equivalent, which is not usually the case. Second, no (in)dependence structure is assumed in order to guarantee this convergence. And, third, the limit is guaranteed to be normally distributed (which is not otherwise obvious!) regardless of the relationship between the variables in the sequence. (This is now discussed in a little more detail in this follow-up question.)
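
To see the lemma in one concrete (and admittedly easy) instance, the toy sequence below (a construction of my own, not from the original answer) places everything on a common space: $X_n = X_\infty + \xi_n$ with $X_\infty \sim \mathcal N(0,1)$ and independent $\xi_n \sim \mathcal N(0, 1/n)$. Then $\mathbb E[(X_n - X_\infty)^2] = 1/n \to 0$, the variances $\sigma_n^2 = 1 + 1/n$ converge to $\sigma^2 = 1$, and the limit is normal, as the lemma predicts; this illustrates the statement but is in no way a proof.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(3)
reps = 100_000

x_inf = rng.standard_normal(reps)                  # X_inf ~ N(0, 1)

for n in (1, 10, 100, 1000):
    xi_n = rng.standard_normal(reps) / np.sqrt(n)  # xi_n ~ N(0, 1/n), independent of X_inf
    x_n = x_inf + xi_n                             # X_n ~ N(0, 1 + 1/n), on the same space as X_inf
    l2_gap = np.mean((x_n - x_inf) ** 2)           # estimates E[(X_n - X_inf)^2] = 1/n
    print(f"n={n:5d}  L2 gap={l2_gap:.4f}  var(X_n)={x_n.var():.4f}")

print(kstest(x_inf, "norm"))                       # the L2 (and in-probability) limit is normal
```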