The example in your lecture is making a reference to *convergence in distribution*. Below, I try to go through some of the details of what this means.

**A general definition**

A sequence of random variables $X_1,X_2,\ldots$ **converges in distribution** to a limiting random variable $X_\infty$ if their associated distribution functions $F_n(x) = \mathbb P(X_n \leq x)$ converge *pointwise* to $F_\infty(x) = \mathbb P(X_\infty \leq x)$ for every point $x$ at which $F_\infty$ is continuous.

Note that this statement actually says *nothing* about the random variables $X_n$ themselves or even the measure space that they live on. It is *only* making a statement about the behavior of their distribution functions $F_n$. In particular, no reference or appeal to any independence structure is made.

**The case at hand**

In this particular case, the problem statement itself assumes that each of the $X_n$ has the same distribution function $F_n = F$. This is analogous to a constant sequence of numbers $(y_n)$. Certainly if $y_n = y$ for all $n$ then $y_n \to y$. In fact, we can "map" our convergence-in-distribution problem down to such a situation in the following way.

If we fix an $x$ and consider the sequence of *numbers* $y_n = F_n(x) = F(x)$, we see that $y_1,y_2,\ldots$ is a *constant* sequence and so, obviously, converges (to $F(x)$, of course). This holds for any $x$ we choose, and so the *functions* $F_n$ converge pointwise for every $x$ (in this case) to $F$.

To finish things off, we note that $F(x) = \mathbb P(X_1 \leq x) = \mathbb P(-X_1 \leq x)$ by the symmetry of the normal distribution, so $F$ is also the distribution of $-X_1$. Hence $X_n \to -X_1$ in distribution.
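The symmetry step can be checked numerically. The following is only a rough sketch, assuming standard normal draws from Python's standard library and an arbitrarily chosen grid of points and sample size: it compares the empirical distribution functions of a sample and of its negation against $F$.

```python
import random
from statistics import NormalDist

def empirical_cdf(sample, x):
    """Fraction of the sample that is <= x."""
    return sum(1 for s in sample if s <= x) / len(sample)

rng = random.Random(0)                      # fixed seed, so the run is repeatable
draws = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
negated = [-d for d in draws]               # a sample from the law of -X_1

F = NormalDist().cdf                        # standard normal distribution function
grid = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]

# Largest distance, over the grid, between F and either empirical CDF.
max_gap = max(
    max(abs(empirical_cdf(draws, x) - F(x)),
        abs(empirical_cdf(negated, x) - F(x)))
    for x in grid
)
print(f"largest gap from F on the grid: {max_gap:.4f}")
```

Both empirical distribution functions should sit within sampling error of the same $F$, which is all that convergence in distribution looks at.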

**Some equivalent and related statements for this example**

To perhaps clarify the meaning of this notion further, consider the following (true!) statements about convergence in distribution, all of which use the same sequence you've defined.

- $X_1, X_2,\ldots$ converges in distribution to $X_1$.
- Fix any $k$. $X_1, X_2,\ldots$ converges in distribution to $X_k$.
- Fix any $k$. $X_1, X_2,\ldots$ converges in distribution to $-X_k$.
- Define $Y_n = (-1)^n X_n$. Then, $Y_n \to X_1$ in distribution.
- **Slightly trickier**. Let $\epsilon_n$ be random variables such that $\epsilon_n$ is independent of $X_n$ (but, not necessarily of other $\epsilon_k$ or $X_k$) and taking the values $+1$ and $-1$ with probability $1/2$ each. Define $Z_n = \epsilon_n X_n$. Then, the sequence $Z_1, Z_2,\ldots$ converges in distribution to $X_1$. This sequence also converges in distribution to $-X_1$ and $\pm X_k$ for any fixed $k$.
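For the random-sign case, the claim comes down to a one-line computation. As a sketch, using only the independence of $\epsilon_n$ and $X_n$ and the symmetry $\mathbb P(-X_n \leq x) = F(x)$ of the normal distribution:

$$\mathbb P(\epsilon_n X_n \leq x) = \mathbb P(\epsilon_n = 1)\,\mathbb P(X_n \leq x) + \mathbb P(\epsilon_n = -1)\,\mathbb P(-X_n \leq x) = \tfrac12 F(x) + \tfrac12 F(x) = F(x) \>.$$

So every $\epsilon_n X_n$ has distribution function $F$, and we are back to a constant sequence of distribution functions.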

**Explicit examples incorporating dependence**

The easiest way to construct examples in which the $X_i$ are dependent is to use functions of a latent sequence of iid standard normals. The **central limit theorem** provides a canonical example. Let $Z_1,Z_2,\ldots$ be an iid sequence of standard normal random variables and take
$$X_n = n^{-1/2} \sum_{i=1}^n Z_i \>.$$
Then each $X_n$ is standard normal, so $X_n \to -X_1$ in distribution, but the sequence is obviously dependent.
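To see the dependence concretely, note that for $m \leq n$ a direct calculation (not spelled out above) gives $\mathrm{Cov}(X_m, X_n) = m/\sqrt{mn} = \sqrt{m/n}$. The simulation below is a rough sketch of this under assumed values $m = 4$, $n = 16$ and an arbitrary number of simulated paths; the helper names are purely illustrative.

```python
import math
import random

def scaled_sums(rng, m, n):
    """Return (X_m, X_n) built from one shared path Z_1, ..., Z_n, with m <= n."""
    total = 0.0
    x_m = None
    for i in range(1, n + 1):
        total += rng.gauss(0.0, 1.0)        # add the next iid standard normal Z_i
        if i == m:
            x_m = total / math.sqrt(m)      # X_m = m^{-1/2} (Z_1 + ... + Z_m)
    return x_m, total / math.sqrt(n)        # X_n = n^{-1/2} (Z_1 + ... + Z_n)

def mean(values):
    return sum(values) / len(values)

rng = random.Random(1)                      # fixed seed for repeatability
m, n, paths = 4, 16, 50_000
pairs = [scaled_sums(rng, m, n) for _ in range(paths)]
xm = [p[0] for p in pairs]
xn = [p[1] for p in pairs]

var_m = mean([a * a for a in xm]) - mean(xm) ** 2
var_n = mean([b * b for b in xn]) - mean(xn) ** 2
cov = mean([a * b for a, b in zip(xm, xn)]) - mean(xm) * mean(xn)
corr = cov / math.sqrt(var_m * var_n)

print(f"Var(X_{m}) ~ {var_m:.3f}, Var(X_{n}) ~ {var_n:.3f}")
print(f"Corr ~ {corr:.3f}, theory sqrt(m/n) = {math.sqrt(m / n):.3f}")
```

Both sample variances should be close to $1$ (each $X_n$ is standard normal), while the sample correlation should be close to $\sqrt{4/16} = 0.5$ rather than $0$.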

Xi'an provided another nice (related) example in a comment (now deleted) to this answer. Let $X_n = (1-2 \mathbb I_{(Z_1+\cdots+Z_{n-1}\geq 0)}) Z_n$ where $\mathbb I_{(\cdot)}$ denotes the indicator function. The necessary conditions are, again, satisfied.

Other such sequences can be constructed in a similar way.

**An aside on the relationship to other modes of convergence**

There are three other standard notions of convergence of random variables: *almost-sure convergence*, *convergence in probability* and $L_p$ *convergence*. Each of these (a) is "stronger" than convergence in distribution, in the sense that convergence in any of these three implies convergence in distribution, and (b) in contrast to convergence in distribution, requires that the random variables at least be defined on a common measure space.

To achieve almost-sure convergence, convergence in probability or $L_p$ convergence, we often have to assume some additional structure on the sequence. However, something slightly peculiar happens in the case of a sequence of normally distributed random variables.

**An interesting property of sequences of normal random variables**

**Lemma**: Let $X_1,X_2,\ldots$ be a sequence of zero-mean normal random variables defined on the same space with variances $\sigma_n^2$. Then, $X_n \xrightarrow{\,p\,} X_\infty$ in probability if and only if $X_n \xrightarrow{\,L_2\,} X_\infty$, in which case $X_\infty \sim \mathcal N(0,\sigma^2)$ where $\sigma^2 = \lim_{n\to\infty} \sigma_n^2$.

The point of the lemma is three-fold. First, in the case of a sequence of normal random variables, convergence in probability and in $L_2$ are equivalent, which is not usually the case. Second, no (in)dependence structure is assumed in order to guarantee this convergence. And, third, the limit is guaranteed to be normally distributed (which is *not* otherwise obvious!) regardless of the relationship between the variables in the sequence. (This is now discussed in a little more detail in this follow-up question.)
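As a simple illustration of the lemma (an assumed toy example, not one taken from the question), let $Z$ be a standard normal random variable and set $X_n = (1 + 1/n) Z$. Each $X_n$ is zero-mean normal with $\sigma_n^2 = (1+1/n)^2$, and

$$\mathbb E (X_n - Z)^2 = \frac{1}{n^2}\, \mathbb E Z^2 = \frac{1}{n^2} \to 0 \>,$$

so $X_n \to Z$ in $L_2$, hence also in probability, and the limit $Z \sim \mathcal N(0,1)$ has variance $\lim_{n\to\infty} \sigma_n^2 = 1$, as the lemma says it must.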

## Best Answer

When I was an undergraduate, the professor in my probability class began each lecture by drawing two balls in succession (without replacement) from an urn that he brought to class. Some days, the first ball was white and the second ball black, while on other days, the first ball was black and the second ball was white. I noticed over the course of the semester that roughly half the time, the first ball was white and the second black, and half the time it was the other way around. So, I figured that the probability that the first ball was white was $0.5$ and the probability that the second ball was white was also $0.5$.

A classmate of mine was always just a tad late coming to class, so he observed only the second ball being drawn. He also noted that roughly half the time the ball that our professor drew was white, and he too estimated the probability that the professor drew a white ball to be $0.5$. He didn't know that the ball that our professor was drawing as my friend walked in was the *second* ball that the professor was drawing from the urn. And yet, my friend and I came up with the same estimate of the probability of the (second) ball being white.

At the end of the semester, our professor invited the class to examine the urn. I was surprised to discover that the urn contained only one black ball and one white ball!

*That* explained why the draws were always (white, black) or (black, white). By golly, those draws were dependent as heck, but they both had the same marginal probability $0.5$ of resulting in a white ball, both for me, who saw both draws, and for my classmate, who didn't know that he was observing the result of the second draw from the urn.

More generally, in sampling without replacement from a population of $n$ distinct items, suppose that we are taking $k < n$ samples. Then the $k$ samples are all distinct. Unknown to us, God continues sampling without replacement until all $n$ items have been drawn. God's experiment has $n!$ different outcomes, each of which has probability $\dfrac{1}{n!}$. How many of these outcomes have item #i occurring in the $j$-th place? Well, of the $n!$ possible outcomes, exactly $(n-1)!$ have item #i in the $j$-th place (with the other $n-1$ items #1, #2, $\ldots$, #(i-1), #(i+1), $\ldots$, #n scattered about in places $1, 2, \ldots, j-1, j+1, \ldots, n$). So, at least in God's mind, the probability that item #i occurs in the $j$-th place is $\dfrac{(n-1)!}{n!} = \dfrac 1n$ regardless of what $j$ is. In God's mind, item #i has the *same* probability $\dfrac 1n$ of occurring in each of the $n$ places. To the extent that we all hope to know what is in God's mind, we should accept these calculations as correct, even though we stopped after $k$ draws and didn't complete the experiment by drawing all $n$ items and so didn't get to see what God obtained in draws numbered $k+1, k+2, \cdots, n$.

Note that the events that "item #i occurs in the $j$-th place" and "item #i occurs in the $j^\prime$-th place" are *disjoint* events (they cannot occur simultaneously), not independent events. Very dependent but nonetheless equally likely.
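
The counting argument can be verified by brute force for a small population. This is just a sketch, assuming $n = 5$ and labelling the items and places $0, \ldots, n-1$ for convenience; it enumerates all $n!$ complete orderings and counts how often each item lands in each place.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def place_counts(n):
    """counts[i][j] = number of complete orderings with item i in place j."""
    counts = [[0] * n for _ in range(n)]
    for ordering in permutations(range(n)):     # all n! equally likely outcomes
        for place, item in enumerate(ordering):
            counts[item][place] += 1
    return counts

n = 5                                           # small assumed population size
counts = place_counts(n)

# Every (item, place) pair occurs in exactly (n-1)! of the n! orderings.
all_equal = all(c == factorial(n - 1) for row in counts for c in row)
prob = Fraction(counts[2][3], factorial(n))     # e.g. item 2 in place 3
print(all_equal, prob)
```

Every entry of `counts` equals $(n-1)! = 24$, so each item has probability $24/120 = 1/5$ of occupying any given place, whichever place we look at.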