The example in your lecture is making a reference to convergence in distribution. Below, I try to go through some of the details of what this means.
A general definition
A sequence of random variables $X_1,X_2,\ldots$ converges in distribution to a limiting random variable $X_\infty$ if their associated distribution functions $F_n(x) = \mathbb P(X_n \leq x)$ converge pointwise to $F_\infty(x) = \mathbb P(X_\infty \leq x)$ for every point $x$ at which $F_\infty$ is continuous.
Note that this statement actually says nothing about the random variables $X_n$ themselves or even the measure space that they live on. It is only making a statement about the behavior of their distribution functions $F_n$. In particular, no reference or appeal to any independence structure is made.
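A quick illustration of why the continuity restriction is needed: take the deterministic sequence
$$X_n \equiv \tfrac1n, \qquad F_n(x) = \mathbb 1_{\{x \geq 1/n\}}, \qquad X_\infty \equiv 0, \qquad F_\infty(x) = \mathbb 1_{\{x \geq 0\}}.$$
Then $F_n(x) \to F_\infty(x)$ for every $x \neq 0$, but $F_n(0) = 0$ for all $n$ while $F_\infty(0) = 1$. Since $0$ is the lone discontinuity point of $F_\infty$, this failure does not matter, and we still (correctly) say that $X_n \to X_\infty$ in distribution.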
The case at hand
In this particular case, the problem statement itself assumes that each of the $X_n$ has the same distribution function $F_n = F$. This is analogous to a constant sequence of numbers $(y_n)$. Certainly if $y_n = y$ for all $n$ then $y_n \to y$. In fact, we can "map" our convergence-in-distribution problem down to such a situation in the following way.
If we fix an $x$ and consider the sequence of numbers $y_n = F_n(x) = F(x)$, we see that $y_1,y_2,\ldots$ is a constant sequence and so, obviously, converges (to $F(x)$, of course). This holds for any $x$ we choose, and so the functions $F_n$ converge pointwise for every $x$ (in this case) to $F$.
To finish things off, we note that $F(x) = \mathbb P(X_1 \leq x) = \mathbb P(-X_1 \leq x)$ by the symmetry of the normal distribution, so $F$ is also the distribution of $-X_1$. Hence $X_n \to -X_1$ in distribution.
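Spelling out the symmetry step: since $F$ is continuous and symmetric about zero, $F(-x) = 1 - F(x)$, and so
$$\mathbb P(-X_1 \leq x) = \mathbb P(X_1 \geq -x) = 1 - F(-x) = F(x) = \mathbb P(X_1 \leq x).$$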
Some equivalent and related statements for this example
To perhaps clarify the meaning of this notion further, consider the following (true!) statements about convergence in distribution, all of which use the same sequence you've defined.
- $X_1, X_2,\ldots$ converges in distribution to $X_1$.
- Fix any $k$. $X_1, X_2,\ldots$ converges in distribution to $X_k$.
- Fix any $k$. $X_1, X_2,\ldots$ converges in distribution to $-X_k$.
- Define $Y_n = (-1)^n X_n$. Then, $Y_n \to X_1$ in distribution.
- Slightly trickier. Let $\epsilon_n$ be random variables such that each $\epsilon_n$ is independent of $X_n$ (but not necessarily of the other $\epsilon_k$ or $X_k$) and takes the values $+1$ and $-1$ with probability $1/2$ each. Define $Z_n = \epsilon_n X_n$. Then, the sequence $Z_1, Z_2,\ldots$ converges in distribution to $X_1$. This sequence also converges in distribution to $-X_1$ and $\pm X_k$ for any fixed $k$.
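If it helps to see these statements numerically, here is a minimal simulation sketch. It assumes, purely for illustration, that the $X_n$ are iid standard normal (the statements above require far less than independence) and checks that the sign-flipped sequences $Y_n$ and $Z_n$ have the same marginal distribution as $X_1$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

t = 100_000   # number of Monte Carlo replications
n = 10        # length of each (illustrative) sequence

# t independent copies of X_1, ..., X_n, assumed iid N(0, 1) purely for illustration
X = rng.standard_normal((t, n))

# Y_n = (-1)^n X_n : deterministic sign flips
Y = X * (-1.0) ** np.arange(1, n + 1)

# Z_n = eps_n X_n with eps_n = +/-1, independent of X_n
Z = X * rng.choice([-1.0, 1.0], size=(t, n))

# Every column of Y and Z should be indistinguishable from a N(0, 1) sample,
# i.e. from the distribution of X_1 (and of -X_1, -X_k, ...).
for k in range(n):
    p_y = stats.kstest(Y[:, k], "norm").pvalue
    p_z = stats.kstest(Z[:, k], "norm").pvalue
    print(f"n = {k + 1:2d}   KS p-values: Y {p_y:.2f}   Z {p_z:.2f}")
```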
Explicit examples incorporating dependence
The easiest way to construct examples in which the $X_i$ are dependent is to use functions of a latent sequence of iid standard normals. The central limit theorem provides a canonical example. Let $Z_1,Z_2,\ldots$ be an iid sequence of standard normal random variables and take
$$X_n = n^{-1/2} \sum_{i=1}^n Z_i \>.$$
Then each $X_n$ is standard normal, so $X_n \to -X_1$ in distribution, but the sequence is obviously dependent.
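A small simulation sketch (using numpy, purely for illustration) makes both features visible: every $X_n$ is marginally standard normal, yet the terms are strongly correlated; in fact $\operatorname{Corr}(X_m, X_n) = \sqrt{m/n}$ for $m \leq n$.

```python
import numpy as np

rng = np.random.default_rng(1)

t, n_max = 200_000, 50

# t independent realisations of the latent iid N(0,1) sequence Z_1, ..., Z_{n_max}
Z = rng.standard_normal((t, n_max))

# X_n = n^{-1/2} (Z_1 + ... + Z_n), computed for every n at once
S = np.cumsum(Z, axis=1)
X = S / np.sqrt(np.arange(1, n_max + 1))

# Each X_n is (exactly) standard normal ...
print("means:", X.mean(axis=0)[[0, 9, 49]].round(3))
print("stds: ", X.std(axis=0)[[0, 9, 49]].round(3))

# ... but the sequence is far from independent: Corr(X_m, X_n) = sqrt(m/n)
m, n = 10, 40
emp = np.corrcoef(X[:, m - 1], X[:, n - 1])[0, 1]
print("empirical corr:", round(emp, 3), " theoretical:", round(np.sqrt(m / n), 3))
```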
Xi'an provided another nice (related) example in a comment (now deleted) to this answer. Let $X_n = (1 - 2\,\mathbb I_{(Z_1+\cdots+Z_{n-1} \geq 0)}) Z_n$, where $\mathbb I_{(\cdot)}$ denotes the indicator function. Since the sign flip depends only on $Z_1,\ldots,Z_{n-1}$ and is therefore independent of $Z_n$, each $X_n$ is again standard normal, while the sequence is clearly dependent.
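A minimal sketch of this construction, with the convention that the empty sum for $n = 1$ is zero:

```python
import numpy as np

rng = np.random.default_rng(2)

t, n_max = 200_000, 50
Z = rng.standard_normal((t, n_max))

# The sign of X_n is determined by the *previous* partial sum Z_1 + ... + Z_{n-1};
# for n = 1 the empty sum is taken to be 0 (a convention, so that X_1 = -Z_1 here).
prev = np.hstack([np.zeros((t, 1)), np.cumsum(Z, axis=1)[:, :-1]])
X = (1 - 2 * (prev >= 0)) * Z

# Each X_n is still standard normal (the sign flip is independent of Z_n),
# yet X_n is a function of (Z_1, ..., Z_n) and hence dependent on the past.
print(X.mean(axis=0)[[0, 9, 49]].round(3))
print(X.std(axis=0)[[0, 9, 49]].round(3))
```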
Other such sequences can be constructed in a similar way.
An aside on the relationship to other modes of convergence
There are three other standard notions of convergence of random variables: almost-sure convergence, convergence in probability, and $L_p$ convergence. Each of these is (a) "stronger" than convergence in distribution, in the sense that convergence in any of the three implies convergence in distribution, and (b) unlike convergence in distribution, requires that the random variables at least be defined on a common measure space.
To achieve almost-sure convergence, convergence in probability or $L_p$ convergence, we often have to assume some additional structure on the sequence. However, something slightly peculiar happens in the case of a sequence of normally distributed random variables.
An interesting property of sequences of normal random variables
Lemma: Let $X_1,X_2,\ldots$ be a sequence of zero-mean normal random variables, defined on the same space, with variances $\sigma_n^2$. Then $X_n \xrightarrow{\,p\,} X_\infty$ if and only if $X_n \xrightarrow{\,L_2\,} X_\infty$, in which case $X_\infty \sim \mathcal N(0,\sigma^2)$ where $\sigma^2 = \lim_{n\to\infty} \sigma_n^2$.
The point of the lemma is three-fold. First, in the case of a sequence of normal random variables, convergence in probability and in $L_2$ are equivalent which is not usually the case. Second, no (in)dependence structure is assumed in order to guarantee this convergence. And, third, the limit is guaranteed to be normally distributed (which is not otherwise obvious!) regardless of the relationship between the variables in the sequence. (This is now discussed in a little more detail in this follow-up question.)
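For a concrete instance, take $Z$ and $W$ to be independent standard normal variables on a common space and set
$$X_n = Z + \tfrac1n W \sim \mathcal N\!\left(0,\, 1 + \tfrac1{n^2}\right), \qquad \mathbb E (X_n - Z)^2 = \tfrac1{n^2} \to 0.$$
Then $X_n \to Z$ both in $L_2$ and in probability, $\sigma_n^2 = 1 + 1/n^2 \to 1$, and the limit is indeed $\mathcal N(0,1)$, exactly as the lemma predicts.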
As indicated in the earlier comments, once you get a sample from the joint distribution of $(X_1,X_2,X_3)$,
$$(x_1^1,x_2^1,x_3^1),\ldots,(x_1^t,x_2^t,x_3^t)$$
the marginal sample
$$(x_1^1,x_2^1),\ldots,(x_1^t,x_2^t)$$
is indeed a sample from the marginal joint distribution of $(X_1,X_2)$, and you can simply ignore the simulated $x_3^j$'s. They can, however, be useful in Monte Carlo evaluations through a technique called Rao-Blackwellisation: the plain average
$$\frac{1}{t}\sum_{i=1}^t h(x_1^i,x_2^i)$$
is improved upon by
$$\frac{1}{t}\sum_{i=1}^t \mathbb{E}[h(x_1^i,x_2^i)\mid x_3^i],$$
since the conditional expectation has the same expectation as the original quantity but a reduced variance. See for instance this discussion on Cross Validated.
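Here is a minimal sketch of that variance reduction. The trivariate model and the function $h$ are illustrative assumptions chosen so that the conditional expectation is available in closed form: $X_3 \sim \mathcal N(0,1)$, $X_1 \mid X_3 \sim \mathcal N(X_3, 1)$, and $h(x_1) = x_1^2$, so that $\mathbb{E}[h(X_1)\mid X_3] = X_3^2 + 1$ and $\mathbb{E}[h(X_1)] = 2$.

```python
import numpy as np

rng = np.random.default_rng(3)

def estimators(t):
    """One Monte Carlo experiment: plain average vs Rao-Blackwellised average."""
    x3 = rng.standard_normal(t)           # X_3 ~ N(0, 1)
    x1 = x3 + rng.standard_normal(t)      # X_1 | X_3 ~ N(X_3, 1)
    h = x1 ** 2                           # h(X_1) = X_1^2, so E[h(X_1)] = 2
    h_rb = x3 ** 2 + 1                    # E[h(X_1) | X_3] = X_3^2 + 1
    return h.mean(), h_rb.mean()

# Repeat the experiment to compare the variability of the two estimators.
reps = np.array([estimators(t=1_000) for _ in range(2_000)])
print("plain : mean %.3f  var %.2e" % (reps[:, 0].mean(), reps[:, 0].var()))
print("RB    : mean %.3f  var %.2e" % (reps[:, 1].mean(), reps[:, 1].var()))
```

Both estimators are unbiased for $\mathbb{E}[h(X_1)] = 2$, but the Rao-Blackwellised one has markedly smaller variance across replications.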
Best Answer
The calculations in the question look correct, but care is needed because the distribution of $V_\mu$ is not continuous. (I will use $\mu$ instead of $m$ throughout.)
From the definitions we find that the distribution function (CDF) of $V_\mu$ is
$$F_\mu(x) = \Pr(V_\mu \le x) = \sum_{n=0}^\infty x^n \Pr(N_\mu = n) = e^{\mu(x-1)}$$
provided $0 \le x \le 1$. For $x \gt 1$, $F_\mu(x) = 1$ of course, while for $x \lt 0$, necessarily $F_\mu(x) = 0$. The graph of $F_\mu$ for $\mu = 1$ shows a jump of size $e^{-\mu}$ at $x = 0$, corresponding to the atom $\Pr(V_\mu = 0) = \Pr(N_\mu = 0) = e^{-\mu}$.
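Both the jump and the rest of the CDF are easy to check by simulation. The sketch below assumes the construction behind the formula above, namely $V_\mu = \max(U_1,\ldots,U_{N_\mu})$ with $N_\mu \sim \text{Poisson}(\mu)$, the $U_i$ iid Uniform$(0,1)$, and the maximum of an empty collection taken to be $0$; that is exactly what gives $\Pr(V_\mu \le x \mid N_\mu = n) = x^n$.

```python
import numpy as np

rng = np.random.default_rng(4)

mu, t = 1.0, 500_000

# Assumed construction: V = max(U_1, ..., U_N), N ~ Poisson(mu), U_i ~ Uniform(0,1),
# with the max over an empty collection defined as 0.
N = rng.poisson(mu, size=t)
V = np.zeros(t)
pos = N > 0
# The max of n iid U(0,1) variables has CDF x^n, so it can be drawn as U ** (1/n).
V[pos] = rng.random(pos.sum()) ** (1.0 / N[pos])

# Compare the empirical CDF with F_mu(x) = exp(mu (x - 1)) on [0, 1];
# at x = 0 the empirical value is the size of the jump, about exp(-mu).
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x, round((V <= x).mean(), 4), round(np.exp(mu * (x - 1)), 4))
```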
The moment generating function, $\phi_\mu(t) = \mathbb{E}(\exp(t V_\mu))$, must be computed with similar care near zero. It can be obtained as a Lebesgue-Stieltjes integral,
$$\phi_\mu(t) = \int_\mathbb{R} e^{t x} dF_{\mu}(x)$$
via integration by parts as
$$\phi_\mu(t) = e^{t x} F_\mu(x) \vert_{-\infty}^1 - \int_0^1 t e^{t x} e^{\mu(x-1)}dx = e^t - t\frac{e^t - e^{-\mu}}{t+\mu}.$$
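To spell out the right-hand side: the boundary term contributes $e^{t}F_\mu(1) = e^t$ (the contribution at $-\infty$ vanishes because $F_\mu$ does), while the remaining integral is
$$\int_0^1 t e^{t x} e^{\mu(x-1)}\,dx = t e^{-\mu} \int_0^1 e^{(t+\mu)x}\,dx = t e^{-\mu}\,\frac{e^{t+\mu}-1}{t+\mu} = \frac{t\left(e^{t} - e^{-\mu}\right)}{t+\mu},$$
which gives the closed form for $t \neq -\mu$; the value at $t = -\mu$ follows by continuity.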
As a check, its Maclaurin series begins
$$\phi_\mu(t) = 1 + \left(\frac{\mu-1+e^{-\mu}}{\mu}\right) t + \left(\frac{\mu^2 - 2\mu + 2 - 2e^{-\mu}}{\mu^2}\right)t^2/2 + \cdots$$
The constant term of $1$ shows the total probability mass is $1$. The next two terms will be useful in addressing the rest of the questions.
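Because those two coefficients are $\mathbb{E}[V_\mu]$ and $\mathbb{E}[V_\mu^2]$, they are easy to sanity-check numerically; the sketch below reuses the max-of-uniforms construction assumed earlier.

```python
import numpy as np

rng = np.random.default_rng(5)

mu, t = 1.0, 1_000_000

# Same assumed construction as before: V = max of N ~ Poisson(mu) iid uniforms.
N = rng.poisson(mu, size=t)
V = np.zeros(t)
pos = N > 0
V[pos] = rng.random(pos.sum()) ** (1.0 / N[pos])

# Coefficients of t and t^2/2 in the Maclaurin series, i.e. E[V] and E[V^2].
mean_formula = (mu - 1 + np.exp(-mu)) / mu
m2_formula = (mu**2 - 2*mu + 2 - 2*np.exp(-mu)) / mu**2

print("E[V]   simulated %.4f   formula %.4f" % (V.mean(), mean_formula))
print("E[V^2] simulated %.4f   formula %.4f" % ((V**2).mean(), m2_formula))
```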