Uniform Tightness of a Sequence of Probability Measures in $\mathbb R^{\mathbb N}$

Tags: measure-theory, probability, probability-theory, probability-limit-theorems, real-analysis

Consider a sequence $(P_n)_{n\in\mathbb N}$ of probability measures on $(\mathbb R^{\mathbb N}, \mathcal B)$, where $\mathbb R^{\mathbb N}$ is the countable Cartesian product of $\mathbb R$, and $\mathcal B$ is the $\sigma$-algebra generated by the topology of pointwise convergence, i.e. the topology generated by the metric $$d(x,y) = \sum_{n=1}^\infty \frac{\min(1,\vert x_n - y_n\vert)}{2^n}.$$ The corresponding metric space is complete, and the generated topology is separable. Therefore, every probability measure on $(\mathbb R^{\mathbb N},\mathcal B)$ is tight.
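
To see why $d$ generates the topology of pointwise convergence, the following estimate may help (a sketch; the cutoff index $J$ is introduced here only for illustration). Splitting the series at $J$ and bounding each tail term by $2^{-n}$ gives $$d(x,y) \leq \sum_{n=1}^{J} \frac{\min(1,\vert x_n - y_n\vert)}{2^n} + 2^{-J}, \qquad \min(1,\vert x_j - y_j\vert) \leq 2^{j}\, d(x,y) \ \text{ for each } j\in\mathbb N.$$ Hence $d(x^{(m)},x)\to 0$ as $m\to\infty$ if and only if $x^{(m)}_j\to x_j$ for every $j\in\mathbb N$.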

Let $P$ be a (tight) probability measure on $(\mathbb R^{\mathbb N},\mathcal B)$. We have the following theorem:

Theorem 1: $(P_n)_{n\in\mathbb N}$ converges to $P$ iff $(\pi_k\#P_n)_{n\in\mathbb N}$ converges to $\pi_k\#P$ for all $k\in\mathbb N$.

Proof. (See Example 2.6 in Billingsley "Convergence of probability measures" (2nd edition).)

Here $\pi_k$ denotes the coordinate projection of $\mathbb R^{\mathbb N}$ onto $\mathbb R^k$ and $\pi_k\#P$ the pushforward probability measure of $P$ under $\pi_k$.

Now suppose that $(\pi_k\#P_n)_{n\in\mathbb N}$ converges to $\pi_k\#P$ for all $k\in\mathbb N$. By Theorem 1, $(P_n)_{n\in\mathbb N}$ converges to $P$. Since $P_n$ is tight for every $n\in\mathbb N$, and $P$ is also tight, a theorem due to Le Cam yields that $(P_n)_{n\in\mathbb N}$ is uniformly tight. (Is this correct?)

I would like to prove the "$\Leftarrow$"-direction of Theorem 1 just by using tightness and weak convergence of the finite-dimensional marginals. To this end, I would have to show uniform tightness of $(P_n)_{n\in\mathbb N}$. However, I don't know how to approach this. What we have:

  • $P_n$ is tight for each $n\in\mathbb N$
  • $\pi_k\#P_n$ converges to $\pi_k\#P$ for all $k\in\mathbb N$

To show: for all $\epsilon>0$ there is $K\subset\mathbb R^{\mathbb N}$ compact such that $$\sup_{n\in\mathbb N}P_n(\mathbb R^{\mathbb N}\setminus K) \leq \epsilon.$$

My idea: for any $k\in\mathbb N$, we have that $\pi_k\#P_n$ converges to $\pi_k\#P$ by assumption. Hence, for any $\epsilon>0$ there is $A\subset\mathbb R^k$ compact such that $$\sup_{n\in\mathbb N}\,(\pi_k\#P_n)(\mathbb R^k\setminus A) \leq \epsilon.$$ I thought that the considered topology on $\mathbb R^{\mathbb N}$ allows one to somehow "embed" a compact set $K$ in $\pi_k^{-1}(A)$ in the sense that $$P_n(\mathbb R^{\mathbb N}\setminus K)\leq P_n(\pi_k^{-1}(\mathbb R^k\setminus A)) = (\pi_k\#P_n)(\mathbb R^k\setminus A) \leq \epsilon.$$ I don't know how to make it rigorous though.

Edit 2: In an earlier version of this post, my proof idea had two fundamental flaws. First, I used the wrong characterization of compact sets in $\mathbb R^{\mathbb N}$, and second, I claimed that $\pi_k^{-1}(A)$ can be a subset of a compact set, which is not true: $\pi_k^{-1}(A)$ is the pre-image of a closed set under a continuous function, and hence also closed. If $\pi_k^{-1}(A)$ were a closed subset of a compact set, it would be compact itself, but $\pi_k^{-1}(A)$ is clearly not compact. Therefore, I have to give up on the idea mentioned above. I will add a bounty to this question to draw some attention to it, as I need some more clues on how to proceed.
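
For concreteness, the non-compactness of $\pi_k^{-1}(A)$ can be seen as follows (a sketch, assuming $A\neq\emptyset$; the point $a$ and the sequence $x^{(m)}$ are introduced here only for illustration). Fix $a=(a_1,\dots,a_k)\in A$ and set $$x^{(m)} = (a_1,\dots,a_k,\,m,\,0,\,0,\dots)\in\pi_k^{-1}(A),\qquad m\in\mathbb N.$$ The $(k+1)$-th coordinate of $x^{(m)}$ is unbounded in $m$, so no subsequence of $(x^{(m)})_m$ converges pointwise. More generally, each coordinate map $x\mapsto x_j$ is continuous, so a compact $K\subset\mathbb R^{\mathbb N}$ must satisfy $\sup_{x\in K}\vert x_j\vert<\infty$ for every $j\in\mathbb N$, not only for $j\leq k$.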

Best Answer

For reference, another proof of the fact that OP is referring to can be found in Bogachev Vol. II, Example 8.2.16, which however does not answer OP's question. OP is asking to show that in this case, convergence of the finite-dimensional distributions (fdd convergence) implies tightness and so $X^n\Rightarrow X$.


The following is an attempt to answer OP's question so all feedback is appreciated.

Since we have fdd convergence, we also have that $\max_{k\leq N}|X_k^n|\xrightarrow{d} \max_{k\leq N}|X_k|$ for any fixed $N$ by the continuous mapping theorem. This implies the convergence of the cdfs of $\max_{k\leq N}|X_k^n|$ at the continuity points of the cdf of $\max_{k\leq N}|X_k|$. Therefore, for any fixed $\varepsilon>0$, we can choose a continuity point $K_\varepsilon$ large enough s.t. $P(\max_{k\leq N}|X_k|>K_\varepsilon)<\varepsilon/2$, and then for all $n$ large enough we have $P(\max_{k\leq N}|X_k^n|>K_\varepsilon)<\varepsilon$.
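
The last step can be written out as follows (a sketch; the shorthand $M_N^n:=\max_{k\leq N}|X_k^n|$ and $M_N:=\max_{k\leq N}|X_k|$ is introduced here only for illustration, and $K_\varepsilon$ is assumed to be a continuity point of the cdf of $M_N$, which is possible because a cdf has at most countably many discontinuities): $$\lim_{n\to\infty}P(M_N^n>K_\varepsilon) = 1-\lim_{n\to\infty}P(M_N^n\leq K_\varepsilon) = 1-P(M_N\leq K_\varepsilon) = P(M_N>K_\varepsilon) < \frac{\varepsilon}{2}.$$ Since the limit is strictly below $\varepsilon$, there is $N_0\in\mathbb N$ with $P(M_N^n>K_\varepsilon)<\varepsilon$ for all $n\geq N_0$.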

We have then that for each $\varepsilon>0$ and $N\in \mathbb{N}$, there are $N_0\in \mathbb{N}$ and $K_{\varepsilon,N}> 0$ s.t. $P(\max_{k\leq N}|X_k^n|>K_{\varepsilon,N})<\varepsilon 2^{-N},\forall n\geq N_0$. Now note that $P(\max_{k\leq N}|X_k^n|>K_{\varepsilon,N})=P(X^n\notin A_{N,\varepsilon})$ where we define $A_{N,\varepsilon}=\{x=(x_1,x_2,...):\max_{k\leq N}|x_k|\leq K_{\varepsilon,N}\}$.

We now note that for each $\varepsilon>0$ and $N\in \mathbb{N}$ we can in fact take $N_0=1$, after enlarging $K_{\varepsilon,N}$ if necessary, and thus obtain $\sup_n P(X^n\notin A_{N,\varepsilon})<\varepsilon 2^{-N}$. This is because the finite family $(X^n)_{n\leq N_0}$ satisfies $\sup_{n\leq N_0}P(\max_{k\leq N}|X_k^n|>K_{\varepsilon,N}')<\varepsilon 2^{-N}$ for $K'_{\varepsilon,N}$ large enough, so we use the greater of $K'_{\varepsilon,N}$ and $K_{\varepsilon,N}$ in the definition of the set $A_{N,\varepsilon}$. By the union bound, we also see that $P(X^n\notin \cap_NA_{N,\varepsilon})\leq \sum_NP(X^n\notin A_{N,\varepsilon})\leq \varepsilon \sum_N2^{-N}=\varepsilon$, so $\sup_nP(X^n\notin \cap_NA_{N,\varepsilon})\leq\varepsilon$.
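
The existence of $K'_{\varepsilon,N}$ for the finitely many indices $n\leq N_0$ can be sketched as follows (the constants $K^{(n)}$ are introduced here only for illustration). Each $\max_{k\leq N}|X_k^n|$ is a real-valued random variable, so $P(\max_{k\leq N}|X_k^n|>K)\to 0$ as $K\to\infty$, and for each $n\leq N_0$ we can pick $K^{(n)}$ with $$P\Big(\max_{k\leq N}|X_k^n|>K^{(n)}\Big)<\varepsilon 2^{-N},\qquad K'_{\varepsilon,N}:=\max_{1\leq n\leq N_0}K^{(n)}.$$ Since $K\mapsto P(\max_{k\leq N}|X_k^n|>K)$ is nonincreasing, the same bound holds with $K'_{\varepsilon,N}$ in place of $K^{(n)}$ for every $n\leq N_0$.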

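It remains to argue that $\cap_NA_{N,\varepsilon}$ is compact in the topology described by OP. Indeed, if $(x^n)_n$ is a sequence of elements in $\cap_NA_{N,\varepsilon}$, then for each fixed $k$ the coordinate sequence $(x^n_k)_n$ is bounded by $K_{\varepsilon,k}$, and thus has a convergent subsequence. By a diagonal argument, it follows that $(x^n)_n$ has a subsequence that converges pointwise, and the limit lies in $\cap_NA_{N,\varepsilon}$ because each $A_{N,\varepsilon}$ is closed under pointwise convergence. Since the topology is metrizable, this sequential compactness is equivalent to compactness.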
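
Alternatively, compactness can be read off from a product representation (a sketch; the constants $M_k$ are introduced here only for illustration). A point $x$ lies in $\cap_NA_{N,\varepsilon}$ iff $|x_k|\leq K_{\varepsilon,N}$ for all $N$ and all $k\leq N$, i.e. iff $|x_k|\leq M_k$ for every $k$, where $$M_k:=\inf_{N\geq k}K_{\varepsilon,N}<\infty,\qquad \bigcap_{N}A_{N,\varepsilon}=\prod_{k=1}^{\infty}[-M_k,M_k].$$ The topology of pointwise convergence on $\mathbb R^{\mathbb N}$ is the product topology, so this set is compact by Tychonoff's theorem.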