Exact meaning of uniform integrability for empirical distributions

functional-analysis, probability-theory, real-analysis

Suppose we have $n$ non-negative integer-valued random variables $X_1,\ldots,X_n$ and consider the empirical distribution $$Q := \frac{1}{n}\sum_{i=1}^n \delta_{X_i}.$$ We equip any probability mass function $q \in \mathcal{P}(\mathbb{Z}_+)$ with the usual $\ell^1$ norm $\|q\|:= \sum_{k=0}^\infty |q_k|$. I am confused about the precise meaning of the statement that the empirical distribution $Q$ is uniformly integrable (notice that $Q$ depends on $n$). In the typical setting, we say that a collection of random variables $(X_n~\colon n \in \mathbb N)$ is U.I. (uniformly integrable) if for every $\epsilon > 0$ there exists some $K > 0$ such that $\sup_{n\in \mathbb N} \mathbb{E}(|X_n|\mathbf{1}_{|X_n| \geq K}) < \epsilon$. But how this general definition translates into the aforementioned setting is what really bothers me. Thanks for any help!
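
To make the object concrete, here is a rough Python sketch (the sample values below are made up, purely for illustration) of the empirical distribution $Q$ viewed as a pmf on $\mathbb{Z}_+$:

```python
import numpy as np

# Made-up realisations of the non-negative integer-valued X_1, ..., X_n.
X = np.array([0, 2, 2, 5, 1, 0, 3, 2])
n = len(X)

# Empirical distribution Q = (1/n) * sum_i delta_{X_i}, stored as a pmf on {0, ..., max(X)}.
Q = np.bincount(X) / n

print(Q)         # [0.25  0.125 0.375 0.125 0.    0.125]
print(Q.sum())   # the l^1 norm of a pmf is always 1.0
```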

Best Answer

I'm going to reference arxiv.org/pdf/1804.04608.pdf, which was cited in the comments as the motivation for this question. The paper uses the following notation:

  • $\Omega = \{(\eta_i)_{i=1,\dots,n}\in \mathbb{Z}_+^n: \sum_{i=1}^n \eta_i = m\}$. Because we vary $m$ and $n$, I'm going to instead write $m_n$ and $\Omega_n^{m_n}$. The paper seems to be implying that $(\eta_i)_{i\in\mathbb{N}}$ is some fixed sequence.
  • $Q(0) = \frac{1}{n}\sum_{i=1}^n \delta_{\eta_i}$ is the empirical measure. Again, I will write $Q_n = \frac{1}{n}\sum_{i=1}^n \delta_{\eta_i}$ for clarity, omitting the $(0)$ since this question does not concern the dynamics of the process introduced in the paper. For each $n$, we regard $Q_n$ as an element of $\ell^1(\mathbb{Z}_+)$.
  • $Q_n \to q$ in $\ell^1(\mathbb{Z}_+)$, $\frac{m_n}{n} \to \rho$ and $\lambda = \sum_{k=0}^\infty kq_k$.

We ignore the dynamics introduced in the paper. We can clearly see that each $Q_n$ is a probability measure (I'm assuming the configuration $(\eta_i)_{i\in\mathbb{N}}$ is fixed and non-random. If I misread that, then the argument only changes slightly). Furthermore, convergence in $\ell^1$ is equivalent to total variation convergence, so $q$ is also a probability measure. So, we can assign to each $n$ a random variable $Y_n := \eta_{U_n}$ where $U_n$ is sampled from $\{1,\dots,n\}$ uniformly at random. Then $Y_n$ has distribution $Q_n$. Let $Y$ be some random variable sampled from $q$. Then $Y_n \to Y$ in total variation. For each $n$, $\mathbb{E}[Y_n] = \frac{1}{n}\sum_{i=1}^n \eta_i = \frac{m_n}{n} \to \rho$, and $\mathbb{E}[Y] = \sum_{k=1}^\infty kq_k = \lambda$.
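
To make this construction concrete, here is a rough numerical sketch (the configuration $(\eta_i)$ below is invented purely for illustration): sampling $Y_n = \eta_{U_n}$ recovers the law $Q_n$, and its mean is $m_n/n$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up fixed configuration (eta_1, ..., eta_n) with sum m_n.
eta = np.array([0, 1, 1, 4, 0, 2, 1, 3, 0, 0])
n, m_n = len(eta), eta.sum()

# Empirical measure Q_n as a pmf on Z_+.
Q_n = np.bincount(eta) / n

# Y_n = eta_{U_n} with U_n uniform on {1, ..., n}; its law is exactly Q_n.
samples = eta[rng.integers(0, n, size=100_000)]

print(Q_n)                                  # exact law of Y_n
print(np.bincount(samples) / len(samples))  # Monte Carlo approximation of the same law
print(m_n / n, samples.mean())              # E[Y_n] = m_n / n
```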

The paper states that $\rho = \lambda$ if $Q(0)$ is uniformly integrable. What they mean is that if the sequence of random variables $\{Y_n\}_{n \in \mathbb{N}}$ is uniformly integrable, then total variation convergence also implies convergence of the expectations, so that

$$\rho = \lim_{n\to\infty} \frac{m_n}{n} = \lim_{n\to\infty}\mathbb{E}[Y_n] = \mathbb{E}[Y] = \lambda.$$

The precise definition of this uniform integrability is,

$$\lim_{K\to\infty} \sup_{n \in \mathbb{N}} \mathbb{E}[Y_n\mathbb{I}_{Y_n > K}] = \lim_{K\to\infty} \sup_n \frac{1}{n}\sum_{i=1}^n \eta_i\mathbb{I}_{\eta_i > K} = 0.$$
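
For what it's worth, this criterion is straightforward to check numerically for a given family of configurations. The sketch below uses an invented family (Poisson-generated entries, just for illustration) and only takes the supremum over a handful of values of $n$ rather than all of $\mathbb{N}$:

```python
import numpy as np

def ui_tail(etas, K):
    """Max over the given configurations of (1/n) * sum_i eta_i * 1{eta_i > K}."""
    return max(eta[eta > K].sum() / len(eta) for eta in etas)

# Invented configurations eta^(n) for a few values of n (Poisson(2) entries, illustration only).
etas = [np.random.default_rng(n).poisson(2.0, size=n) for n in (10, 100, 1000, 10_000)]

for K in (1, 2, 5, 10):
    print(K, ui_tail(etas, K))  # decreases towards 0 as K grows, consistent with uniform integrability
```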

I tried to come up with a nice counterexample, but I can't think of one where the $Q_n$ are not uniformly integrable and yet converge to a measure $q$ in total variation. I'd be interested to see if anyone has such a counterexample. I suspect that for most if not all counterexamples, $\rho = \infty$ and $\lambda < \infty$.