[Math] Distance between distributions and distance of moments

pr.probability, st.statistics

Let's say I have a sequence of random variables $X_n$ such that $$\mathbf E X_n^k = \mathbf E X^k+O(a_k/\sqrt{n})\quad\text{for all }k\in\mathbb N,\tag{$\ast$}$$ where $X$ is a random variable with the standard (zero mean, unit variance) Gaussian distribution and the $a_k$ are constants which typically grow with $k$ (for my purposes they would be $a_k=(k/2)!$, for example).
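For reference, the moments of the standard Gaussian target are $$\mathbf E X^{2m}=(2m-1)!!=\frac{(2m)!}{2^m\,m!},\qquad \mathbf E X^{2m+1}=0,\qquad m=0,1,2,\dots,$$ so $(\ast)$ quantifies how fast the moments of $X_n$ approach these values.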

Since the normal distribution is uniquely determined by its moments, $(\ast)$ implies, by the method of moments, that $X_n\Rightarrow X$ weakly as $n\to\infty$.
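One standard way to check this determinacy is Carleman's condition: a law with finite even moments $m_{2k}$ is determined by its moments whenever $\sum_{k\ge1} m_{2k}^{-1/(2k)}=\infty$. For the standard Gaussian, $$m_{2k}=(2k-1)!!\le(2k)^k\quad\Longrightarrow\quad m_{2k}^{-1/(2k)}\ge\frac{1}{\sqrt{2k}},$$ and the series indeed diverges.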

Is there an appropriate way to formalize the distance $d$ between the distributions of $X_n$ and $X$ such that $(\ast)$ implies $$d(X_n,X)=O(f(n))$$ for some function $f$? I think $(\ast)$ is not strong enough to control distances like the Kullback–Leibler divergence or the Hellinger distance, but perhaps there is some appropriate weaker notion?

I think that $(\ast)$ implies $$\mathbf E f(X_n)=\mathbf E f(X)+O(1/\sqrt{n})$$ for a certain class of test functions $f$, but depending on the growth of the $a_k$, this class might be very small.
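For instance (assuming the implied constants in $(\ast)$ can be taken uniform in $k$ and that expectation and summation may be interchanged), any test function $f(x)=\sum_{k\ge0}c_kx^k$ with $\sum_{k\ge0}|c_k|\,a_k<\infty$ works: $$\bigl|\mathbf E f(X_n)-\mathbf E f(X)\bigr|\le\sum_{k\ge0}|c_k|\,\bigl|\mathbf E X_n^k-\mathbf E X^k\bigr|\lesssim\frac{1}{\sqrt n}\sum_{k\ge0}|c_k|\,a_k=O\!\left(\frac{1}{\sqrt n}\right).$$ With $a_k=(k/2)!$ this already forces the coefficients to satisfy $|c_k|=o\bigl(1/(k/2)!\bigr)$.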

Best Answer

The natural thing to compare in this context seems to be the moment generating functions of $X_n$ and $X$ (assuming the moment generating function of $X_n$ exists and the interchange of expectation and summation below is justified). In particular, consider: \begin{align*} \left|\mathbf{E} \exp(t X_n) - \mathbf{E} \exp(t X)\right| &\le \sum_{k=0}^{\infty} \frac{|t|^k}{k!} \left| \mathbf{E} X_n^k - \mathbf{E} X^k \right| \\ &\lesssim \frac{1}{\sqrt{n}} \sum_{k=0}^{\infty} \frac{|t|^k}{k!}\,(k/2)! = \frac{1}{\sqrt{n}} \left( 1 + \frac{\sqrt{\pi}\,|t|}{2}\, e^{t^2/4} \bigl(1 + \operatorname{erf}(|t|/2)\bigr) \right), \end{align*} where we used the hypothesis given by the OP, with the implied constants in $(\ast)$ taken uniform in $k$.
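As a quick numerical sanity check of the closed form for the series $\sum_{k\ge0}\frac{t^k}{k!}(k/2)!$, here is a minimal sketch using only the Python standard library (the truncation at 100 terms is an arbitrary choice, more than enough for the values of $t$ shown):

```python
import math

def series(t, terms=100):
    """Partial sum of sum_{k>=0} t^k / k! * (k/2)!, with (k/2)! = Gamma(k/2 + 1)."""
    total, coeff = 0.0, 1.0          # coeff tracks t^k / k!
    for k in range(terms):
        total += coeff * math.gamma(k / 2 + 1)
        coeff *= t / (k + 1)
    return total

def closed_form(t):
    """1 + (sqrt(pi) * t / 2) * exp(t^2 / 4) * (1 + erf(t / 2))."""
    return 1.0 + 0.5 * math.sqrt(math.pi) * t * math.exp(t * t / 4) * (1.0 + math.erf(t / 2))

for t in (0.5, 1.0, 2.0):
    print(f"t = {t}: series = {series(t):.10f}, closed form = {closed_form(t):.10f}")
```

The two computations agree closely (for example, both give approximately $2.7303$ at $t=1$).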

Given this bound on their moment generating functions (or a similar bound on the characteristic functions), to what extent do the laws of $X_n$ and $X$ agree? A classical smoothing bound that makes the characteristic-function route quantitative is sketched after the references; there also seems to be an interesting discussion of this question in the statistics literature.

  • McCullagh, Peter. "Does the moment-generating function characterize a distribution?" The American Statistician 48.3 (1994): 208.
  • Waller, Lance A. "Does the characteristic function numerically distinguish distributions?" The American Statistician 49.2 (1995): 150-152.
  • Luceño, Alberto. "Further evidence supporting the numerical usefulness of characteristic functions." The American Statistician 51.3 (1997): 233-234.
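One classical way to convert closeness of characteristic functions into closeness of laws (a sketch, up to the exact value of the absolute constant) is Esseen's smoothing inequality: if $\varphi_n$ and $\varphi$ denote the characteristic functions of $X_n$ and $X$, and the limit law has a density bounded by $m$ (here $m=1/\sqrt{2\pi}$ for the standard Gaussian), then for every $T>0$ $$\sup_x\bigl|\mathbf P(X_n\le x)-\mathbf P(X\le x)\bigr|\;\le\;\frac{1}{\pi}\int_{-T}^{T}\left|\frac{\varphi_n(t)-\varphi(t)}{t}\right|\,dt\;+\;\frac{24\,m}{\pi T}.$$ Since the $k=0$ terms cancel, the analogous characteristic-function bound is $O(|t|)$ near $t=0$, so the integrand stays bounded there; optimizing over $T$ then yields an explicit, though slowly decaying, rate in the Kolmogorov distance.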