$L^1$ convergence of kernel convolutions implies weak convergence of probability measures

measure-theory, probability, probability-theory, statistics, weak-convergence

I was reading a paper where they claim the following. Let $\mathscr{M}$ be the space of probability measures on a compact set $\mathfrak{X} \subset \mathbb{R}$. Let $\phi(\cdot - \theta)$ be the normal kernel with location $\theta$, and define

$$ f_P(x) = \int \phi (x - \theta) \, dP(\theta)$$

They claim the following:

The map $P \mapsto f_{P}$ is one-to-one, onto $\mathscr{F}$, where $\mathscr{F}=\left\{f_{P}: P \in \mathscr{M}\right\}$. Further $P_n \rightarrow P_0$ weakly if and only if $\left\|f_{P_n}-f_{P_0}\right\|_1 \rightarrow 0$.

For the part about injectivity I found a solution here: Is "Convolution operator" well-defined and injective?

Clearly, by the definition of $\mathscr{F}$, the map is also onto, so I agree with this part. For the direction $\Rightarrow$, the claim follows upon noticing that the Gaussian kernel in one variable is bounded and Lipschitz continuous (see Is the Gaussian density Lipschitz continuous?). Then an application of Portmanteau's Lemma yields, for any fixed $x$:

$$ \int \phi(x-\theta) \, dP_n(\theta) \rightarrow \int \phi(x-\theta) \, dP_0(\theta)$$

that is, $f_{P_n}(x) \rightarrow f_{P_0}(x)$ pointwise. Then, using again the boundedness of the kernels and the nonnegativity of our integrals, an application of the Dominated Convergence Theorem yields $\left\|f_{P_n}-f_{P_0}\right\|_1 \rightarrow 0$.

Now for the other direction, I am stuck. I tried to use Scheffé's Lemma and then Portmanteau's Lemma, via the classical characterization that $P_n \rightarrow P_0$ weakly iff the distribution functions satisfy $P_n(\theta) \rightarrow P_0 (\theta)$ at all continuity points $\theta$ of $P_0$. Indeed, from $\left\|f_{P_n}-f_{P_0}\right\|_1 \rightarrow 0$, the application of Scheffé's Lemma gives:

$$\int \int \phi(x-\theta) dP_n(\theta) dx \rightarrow \int \int \phi(x-\theta) dP_0(\theta) dx$$

Then one could use Fubini's Theorem and the fact that, for fixed $\theta$, $\int \phi(x-\theta) dx = 1$ since $\phi$ is a Gaussian kernel. However, both sides are then simply equal to $1$, so this does not yield the claim.

I also tried to argue by contradiction, but I did not manage to get anywhere useful. I also tried to use the result discussed here:
Is this theorem an extension of Scheffé Lemma

Maybe you will find it helpful.

Q: Do you have any idea on how to tackle this problem? Or any reference that I could consult?

Best Answer

Let $\Phi$ denote the standard normal distribution and $\phi$ its density (w.r.t. Lebesgue measure). For any probability measure $P$ on $\mathbb{R}$, $f_P(x)=\int \phi(x-y) P(dy)$ is the density function (w.r.t. Lebesgue measure) of the measure $\Phi*P$.

Write $f_n=f_{P_n}$. If $\|f_n-f_0\|_1\xrightarrow{n\rightarrow\infty}0$, then for any bounded continuous function $g$ on $\mathbb{R}$, $\Big|\int g(f_n-f_0)\,dx\Big|\leq\|g\|_\infty\|f_n-f_0\|_1\xrightarrow{n\rightarrow\infty}0$. This shows that $\Phi*P_n$ converges weakly to $\Phi*P_0$. Taking Fourier transforms, $\widehat{\Phi}\hat{P_n}$ converges to $\widehat{\Phi}\hat{P_0}$ pointwise and so, since $\widehat{\Phi}$ never vanishes, $\hat{P_n}$ converges to $\hat{P_0}$ pointwise. By Lévy's continuity theorem, this implies that $P_n$ converges to $P_0$ weakly.
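For reference, the Fourier step can be written out explicitly. The display below is a sketch that assumes the characteristic-function convention $\hat{P}(t)=\int e^{itx}\,P(dx)$ (this convention is an assumption, not stated in the answer); under it, $\widehat{\Phi}(t)=e^{-t^2/2}$ and the transform of a convolution is the product of the transforms:

$$\widehat{\Phi * P_n}(t)=e^{-t^2/2}\,\hat{P_n}(t)\xrightarrow{n\rightarrow\infty}e^{-t^2/2}\,\hat{P_0}(t)\quad\Longrightarrow\quad\hat{P_n}(t)=e^{t^2/2}\,\widehat{\Phi * P_n}(t)\xrightarrow{n\rightarrow\infty}\hat{P_0}(t)\qquad\text{for every } t\in\mathbb{R}.$$

The division is legitimate only because $e^{-t^2/2}>0$ for all $t$; a kernel whose Fourier transform has zeros would not allow this step.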

Conversely, if $P_n$ converges weakly to $P_0$, then, as $\phi$ is bounded and continuous, it follows that for every $x$ $$f_n(x)=\int \phi(x-y)P_n(dy)\xrightarrow{n\rightarrow\infty}\int \phi(x-y) P_0(dy)=f_0(x)$$ The functions $f_n$ and $f_0$ are the density functions of the measures $\Phi*P_n$ and $\Phi*P_0$ respectively, so they are nonnegative and integrate to $1$. Scheffé's theorem then implies that $\|f_n-f_0\|_1=\|\Phi*P_n-\Phi*P_0\|_{TV}\xrightarrow{n\rightarrow\infty}0$.
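As a numerical sanity check of this direction (not a proof), here is a minimal sketch in Python. It assumes a hypothetical example that does not come from the paper: $P_0=\tfrac12(\delta_{-1/2}+\delta_{1/2})$ and $P_n$ obtained by shifting both atoms by $1/n$, so that $P_n\to P_0$ weakly. The helper names `phi`, `mixture_density` and `l1_distance` are made up for illustration, and the $L^1$ norm is approximated by a midpoint rule on a truncated interval.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def mixture_density(x, atoms, weights):
    """f_P(x) = sum_j w_j * phi(x - theta_j) for a discrete measure P."""
    return sum(w * phi(x - t) for t, w in zip(atoms, weights))

def l1_distance(atoms_a, weights_a, atoms_b, weights_b,
                lo=-12.0, hi=12.0, steps=24000):
    """Midpoint-rule approximation of ||f_A - f_B||_1 on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += abs(mixture_density(x, atoms_a, weights_a)
                     - mixture_density(x, atoms_b, weights_b))
    return total * h

# Hypothetical example: P_0 has two atoms, P_n shifts both atoms by 1/n.
p0_atoms, p0_weights = [-0.5, 0.5], [0.5, 0.5]
distances = []
for n in (1, 2, 4, 8, 16, 32):
    pn_atoms = [t + 1.0 / n for t in p0_atoms]
    distances.append(l1_distance(pn_atoms, p0_weights, p0_atoms, p0_weights))
    print(n, distances[-1])
```

The printed distances should shrink as $n$ grows, in line with Scheffé's theorem. For a single atom the same routine can be compared with the closed form $\|\phi(\cdot-a)-\phi\|_1=2\,\operatorname{erf}\!\big(a/(2\sqrt{2})\big)$.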
