Show that the $\chi^2$-distance between probability measures $\mu,\nu$ is equal to $\chi^2(\nu,\mu)=\sup_f\left|\int f\:{\rm d}(\nu-\mu)\right|^2$

chi squared, measure-theory, probability theory, signed-measures

Let $(E,\mathcal E)$ be a measurable space, $\mu$ and $\nu$ be probability measures on $(E,\mathcal E)$ and $$\chi^2(\nu,\mu):=\begin{cases}\displaystyle\mu\left|\frac{{\rm d}\nu}{{\rm d}\mu}-1\right|^2=\mu\left|\frac{{\rm d}\nu}{{\rm d}\mu}\right|^2-1&\text{, if }\nu\ll\mu\\\infty&\text{, otherwise}\end{cases}$$ denote the $\chi^2$-distance of $\mu$ and $\nu$.

I want to show that $$\chi^2(\nu,\mu)=\sup_f\left|\int f\:{\rm d}(\nu-\mu)\right|^2,\tag1$$ where the supremum is taken over all bounded $\mathcal E$-measurable $f:E\to\mathbb R$ with $\left\|f\right\|_{L^2(\mu)}\le1$.

The case $\nu\not\ll\mu$ is clear to me. So, assume $\nu\ll\mu$ and let $$\varrho:=\frac{{\rm d}\nu}{{\rm d}\mu}.$$ I think we need to distinguish the cases $\varrho\in L^2(\mu)$ and $\varrho\not\in L^2(\mu)$. If $\varrho\in L^2(\mu)$, then $${\chi^2(\nu,\mu)}^{\frac12}=\left\|\varrho-1\right\|_{L^2(\mu)}=\sup_{\substack{f\in L^2(\mu)\\\left\|f\right\|_{L^2(\mu)}\le1}}|\langle\varrho-1,f\rangle_{L^2(\mu)}|\tag2$$ as this is true for any Hilbert space.
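As a sanity check of $(2)$, here is a minimal Python sketch on a three-point space, where every function is bounded and the supremum is attained at $f^\ast=(\varrho-1)/\left\|\varrho-1\right\|_{L^2(\mu)}$. The weights `mu` and `nu` and the helper names `chi2`, `pairing` and `l2_norm` are made up for illustration and are not part of the question.

```python
import math
import random

def chi2(nu, mu):
    """Chi-square distance on a finite space, assuming mu[i] > 0 for all i."""
    return sum((n / m - 1.0) ** 2 * m for n, m in zip(nu, mu))

def pairing(f, nu, mu):
    """Integral of f against the signed measure nu - mu."""
    return sum(fi * (n - m) for fi, n, m in zip(f, nu, mu))

def l2_norm(f, mu):
    """Norm of f in L^2(mu)."""
    return math.sqrt(sum(fi * fi * m for fi, m in zip(f, mu)))

# Illustrative probability vectors (an assumption, chosen arbitrarily).
mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.3, 0.5]

# Candidate maximiser f* = (rho - 1) / ||rho - 1||, where rho = dnu/dmu.
rho_minus_1 = [n / m - 1.0 for n, m in zip(nu, mu)]
norm = l2_norm(rho_minus_1, mu)
f_star = [r / norm for r in rho_minus_1]
best = pairing(f_star, nu, mu) ** 2

# Random test functions on the unit sphere of L^2(mu) should never do better.
rng = random.Random(0)
max_excess = 0.0
for _ in range(1000):
    g = [rng.uniform(-1.0, 1.0) for _ in mu]
    g_norm = l2_norm(g, mu)
    g = [gi / g_norm for gi in g]
    max_excess = max(max_excess, pairing(g, nu, mu) ** 2 - chi2(nu, mu))
```

Here `best` agrees with `chi2(nu, mu)` up to rounding, and `max_excess` stays at zero up to rounding, as the Cauchy-Schwarz inequality predicts. On a finite space the boundedness restriction in $(1)$ is vacuous, so this only illustrates the Hilbert-space identity, not the density argument.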

How can we conclude from $(2)$? I guess we need to argue with the density of bounded $\mathcal E$-measurable $f:E\to\mathbb R$ in $L^2(\mu)$.

And how can we show the claim in the case $\varrho\not\in L^2(\mu)$, where we've clearly got $\chi^2(\nu,\mu)=\infty$?

Best Answer

First let's deal with the case $\rho \in L^2(\mu)$. Let $B$ be the unit ball in $L^2(\mu)$ and say $f \in B$ is in $B_b$ if and only if $f$ is additionally bounded.

Note that for bounded $f$ we have $\langle \rho - 1, f \rangle = \int f \, {\rm d}(\nu - \mu)$, so the right hand side of $(1)$ is the square of $\sup_{f \in B_b} | \langle \rho - 1, f \rangle |$. What remains to show once $(2)$ is established is therefore that $$\sup_{f \in B} | \langle \rho - 1, f \rangle | = \sup_{f \in B_b} | \langle \rho - 1, f \rangle |.$$ Clearly the left hand side is at least as big as the right hand side. Conversely, let $f \in B$. Define $$f_N = \begin{cases} f, & \text{if } |f| \leq N \\ N, & \text{otherwise} \end{cases}$$ Then $|f_N| \leq \min(|f|, N)$, so $f_N \in B_b$ for each $N \in \mathbb{N}$. Additionally, $f_N \to f$ pointwise and $|f_N - f|^2 \leq 4|f|^2 \in L^1(\mu)$, so the dominated convergence theorem (DCT) gives $f_N \to f$ in $L^2(\mu)$ as $N \to \infty$. Since $\rho - 1 \in L^2(\mu)$, the Cauchy-Schwarz inequality then implies that $\langle \rho - 1, f_N \rangle \to \langle \rho - 1, f \rangle$ as $N \to \infty$. Hence $|\langle \rho - 1, f \rangle| \leq \sup_{g \in B_b} |\langle \rho - 1, g \rangle|$, which proves the desired equality.

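Now we deal with the case $\rho \not \in L^2(\mu)$. Then $\chi^2(\nu, \mu) = \infty$, so we want to show that $\sup_{f \in B_b} |\langle \rho - 1, f \rangle| = \infty$. Since $|\langle 1, f \rangle| \leq \|f\|_{L^2(\mu)} \leq 1$ for every $f \in B$, this is equivalent to $$\sup_{f \in B_b} |\langle \rho, f \rangle | = \infty.$$ I will show the contrapositive instead: suppose that the supremum above is finite, and deduce that $\rho \in L^2(\mu)$.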

Let $B_b^+ = \{f \in B_b: f \geq 0\}$ and define $B^+$ analogously. Each of the following equalities is not too difficult to check. $$ \sup_{f \in B_b} | \langle \rho, f \rangle | = \sup_{f \in B_b^+} |\langle \rho, f \rangle| = \sup_{f \in B^+} | \langle \rho, f \rangle | = \sup_{f \in B} |\langle \rho, f \rangle | $$ For the second equality, you should use an argument using cut-offs like I did above. The difference is that this time DCT won't work since we don't know a priori that $\rho f \in L^1(\mu)$ for arbitrary $f \in B$. However, we've restricted attention to positive functions so the MCT will do the job.

One slight subtlety is that you need to prove that the integrals appearing in the $4$th supremum are well-defined. To do this, note that the first two equalities, combined with the assumption, imply that for $f \in B$, $|\langle \rho, |f| \rangle | < \infty$ so that $\rho f \in L^1(\mu)$.

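It is then a well known exercise in functional analysis to see that $C := \sup_{f \in B} |\langle \rho, f \rangle | < \infty$ implies that $\rho \in L^2(\mu)$. One way to see it: let $\rho_N = \min(\rho, N)$, which is bounded and not $\mu$-almost everywhere zero because $\int \rho \, {\rm d}\mu = 1$. Then $f = \rho_N / \|\rho_N\|_{L^2(\mu)}$ lies in $B$ and, since $\rho \geq \rho_N \geq 0$, $$C \geq \langle \rho, f \rangle \geq \langle \rho_N, f \rangle = \|\rho_N\|_{L^2(\mu)}.$$ Letting $N \to \infty$, the MCT gives $\|\rho\|_{L^2(\mu)} \leq C < \infty$.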
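To see the case $\rho \not\in L^2(\mu)$ concretely, the following Python sketch uses a hypothetical example that is not taken from the question: $E = \{1, 2, \dots\}$, $\mu(\{k\}) = 2^{-k}$ and $\rho(k) = (\sqrt 2 - 1)\,2^{k/2}$, so that $\int \rho \, {\rm d}\mu = 1$ but $\int \rho^2 \, {\rm d}\mu = \infty$. The test functions $f_K = \rho 1_{\{k \leq K\}} / \|\rho 1_{\{k \leq K\}}\|_{L^2(\mu)}$ are bounded with unit norm, and the pairing $\langle \rho - 1, f_K \rangle$ grows like $\sqrt K$.

```python
import math

# Normalising constant so that sum_k rho(k) * mu(k) = 1 (geometric series).
C = math.sqrt(2.0) - 1.0

def mu(k):
    """Mass of the atom k under the reference measure."""
    return 2.0 ** (-k)

def rho(k):
    """Density of nu with respect to mu; integrable but not square-integrable."""
    return C * 2.0 ** (k / 2.0)

def pairing_with_cutoff(K):
    """<rho - 1, f_K> for f_K = rho 1_{k<=K} / ||rho 1_{k<=K}||_{L^2(mu)}."""
    ks = range(1, K + 1)
    norm = math.sqrt(sum(rho(k) ** 2 * mu(k) for k in ks))
    f = {k: rho(k) / norm for k in ks}
    # f_K vanishes off {1, ..., K}, so the integral is a finite sum.
    return sum((rho(k) - 1.0) * f[k] * mu(k) for k in ks)

cutoffs = (10, 100, 400)
values = [pairing_with_cutoff(K) for K in cutoffs]
```

Each entry of `values` is at least $(\sqrt 2 - 1)\sqrt K - 1$, because $\langle \rho, f_K \rangle = (\sqrt 2 - 1)\sqrt K$ and $|\langle 1, f_K \rangle| \leq 1$. The supremum over bounded test functions is therefore infinite in this example, matching $\chi^2(\nu, \mu) = \infty$.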
