Decomposition of Total Variation Distance

conditional-probability, probability-theory, total-variation

Let $X$, $Y$, and $Z$ be three random variables defined on a common probability space.
I am interested in the total variation distance between the joint distributions of $(X,Z)$ and $(Y,Z)$, i.e.
\begin{equation}
d_{TV} \left( (X, Z) , (Y, Z) \right)
\tag{$\dagger$}
\end{equation}

I am looking for a "law of total probability" for the total variation distance: a way to decompose the distance above in terms of the total variation distances between the conditional distributions $X \mid Z = z$ and $Y \mid Z = z$.
In particular, define the function $h: \mathbb{R} \to \mathbb{R}$ as
$$
h(z) = d_{TV} \left( X \mid Z = z , Y \mid Z = z \right)
\enspace.
$$

Is there a "law of total probability" that decomposes the total variation distance $(\dagger)$ in terms of the conditional total variation distance function $h(z)$ and the distribution of $Z$? For example, is it true that
$$
d_{TV} \left( (X, Z) , (Y, Z) \right)
= \mathbb{E} \left[ h(Z) \right] \, ?
$$

Best Answer

Yes, there is. For simplicity, I focus on the discrete case, but the argument generalizes to continuous distributions (under suitable measurability assumptions, e.g. the existence of regular conditional distributions).

We have
$$
\begin{align*}
\mathrm{d}_{\rm TV}((X,Z),(Y,Z))
&= \frac{1}{2}\sum_{v,z} \left|\Pr[ X=v, Z=z ]-\Pr[ Y=v, Z=z ]\right| \\
&= \frac{1}{2}\sum_{v,z} \left|\Pr[ X=v \mid Z=z ]-\Pr[ Y=v \mid Z=z ]\right|\cdot \Pr[Z=z] \\
&= \sum_z \Pr[Z=z] \sum_{v} \frac{1}{2}\left|\Pr[ X=v \mid Z=z ]-\Pr[ Y=v \mid Z=z ]\right| \\
&= \sum_z \Pr[Z=z]\, \mathrm{d}_{\rm TV}(X\mid Z=z,\, Y\mid Z=z) \\
&= \mathbb{E}_Z[h(Z)]
\end{align*}
$$
The second equality factors each joint probability as $\Pr[\,\cdot \mid Z=z\,]\Pr[Z=z]$; since both joint distributions share the same marginal of $Z$, the common factor $\Pr[Z=z]$ comes out of the absolute value. Here $h\colon \mathbb{R}\to\mathbb{R}$ is defined as in your post, $h(z) = \mathrm{d}_{\rm TV}(X\mid Z=z,\, Y\mid Z=z)$.
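In the general (not necessarily discrete) setting, assuming regular conditional distributions of $X$ and $Y$ given $Z$ exist, the same computation should give
$$
\mathrm{d}_{\rm TV}((X,Z),(Y,Z)) = \int h(z)\, \mathbb{P}_Z(\mathrm{d}z) = \mathbb{E}[h(Z)]
\enspace.
$$

As a quick sanity check of the discrete identity, here is a minimal numerical sketch. The distributions below are randomly generated toy examples (not from your problem); the only structural assumption is the one in your question, namely that both joints are built with the same marginal of $Z$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete example: X, Y take n_v values, Z takes n_z values.
n_v, n_z = 4, 3
p_z = rng.dirichlet(np.ones(n_z))                    # marginal of Z (shared by both joints)
p_x_given_z = rng.dirichlet(np.ones(n_v), size=n_z)  # row z: Pr[X = . | Z = z]
p_y_given_z = rng.dirichlet(np.ones(n_v), size=n_z)  # row z: Pr[Y = . | Z = z]

# Joint distributions: p_xz[v, z] = Pr[X = v, Z = z], p_yz[v, z] = Pr[Y = v, Z = z].
p_xz = (p_x_given_z * p_z[:, None]).T
p_yz = (p_y_given_z * p_z[:, None]).T

# Left-hand side: TV distance between the two joint distributions.
tv_joint = 0.5 * np.abs(p_xz - p_yz).sum()

# Right-hand side: E_Z[h(Z)], with h(z) the conditional TV distance given Z = z.
h = 0.5 * np.abs(p_x_given_z - p_y_given_z).sum(axis=1)
tv_expected = (p_z * h).sum()

print(tv_joint, tv_expected)   # the two numbers agree
assert np.isclose(tv_joint, tv_expected)
```

The agreement hinges on both joints sharing the marginal `p_z`, which is exactly what lets $\Pr[Z=z]$ factor out of the absolute value in the derivation above.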