The multivariate normal distribution and its calculation

improper-integralsnormal distributionprobabilityprobability distributions

Let a random vector $X = (X_1, X_2,\ldots, X_p)^{\mathrm{T}}$ has the multivariate normal distribution with mean vector $\mu_1$ and covariance matrix $\Sigma_1 > 0$, i.e. $X \sim \mathcal{N}_p(\mu_1, \Sigma_1)$, and has the density $f_1(x)$, $x \in \mathbb{R}^p$.
Let a random vector $Y = (Y_1, Y_2,\ldots, Y_p)^{\mathrm{T}}$ has the multivariate normal distribution with mean vector $\mu_2$ and covariance matrix $\Sigma_2 > 0$, i.e. $Y \sim \mathcal{N}_p(\mu_2, \Sigma_2)$, and has the density $f_2(y)$, $y \in \mathbb{R}^p$.

I encountered
$$
D_p = \int_{\mathbb{R}^p}\int_{\mathbb{R}^p} (x – y)^{\mathrm{T}}(x – y) \, f_1(x)f_2(y) \, dy dx.
$$

Mathematica 12 showed me:
\begin{align*}
D_1
& = \int_{-\infty}^\infty\int_{-\infty}^\infty (x – y)^2 f_1(x) f_2(y) \, dydx \\
& = (\mu_1 – \mu_2)^2 + (\sigma_1^2 + \sigma_2^2),
\end{align*}

where
$$
f_1(x) = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\biggl[-\frac{1}{2}\left(\frac{x – \mu_1}{\sigma_1}\right)^2 \, \biggr],
\quad
f_2(y) = \frac{1}{\sqrt{2\pi}\,\sigma_2} \exp\biggl[-\frac{1}{2}\left(\frac{y – \mu_2}{\sigma_2}\right)^2 \, \biggr],\\
$$

\begin{align*}
D_2
& = \int_{-\infty}^\infty\int_{-\infty}^\infty\int_{-\infty}^\infty\int_{-\infty}^\infty
[(x_1 – x_2)^2 + (y_1 – y_2)^2 ] f_1(x_1, y_1) f_2(x_2, y_2) \, dy_2dy_1 dx_2dx_1 \\
& = (\mu_{11} – \mu_{21})^2 + (\mu_{12} – \mu_{22})^2
+ (\sigma_{11}^2 + \sigma_{12}^2) + (\sigma_{21}^2 + \sigma_{22}^2)\\
& = (\mu_1 – \mu_2)^{\mathrm{T}}(\mu_1 – \mu_2) + \mathrm{tr}(\Sigma_1) + \mathrm{tr}(\Sigma_2)
\end{align*}

where
$$
\mu_1 = (\mu_{11}, \mu_{12})^{\mathrm{T}}, \quad
\mu_2 = (\mu_{21}, \mu_{22})^{\mathrm{T}},
$$

$$
\Sigma_1 =
\begin{pmatrix}
\sigma_{11}^2 & \rho_1\sigma_{11}\sigma_{12} \\
\rho_1\sigma_{11}\sigma_{12} & \sigma_{12}^2\\
\end{pmatrix}, \quad
\Sigma_2 =
\begin{pmatrix}
\sigma_{21}^2 & \rho_2\sigma_{21}\sigma_{22} \\
\rho_2\sigma_{21}\sigma_{22} & \sigma_{22}^2\\
\end{pmatrix},
$$

$$
|\rho_1| \leq 1, \quad
|\rho_2| \leq 1.
$$

I am guessing
\begin{equation}
D_p = (\mu_1 – \mu_2)^{\mathrm{T}}(\mu_1 – \mu_2) + \mathrm{tr}(\Sigma_1) + \mathrm{tr}(\Sigma_2).
\tag1\label1
\end{equation}

How can I prove equation \eqref{1}?

Best Answer

Suppose that $X$ and $Y$ are independent. Put $c = \mu_1 - \mu_2$, $A = X-EX$, $B = Y-EY$. Hence $$D_p = E(X-Y)^{T}(X-Y) =E(X-EX - (Y- EY) + c)^{T}(X-EX - (Y-EY) +c) = E(A-B+c)^{T}(A-B+c).$$

We have $A \sim N_p(0, \Sigma_1)$ and $B \sim N_p(0, \Sigma_2)$. Hence $EA=EB=0$, $EA^{T}=EB^{T}=0$, $DA = \Sigma_1$, $EA^{T}A = tr(\Sigma_1), EB^{T}B = tr(\Sigma_2)$. Thus

$$D_p = E(A-B)^{T}(A-B) + E(A-B)^{T}c + Ec^{T}(A-B)+Ec^{T}c$$ and

$$D_p - c^{T}c = E(A-B)^{T}(A-B) = EA^{T}A - EB^{T}A - EA^{T}B + E B^{T}B = EA^{T}A +E B^{T}B $$ because $EB^{T}A = EB^{T} EA = 0$ and $EA^{T}B = EA^{T}EB = 0$ by independence.

It follows that

$$D_p = c^{T}c + tr(\Sigma_1) + tr(\Sigma_2) = (\mu_1 - \mu_2)^{T}(\mu_1 - \mu_2) + tr(\Sigma_1) + tr(\Sigma_2)$$ q.e.d.

Related Question