Solved – Definition of independence of two random vectors and how to show it in the jointly normal case

independence, multivariate normal distribution

(1) What is the definition of independence between two random vectors $\mathbf X$ and $\mathbf Y$? I think it's more than just pairwise independence between all the elements $X_i$ and $Y_j$.

(2) The interpretation of having the covariance matrix between two random vectors equal to zero is that the elements of the vectors have pairwise covariance zero, because the $ij^{th}$ element of the covariance matrix is the covariance between $X_i$ and $Y_j$. In the joint normal setting, this is the same as saying the elements of the vectors are pairwise independent. How come this is enough to conclude that the two vectors are completely independent, i.e. not just pairwise independent?

Best Answer

(1) What is the definition of independence between two random vectors $\mathbf X$ and $\mathbf Y$?

The definition of independence between two random vectors is the same as that between two ordinary random variables: Random vectors $\mathbf{x}$ and $\mathbf{y}$ are independent if and only if their joint distribution is equal to the product of their marginal distributions. That is:

$$p(x, y) = p(x) p(y)$$

Or, to write things explicitly in terms of the individual elements of each vector, let $\mathbf{x} = [x_1, \dots, x_n]^T$ and $\mathbf{y} = [y_1, \dots, y_m]^T$. Then $\mathbf{x}$ and $\mathbf{y}$ are independent if and only if:

$$p(x_1, \dots, x_n, y_1, \dots, y_m) = p(x_1, \dots, x_n) p(y_1, \dots, y_m)$$

Note that the elements of $\mathbf{x}$ may depend on each other, and likewise for $\mathbf{y}$. What the definition rules out is any dependence between the vector $\mathbf{x}$ as a whole and the vector $\mathbf{y}$ as a whole, which is a stronger requirement than pairwise independence of individual $X_i$ and $Y_j$.
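To make the "as a whole" point concrete, here is a minimal Monte Carlo sketch (my own toy setup, not part of the answer): for an independent pair of random vectors, $P(\mathbf{x} \in A, \mathbf{y} \in B)$ should match $P(\mathbf{x} \in A)\,P(\mathbf{y} \in B)$ for events $A$ and $B$ involving the whole vectors, not just single elements. All means, covariances, and events below are made-up example values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

# x: 2-dim Gaussian with correlated elements; y: 2-dim, drawn independently of x.
x = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=n)
y = rng.multivariate_normal([1, -1], [[0.5, 0.0], [0.0, 2.0]], size=n)

in_A = (x[:, 0] + x[:, 1]) > 0.5        # an event about the whole vector x
in_B = (y[:, 0]**2 + y[:, 1]**2) < 3.0  # an event about the whole vector y

# Joint frequency vs product of marginal frequencies (agree up to sampling noise).
print(np.mean(in_A & in_B), np.mean(in_A) * np.mean(in_B))
```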

(2) The interpretation of having the covariance matrix between two random vectors equal to zero is that the elements of the vectors have pairwise covariance zero, because the $ij^{th}$ element of the covariance matrix is the covariance between $X_i$ and $Y_j$. In the joint normal setting, this is the same as saying the elements of the vectors are pairwise independent. How come this is enough to conclude that the two vectors are completely independent, i.e. not just pairwise independent?

This follows from the particular form of the Gaussian distribution. Suppose random vectors $\mathbf{x} \sim \mathcal{N}(\mu_x, C_x)$ and $\mathbf{y} \sim \mathcal{N}(\mu_y, C_y)$ are jointly Gaussian, and $\text{cov}(x_i, y_j) = 0$ for all $i,j$. Then $\mathbf{x}$ and $\mathbf{y}$ are independent because their joint distribution factors into the product of their marginal distributions.
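Before the formal proof, a sampling sketch of the claim (assumed example parameters, not from the answer): draw $\mathbf{z} = [\mathbf{x}; \mathbf{y}]$ from a joint Gaussian whose cross-covariance block is zero, then check that even a *nonlinear* function of $\mathbf{x}$ is uncorrelated with a nonlinear function of $\mathbf{y}$, something zero covariance alone would not guarantee outside the Gaussian family.

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(42)

C_x = np.array([[2.0, 0.8], [0.8, 1.0]])    # x-elements may depend on each other
C_y = np.array([[1.5, -0.4], [-0.4, 0.5]])  # y-elements may depend on each other
C = block_diag(C_x, C_y)                    # zero cross-covariance between x and y

z = rng.multivariate_normal(mean=np.zeros(4), cov=C, size=200_000)
x, y = z[:, :2], z[:, 2:]

f = np.sin(x[:, 0] * x[:, 1])      # arbitrary nonlinear summary of x
g = np.abs(y[:, 0]) + y[:, 1]**2   # arbitrary nonlinear summary of y

# For independent vectors, E[f(x) g(y)] = E[f(x)] E[g(y)]; the Monte Carlo
# estimates should agree up to sampling noise.
print(np.mean(f * g), np.mean(f) * np.mean(g))
```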

Proof

We can write the joint distribution of $\mathbf{x}$ and $\mathbf{y}$ by concatenating them to form random vector $\mathbf{z} = \left[ \begin{matrix} \mathbf{x} \\ \mathbf{y} \end{matrix} \right]$. $\mathbf{z}$ has a Gaussian distribution with mean $\mu = \left[ \begin{matrix} \mu_x \\ \mu_y \end{matrix} \right]$. Because the covariance between all entries of $\mathbf{x}$ and $\mathbf{y}$ is zero, $\mathbf{z}$ has a block diagonal covariance matrix $C = \left[ \begin{matrix} C_x & \mathbf{0} \\ \mathbf{0} & C_y \\ \end{matrix} \right ]$. The joint density of $\mathbf{x}$ and $\mathbf{y}$ is equal to the density of $\mathbf{z}$:

$$p(x, y \mid \mu_x, \mu_y, C_x, C_y) = p(z \mid \mu, C) = \text{det}(2 \pi C)^{-\frac{1}{2}} \exp \left[ -\frac{1}{2} (z-\mu)^T C^{-1} (z-\mu) \right]$$

Because $C$ is block diagonal, its inverse is $C^{-1} = \left[ \begin{matrix} C_x^{-1} & \mathbf{0} \\ \mathbf{0} & C_y^{-1} \\ \end{matrix} \right ]$ and we can write:

$$(z-\mu)^T C^{-1} (z-\mu) = (x-\mu_x)^T C_x^{-1} (x-\mu_x) + (y-\mu_y)^T C_y^{-1} (y-\mu_y)$$

As a further consequence of the block diagonal form of $C$, the determinant is:

$$\text{det}(C) = \text{det}(C_x) \ \text{det}(C_y)$$
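These block-diagonal identities are easy to check numerically. The following sketch (with assumed example matrices) verifies that the inverse of $C$ is block diagonal in the inverses, that the quadratic form splits into the $x$-part plus the $y$-part, and that the determinant is the product of the block determinants.

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(1)

C_x = np.array([[2.0, 0.3], [0.3, 1.0]])
C_y = np.array([[1.2, -0.2, 0.0], [-0.2, 0.8, 0.1], [0.0, 0.1, 0.6]])
C = block_diag(C_x, C_y)

# Inverse of the block-diagonal matrix is block diagonal in the inverses.
assert np.allclose(np.linalg.inv(C),
                   block_diag(np.linalg.inv(C_x), np.linalg.inv(C_y)))

# Quadratic form splits into the x-part and the y-part.
mu_x, mu_y = np.array([1.0, 0.0]), np.array([0.0, 2.0, -1.0])
x, y = rng.normal(size=2), rng.normal(size=3)
z, mu = np.concatenate([x, y]), np.concatenate([mu_x, mu_y])
lhs = (z - mu) @ np.linalg.inv(C) @ (z - mu)
rhs = ((x - mu_x) @ np.linalg.inv(C_x) @ (x - mu_x)
       + (y - mu_y) @ np.linalg.inv(C_y) @ (y - mu_y))
assert np.isclose(lhs, rhs)

# Determinant factorizes over the blocks.
assert np.isclose(np.linalg.det(C), np.linalg.det(C_x) * np.linalg.det(C_y))
print("block-diagonal identities verified")
```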

Using these facts and a little algebra, we can re-write the joint distribution as:

$$\begin{aligned}
p(x, y \mid \mu_x, \mu_y, C_x, C_y) ={} & \text{det}(2 \pi C_x)^{-\frac{1}{2}} \exp \left[ -\frac{1}{2} (x-\mu_x)^T C_x^{-1} (x-\mu_x) \right] \\
& \times \text{det}(2 \pi C_y)^{-\frac{1}{2}} \exp \left[ -\frac{1}{2} (y-\mu_y)^T C_y^{-1} (y-\mu_y) \right]
\end{aligned}$$

This is exactly the product of the Gaussian marginal densities of $\mathbf{x}$ and $\mathbf{y}$, i.e. $p(x, y) = p(x)\,p(y)$, so $\mathbf{x}$ and $\mathbf{y}$ are independent.
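To see the final factorization numerically (a sketch with made-up parameters): evaluate the joint Gaussian density of $\mathbf{z} = [\mathbf{x}; \mathbf{y}]$ with block-diagonal covariance and compare it against the product of the two marginal Gaussian densities.

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.linalg import block_diag

mu_x, C_x = np.array([1.0, -1.0]), np.array([[2.0, 0.5], [0.5, 1.0]])
mu_y, C_y = np.array([0.0, 3.0]), np.array([[1.0, -0.3], [-0.3, 0.8]])

# Joint distribution of z = [x; y] with zero cross-covariance blocks.
joint = multivariate_normal(np.concatenate([mu_x, mu_y]), block_diag(C_x, C_y))
rng = np.random.default_rng(7)

for _ in range(5):
    x, y = rng.normal(size=2), rng.normal(size=2)
    lhs = joint.pdf(np.concatenate([x, y]))                 # p(x, y)
    rhs = (multivariate_normal(mu_x, C_x).pdf(x)
           * multivariate_normal(mu_y, C_y).pdf(y))         # p(x) * p(y)
    assert np.isclose(lhs, rhs)
print("p(x, y) = p(x) p(y) at all test points")
```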
