Understanding the $\otimes$ notation for a multivariate Gaussian distribution

kronecker product, linear algebra, normal distribution, probability distributions

I am confused about the $\otimes$ notation for the multivariate Gaussian distribution. For an $n \times n$ covariance matrix $K=(K_{s,r})_{1\le s, r \le n}$, we write
$$(Z_1, Z_2, \dots, Z_n)\sim \mathcal{N}(0, K \otimes I_m)$$
to mean that $Z_1, Z_2, \dots, Z_n$ is a collection of centered, jointly Gaussian random vectors in $\mathbb{R}^m$, with covariances $\mathbb{E}(Z_sZ_r^T)=K_{s,r}I_m$.

Q1: What is $\mathcal{N}(0, K \otimes I_m)$? I remember that we always use the notation $\mathcal{N}(\mu, \Sigma)$ to represent a multivariate Gaussian distribution with mean vector $\mu$ and covariance $\Sigma$. Is there any difference?

I know that $K \otimes I_m$ is

$$
\begin{bmatrix}
K_{11} & K_{12} & K_{13} & \dots & K_{1n} \\
K_{21} & K_{22} & K_{23} & \dots & K_{2n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
K_{n1} & K_{n2} & K_{n3} & \dots & K_{nn}
\end{bmatrix} \otimes I_m$$

Q2: Why is $\mathbb{E}(Z_sZ_r^T)=K_{s,r}I_m$?

Best Answer

By definition, the covariance between the random vectors $Z_s$ and $Z_r$ is

$$\operatorname{Cov}(Z_s,Z_r)=E[(Z_s-E(Z_s))(Z_r-E(Z_r))^T]$$

The above is an $m\times m$ matrix since each $Z_i$ is an $m\times 1$ vector. Because each $Z_i$ is centered, its mean vector is $E(Z_i)=0$, so $\operatorname{Cov}(Z_s,Z_r)=E[Z_sZ_r^T]$, and the stated covariance structure gives

$$E[Z_sZ_r^T]=K_{s,r}I_m$$
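
Entrywise, writing $Z_{s,i}$ for the $i$-th coordinate of $Z_s$ (a notation introduced here for illustration, not used in the question), this reads

$$E[Z_{s,i}Z_{r,j}]=K_{s,r}\,\delta_{ij}, \qquad 1\le i,j\le m,$$

where $\delta_{ij}$ is the Kronecker delta. In words: different coordinates are uncorrelated, and the $i$-th coordinates of $Z_s$ and $Z_r$ have covariance $K_{s,r}$ for every $i$.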

The dispersion (covariance) matrix of $(Z_1,Z_2,\ldots,Z_n)$, viewed as a single stacked vector in $\mathbb{R}^{mn}$, is then the $mn\times mn$ block matrix $\Sigma=(K_{s,r}I_m)_{1\le s,r\le n}$:

$$\Sigma=\begin{bmatrix} K_{11}I_m & K_{12}I_m & \cdots & K_{1n}I_m \\ K_{21}I_m & K_{22}I_m & \cdots & K_{2n}I_m \\ \vdots & \vdots & \ddots & \vdots \\ K_{n1}I_m & K_{n2}I_m & \cdots & K_{nn}I_m \end{bmatrix} $$

This is exactly the Kronecker product of the matrices $K=(K_{s,r})_{1\le s,r\le n}$ and $I_m$, written as

$$\Sigma=K \otimes I_m$$
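
So, regarding Q1, this is the usual $\mathcal{N}(\mu,\Sigma)$ with $\mu=0$ and $\Sigma=K\otimes I_m$, applied to the stacked vector. If it helps, here is a rough numerical check in Python with NumPy. The sizes `n`, `m`, the random choice of `K`, and the sample count `N` are arbitrary assumptions for illustration, not anything from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n vectors Z_1..Z_n, each in R^m.
n, m = 3, 2

# An arbitrary symmetric positive definite K (assumption, for illustration).
A = rng.standard_normal((n, n))
K = A @ A.T + n * np.eye(n)

# Sigma = K (x) I_m is an (n*m) x (n*m) matrix.
Sigma = np.kron(K, np.eye(m))

# Block (s, r) of Sigma should equal K[s, r] * I_m.
for s in range(n):
    for r in range(n):
        block = Sigma[s * m:(s + 1) * m, r * m:(r + 1) * m]
        assert np.allclose(block, K[s, r] * np.eye(m))

# Draw the stacked vector (Z_1, ..., Z_n) and split it back into n vectors.
N = 200_000
samples = rng.multivariate_normal(np.zeros(n * m), Sigma, size=N)
Z = samples.reshape(N, n, m)  # Z[k, s] is the k-th draw of Z_{s+1}

# Empirical E[Z_s Z_r^T] for one pair (0-based indices), compared with K[s, r] * I_m.
s, r = 0, 2
emp = np.einsum("ki,kj->ij", Z[:, s, :], Z[:, r, :]) / N
max_err = np.max(np.abs(emp - K[s, r] * np.eye(m)))

print("empirical E[Z_s Z_r^T]:\n", emp)
print("K[s, r] * I_m:\n", K[s, r] * np.eye(m))
print("max abs deviation:", max_err)
```

The empirical matrix should be close to $K_{s,r}I_m$ up to Monte Carlo error, which shrinks roughly like $1/\sqrt{N}$.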
