[Math] Understanding the matrix normal distribution

Tags: kronecker-product, normal-distribution, probability-distributions, probability-theory, random-matrices

A random $n \times p$ matrix $X$ is distributed according to a matrix-valued normal distribution iff $\mathrm{vec}(X) \sim \mathcal{N}_{np}(\mu, V \otimes U)$, where $\mathrm{vec}(X)$ stacks the columns of $X$, $\mu \in \mathbb{R}^{np}$ is a mean vector, the matrices $U \in \mathbb{R}^{n \times n}$ and $V \in \mathbb{R}^{p \times p}$ are positive definite, and $\otimes$ denotes the Kronecker product.
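To make the definition concrete, here is a minimal numerical sketch (not from the original post) that samples a mean-zero matrix normal exactly as defined: build $V \otimes U$, draw $\mathrm{vec}(X)$, and undo the column-stacking. The dimensions and the particular $U$, $V$ are arbitrary illustrative choices.

```python
# Sketch: sampling X via the definition vec(X) ~ N(vec(M), V ⊗ U).
# n, p, U, V, M are arbitrary choices for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, p = 3, 2

# Any positive definite U (n x n) and V (p x p) will do.
A = rng.standard_normal((n, n)); U = A @ A.T + n * np.eye(n)
B = rng.standard_normal((p, p)); V = B @ B.T + p * np.eye(p)
M = np.zeros((n, p))                      # mean matrix, so mu = vec(M) = 0

cov = np.kron(V, U)                       # covariance of vec(X), per the definition
vecX = rng.multivariate_normal(M.flatten(order="F"), cov)
X = vecX.reshape((n, p), order="F")       # undo the column-stacking vec
```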

My question is: why do we restrict the possible covariance matrices of $\mathrm{vec}(X)$ to those that are Kronecker products? I would have expected a random matrix to be matrix-normal distributed iff the joint distribution of its entries is normal. Can anyone give me an intuition for why this restriction is reasonable?

Best Answer

A mean-zero Gaussian random vector $v = (v_k)$ with covariance matrix $U$ can be represented as $v = U^{1/2} g$, where $g$ is a vector of i.i.d. $\mathcal N(0,1)$ random variables.
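A quick sketch of this representation (with an arbitrary positive definite $U$ chosen for illustration): draw many i.i.d. standard normal vectors, multiply by the symmetric square root $U^{1/2}$, and check that the empirical covariance is close to $U$.

```python
# Sketch of v = U^{1/2} g; U is an arbitrary positive definite matrix.
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)); U = A @ A.T + n * np.eye(n)

Uh = np.real(sqrtm(U))                    # symmetric square root U^{1/2}
g = rng.standard_normal((n, 100_000))     # columns are i.i.d. N(0, I) draws
v = Uh @ g                                # each column is one draw of U^{1/2} g

print(np.abs(np.cov(v) - U).max())        # small: empirical covariance ≈ U
```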

So it would seem reasonable to call an $n \times p$ random matrix "distributed according to a matrix-valued normal distribution" if it has a representation of the form $$ X = U^{1/2} G V^{1/2}, $$ where $G$ is an $n \times p$ matrix of i.i.d. $\mathcal N(0,1)$ random variables.

Let's check (remembering that $U^{1/2}$ and $V^{1/2}$ are symmetric, and that $E(G_{rs} G_{tu}) = \delta_{rt} \delta_{su}$): $$ E(X_{ij} X_{kl}) = E\Big(\sum_{r,s,t,u} U^{1/2}_{ir} G_{rs} V^{1/2}_{sj}\, U^{1/2}_{kt} G_{tu} V^{1/2}_{ul}\Big) \\ = \sum_{r,s} U^{1/2}_{ir} V^{1/2}_{sj}\, U^{1/2}_{kr} V^{1/2}_{sl} \\ = \sum_{r} U^{1/2}_{ir} U^{1/2}_{rk} \sum_s V^{1/2}_{js} V^{1/2}_{sl} \\ = U_{ik} V_{jl}. $$ Since $X_{ij}$ sits at position $(j-1)n + i$ of the column-stacked $\mathrm{vec}(X)$, this says exactly that $U_{ik} V_{jl} = (V \otimes U)_{(j,i),(l,k)}$, i.e. $\mathrm{Cov}(\mathrm{vec}(X)) = V \otimes U$, matching the definition. Yes, it works!
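One can also confirm this computation by Monte Carlo (again a sketch with arbitrary $U$, $V$): sample $X = U^{1/2} G V^{1/2}$ many times and compare the empirical covariance of $\mathrm{vec}(X)$ with $V \otimes U$.

```python
# Monte Carlo check: Cov(vec(X)) ≈ V ⊗ U for X = U^{1/2} G V^{1/2}.
# n, p, N, U, V are arbitrary illustrative choices.
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(2)
n, p, N = 3, 2, 200_000

A = rng.standard_normal((n, n)); U = A @ A.T + n * np.eye(n)
B = rng.standard_normal((p, p)); V = B @ B.T + p * np.eye(p)
Uh, Vh = np.real(sqrtm(U)), np.real(sqrtm(V))

G = rng.standard_normal((N, n, p))        # N independent matrices of i.i.d. N(0,1)
X = Uh @ G @ Vh                           # batched products X_k = U^{1/2} G_k V^{1/2}
vecX = X.transpose(0, 2, 1).reshape(N, n * p)  # row k = vec(X_k), column-stacked

emp = np.cov(vecX, rowvar=False)
print(np.abs(emp - np.kron(V, U)).max())  # small: Cov(vec(X)) = V ⊗ U
```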

Also, if we allowed the entries of $X$ to have an arbitrary covariance matrix, we would just have a Gaussian vector reshaped into a matrix; in other words, the rectangular structure of the matrix would be completely forgotten. The Kronecker structure is what preserves it: $U$ describes the covariance across rows and $V$ the covariance across columns.
