[Math] Understanding the matrix normal distribution

Tags: kronecker-product, normal-distribution, probability-distributions, probability-theory, random-matrices

A random $n \times p$ matrix $X$ is distributed according to a matrix-valued normal distribution iff $\mathrm{vec}(X) \sim \mathcal{N}_{np}(\mu, V \otimes U)$, where $\mathrm{vec}(X)$ stacks the columns of $X$, $\mu \in \mathbb{R}^{np}$ is a mean vector, the matrices $U \in \mathbb{R}^{n \times n}$ and $V \in \mathbb{R}^{p \times p}$ are positive definite, and $\otimes$ denotes the Kronecker product.
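To make the definition concrete, here is a minimal numerical sketch (not from the original post) that samples a mean-zero matrix normal exactly as defined: build $V \otimes U$, draw $\mathrm{vec}(X)$, and undo the column-stacking. The dimensions and the particular $U$, $V$ are arbitrary illustrative choices.

```python
# Sketch: sampling X via the definition vec(X) ~ N(vec(M), V ⊗ U).
# n, p, U, V, M are arbitrary choices for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, p = 3, 2

# Any positive definite U (n x n) and V (p x p) will do.
A = rng.standard_normal((n, n)); U = A @ A.T + n * np.eye(n)
B = rng.standard_normal((p, p)); V = B @ B.T + p * np.eye(p)
M = np.zeros((n, p))                      # mean matrix, so mu = vec(M) = 0

cov = np.kron(V, U)                       # covariance of vec(X), per the definition
vecX = rng.multivariate_normal(M.flatten(order="F"), cov)
X = vecX.reshape((n, p), order="F")       # undo the column-stacking vec
```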

My question is: why do we restrict the possible covariance matrices of $\mathrm{vec}(X)$ to those that are Kronecker products? I would have expected a random matrix to be matrix-normal distributed iff the joint distribution of its entries is normal. Can anyone give me an intuition for why this restriction is reasonable?

Best Answer

A mean-zero Gaussian random vector $v = (v_k)$ with covariance matrix $U$ can be represented as $v = U^{1/2} g$, where $g$ is a vector of i.i.d. $\mathcal N(0,1)$ random variables.
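A quick sketch of this representation (with an arbitrary positive definite $U$ chosen for illustration): draw many i.i.d. standard normal vectors, multiply by the symmetric square root $U^{1/2}$, and check that the empirical covariance is close to $U$.

```python
# Sketch of v = U^{1/2} g; U is an arbitrary positive definite matrix.
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)); U = A @ A.T + n * np.eye(n)

Uh = np.real(sqrtm(U))                    # symmetric square root U^{1/2}
g = rng.standard_normal((n, 100_000))     # columns are i.i.d. N(0, I) draws
v = Uh @ g                                # each column is one draw of U^{1/2} g

print(np.abs(np.cov(v) - U).max())        # small: empirical covariance ≈ U
```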

So it would seem reasonable to call an $n \times p$ random matrix "distributed according to a matrix-valued normal distribution" if it has a representation of the form $$ X = U^{1/2} G V^{1/2}, $$ where $G$ is an $n \times p$ matrix of i.i.d. $\mathcal N(0,1)$ random variables.

Let's check (remembering that $U^{1/2}$ and $V^{1/2}$ are symmetric, and that $E(G_{rs} G_{tu}) = \delta_{rt} \delta_{su}$): $$ E(X_{ij} X_{kl}) = E\Big(\sum_{r,s,t,u} U^{1/2}_{ir} G_{rs} V^{1/2}_{sj}\, U^{1/2}_{kt} G_{tu} V^{1/2}_{ul}\Big) \\ = \sum_{r,s} U^{1/2}_{ir} V^{1/2}_{sj}\, U^{1/2}_{kr} V^{1/2}_{sl} \\ = \sum_{r} U^{1/2}_{ir} U^{1/2}_{rk} \sum_s V^{1/2}_{js} V^{1/2}_{sl} \\ = U_{ik} V_{jl}. $$ Since $X_{ij}$ sits at position $(j-1)n + i$ of the column-stacked $\mathrm{vec}(X)$, this says exactly that $U_{ik} V_{jl} = (V \otimes U)_{(j,i),(l,k)}$, i.e. $\mathrm{Cov}(\mathrm{vec}(X)) = V \otimes U$, matching the definition. Yes, it works!
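One can also confirm this computation by Monte Carlo (again a sketch with arbitrary $U$, $V$): sample $X = U^{1/2} G V^{1/2}$ many times and compare the empirical covariance of $\mathrm{vec}(X)$ with $V \otimes U$.

```python
# Monte Carlo check: Cov(vec(X)) ≈ V ⊗ U for X = U^{1/2} G V^{1/2}.
# n, p, N, U, V are arbitrary illustrative choices.
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(2)
n, p, N = 3, 2, 200_000

A = rng.standard_normal((n, n)); U = A @ A.T + n * np.eye(n)
B = rng.standard_normal((p, p)); V = B @ B.T + p * np.eye(p)
Uh, Vh = np.real(sqrtm(U)), np.real(sqrtm(V))

G = rng.standard_normal((N, n, p))        # N independent matrices of i.i.d. N(0,1)
X = Uh @ G @ Vh                           # batched products X_k = U^{1/2} G_k V^{1/2}
vecX = X.transpose(0, 2, 1).reshape(N, n * p)  # row k = vec(X_k), column-stacked

emp = np.cov(vecX, rowvar=False)
print(np.abs(emp - np.kron(V, U)).max())  # small: Cov(vec(X)) = V ⊗ U
```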

Also, if we allowed the entries of $X$ to have an arbitrary covariance matrix, we would just have a Gaussian vector reshaped into a matrix; in other words, the rectangular structure of the matrix would be completely forgotten. The Kronecker structure is what preserves it: $U$ describes the covariance across rows and $V$ the covariance across columns.
