[Math] Sample points from a multivariate normal distribution using only the precision matrix

linear algebra

I have a problem where I can directly compute the (sparse) precision matrix (inverse of the covariance) of a multivariate normal distribution, but the covariance itself is not sparse and I don't want to invert things. I would like to sample points using the precision matrix. What's a fast way to do this?

I do know that the standard procedure is to compute the cholesky decomposition of the covariance, that is find lower triangular matrix L such that $LL' = \Sigma$ and then use a univariate generator to compute a vector of iid normal points $u$ so that finally the vector $z = \mu + Lu$ has the correct covariance. But that would require me computing the covariance AND then taking its Cholesky decomposition. Is there anything faster, using the fact that I have the precision matrix?

Best Answer

As is pointed out in the statement, any matrix decomposition $\mathbf{L}$ such that $\mathbf{LL}^\top = \boldsymbol{\Sigma}$ gives you a way to sample from the multivariate Gaussian distribution. Simply set $\boldsymbol{z} = \boldsymbol{\mu} + \mathbf{L}\boldsymbol{y}$, where $\boldsymbol{y}$ is a vector of independent univariate Gaussian variates and $\boldsymbol{\mu}$ your mean vector.

The matrix decomposition is not unique, so an efficient (and convenient way) to use the sparsity is to find the Cholesky decomposition of the precision matrix.

Starting with the positive-definite precision matrix $\boldsymbol{\Sigma}^{-1}$, compute its sparse Cholesky decomposition as $\boldsymbol{\Sigma}^{-1}=\mathbf{T}\mathbf{T}^\top$, where $\mathbf{T}$ is a lower triangular (sparse) matrix. The inverse of that Cholesky root, the lower triangular matrix $\mathbf{T}^{-1}$, can then be obtained by back-solving. There are dedicated algorithms to do these calculations efficiently using the sparse structure (these may require reordering).

Since $\mathbf{T}^{-\top} \mathbf{T}^{-1} = \boldsymbol{\Sigma}$, you can use the matrix $\mathbf{L} \equiv \mathbf{T}^{-\top}$ in your algorithm.