[Math] Simulating from a Multivariate Gaussian without Cholesky

monte carlonumerical linear algebranumerical methodsprobability distributions

I'd like to draw a sample from a multivariate Gaussian distribution $\mathcal{N}(\mu, \Sigma)$, where $\mu$ is the mean vector (can assume it to be $\boldsymbol{0}$), and $\Sigma$ is a sparse positive definite covariance matrix. The standard way to draw a sample is to compute the Cholesky decomposition $\Sigma = L L^T$, draw a standard multivariate random normal $y \sim \mathcal{N}(\boldsymbol{0},I)$, then compute $x = Ly$ using e.g. back-substitution.

My question is as follows: I have a highly efficient black-box procedure for computing $\Sigma^{-1} y$ for arbitrary vectors $y$, but I do not have an efficient enough procedure for computing a Cholesky decomposition. Is it possible for me to efficiently draw a sample from this distribution? How?

Best Answer

Have you tried using PCA?

Instead of Cholesky factorisation, why not use PCA (principle component analysis, a.k.a. Eigenvalue decomposition)? If the matrix you have is sparse, then this may suggest that the covariance is well reconstructed (or approximated) using only a few of the dominant Eigenvalues. It is worth noting that if you can easily compute $\Sigma^{-1}y$ then you could easily use the power method with spectral shifts to find the smallest Eigenvalues of $\Sigma^{-1}$, which as they are real and positive (as $\Sigma$ is SPD and the Eigenvalues of $\Sigma$ are the reciprocals of $\Sigma^{-1}$). You can either find all of the Eigenvalues, or likely get away with just the most important few. Also, most numerical libraries and scientific software has extremely fast and/or robust routines for performing the Eigenvalue decomposition (cf. SVD).

If you're doing Monte Carlo, then why is Cholseky the rate limiting step?

It's unclear from the question why the Cholesky is the rate limiting step. You only need to compute the decomposition once and then just generate loads of samples. If it is the matrix multiplications that are taking alot of time, then again a truncated Eigenvalue decomposition might help. However, I can't think why this is the bottle neck in a MC application.

Related Solutions

[Math] Sample from multivariate normal distribution with given positive-semidefinite covariance matrix

They both seem to work quite well. They give slightly different results for the estimated covariance matrices of the generated series, but I wouldn't be surprised if it's due to rounding errors somewhere in the computations.

Below is some R code which generates samples from $N(0, \boldsymbol{\Sigma})$.

n <- 10000000

X <- cbind(rnorm(n), rnorm(n))

sigma <- t(matrix(c(0.666, -0.333, -0.333, 0.666), nrow=2))
spectral <- eigen(sigma)

X.spectral <- t(spectral$vectors %*% sqrt(diag(spectral$values)) %*% t(X))
X.cholesky <- t(t(chol(sigma)) %*% t(X))
cov(X.spectral)
cov(X.cholesky)

So with my 10,000,000 samples, the covariance matrices are

> cov(X.spectral)
           [,1]       [,2]
[1,]  0.6660626 -0.3331138
[2,] -0.3331138  0.6658130
> cov(X.cholesky)
           [,1]       [,2]
[1,]  0.6660344 -0.3328923
[2,] -0.3328923  0.6656198

[Math] Efficient computation of Cholesky decomposition during tridiagonal matrix inverse

Here is my matlab code

     function[L]=MyChol(A)

     [n,m]=size(A);
     L=eye(n);
     for k=1:n-1
           L(k,k)=sqrt(A(k,k));
           L(k+1:n,k)=(A(k+1:n,k))/L(k,k);
           A(k+1:n,k+1:n)=A(k+1:n,k+1:n)-L(k+1:n,k)*L(k+1:n,k)';
     end
     L(n,n)=sqrt(A(n,n));
     end

$A=LL^T$