[Math] Simulating from a Multivariate Gaussian without Cholesky

monte carlonumerical linear algebranumerical methodsprobability distributions

I'd like to draw a sample from a multivariate Gaussian distribution $\mathcal{N}(\mu, \Sigma)$, where $\mu$ is the mean vector (can assume it to be $\boldsymbol{0}$), and $\Sigma$ is a sparse positive definite covariance matrix. The standard way to draw a sample is to compute the Cholesky decomposition $\Sigma = L L^T$, draw a standard multivariate random normal $y \sim \mathcal{N}(\boldsymbol{0},I)$, then compute $x = Ly$ using e.g. back-substitution.

My question is as follows: I have a highly efficient black-box procedure for computing $\Sigma^{-1} y$ for arbitrary vectors $y$, but I do not have an efficient enough procedure for computing a Cholesky decomposition. Is it possible for me to efficiently draw a sample from this distribution? How?

Best Answer

Have you tried using PCA?

Instead of Cholesky factorisation, why not use PCA (principle component analysis, a.k.a. Eigenvalue decomposition)? If the matrix you have is sparse, then this may suggest that the covariance is well reconstructed (or approximated) using only a few of the dominant Eigenvalues. It is worth noting that if you can easily compute $\Sigma^{-1}y$ then you could easily use the power method with spectral shifts to find the smallest Eigenvalues of $\Sigma^{-1}$, which as they are real and positive (as $\Sigma$ is SPD and the Eigenvalues of $\Sigma$ are the reciprocals of $\Sigma^{-1}$). You can either find all of the Eigenvalues, or likely get away with just the most important few. Also, most numerical libraries and scientific software has extremely fast and/or robust routines for performing the Eigenvalue decomposition (cf. SVD).

If you're doing Monte Carlo, then why is Cholseky the rate limiting step?

It's unclear from the question why the Cholesky is the rate limiting step. You only need to compute the decomposition once and then just generate loads of samples. If it is the matrix multiplications that are taking alot of time, then again a truncated Eigenvalue decomposition might help. However, I can't think why this is the bottle neck in a MC application.