[Math] Sample from multivariate normal distribution with given positive-semidefinite covariance matrix

matricesnormal distributionprobability distributions

I want to draw a random vector from a multivariate normal distribution with given covariance matrix $Σ$.

I'm following this algorithm:

A widely used method for drawing a random vector $x$ from the $N$-dimensional multivariate normal distribution with mean vector $μ$ and covariance matrix $Σ$ works as follows:

  1. Find any real matrix $A$ such that $AA^T = Σ$. When $Σ$ is positive-definite (…)

But my covariance matrix is not positive-definite. It's positive-semidefinite. So, I can't use Cholesky decomposition. Further there is written:

An alternative is to use the matrix $A = UΛ^{\frac{1}{2}}$ obtained from a spectral decomposition $Σ = UΛU^T$ of $Σ$.

So, I have to use spectral decomposition? There is no easier way to draw such a random vector?


For such $Σ$:

 0.666 -0.333
-0.333  0.666 

Cholesky decompososition gives such $A$ (using this Online Matrix Calculator):

 0.816  0.000
-0.408  0.707

Spectral decomposition, eigenvectors real values $U$:

 0.707  0.707
-0.707  0.707

Eigenvalues $Λ$:

 0.999  0.000
 0.000  0.333

Fortunately square root of diagonal matrix is easy 🙂 $Λ^{\frac{1}{2}}$:

 0.999  0.000
 0.000  0.577

$A=UΛ^{\frac{1}{2}}$:

 0.707  0.407
-0.707 0.407

So, $A$ returned by spectral decomposition is different than this returned by Cholesky. Why? Is this correct?

Best Answer

They both seem to work quite well. They give slightly different results for the estimated covariance matrices of the generated series, but I wouldn't be surprised if it's due to rounding errors somewhere in the computations.

Below is some R code which generates samples from $N(0, \boldsymbol{\Sigma})$.

n <- 10000000

X <- cbind(rnorm(n), rnorm(n))

sigma <- t(matrix(c(0.666, -0.333, -0.333, 0.666), nrow=2))
spectral <- eigen(sigma)

X.spectral <- t(spectral$vectors %*% sqrt(diag(spectral$values)) %*% t(X))
X.cholesky <- t(t(chol(sigma)) %*% t(X))
cov(X.spectral)
cov(X.cholesky)

So with my 10,000,000 samples, the covariance matrices are

> cov(X.spectral)
           [,1]       [,2]
[1,]  0.6660626 -0.3331138
[2,] -0.3331138  0.6658130
> cov(X.cholesky)
           [,1]       [,2]
[1,]  0.6660344 -0.3328923
[2,] -0.3328923  0.6656198