Multivariate Normal Distribution – Generating Values from It

algorithmsMATLABmultivariate normal distributionrandom-generation

I am currently trying to simulate values of a $N$-dimensional random variable $X$ that has a multivariate normal distribution with mean vector $\mu = (\mu_1,…,\mu_N)^T$ and covariance matrix $S$.

I am hoping to use a procedure similar to the inverse CDF method, meaning that I want to first generate a $N$-dimensional uniform random variable $U$ and then plug that into the inverse CDF of this distribution, so to generate value $X$.

I am having issues because the procedure is not well documented and there are slight differences between the mvnrnd function in MATLAB and a description that I found on Wikipedia.

In my case, I am also choosing the parameters of the distribution randomly. In particular, I generate each of the means, $\mu_i$, from a uniform distribution $U(20,40)$. I then build the covariance matrix $S$ using the following procedure:

  1. Create a lower triangular matrix $L$ where $L(i,i) = 1$ for $i=1..N$ and $L(i,j) = U(-1,1)$ for $i < j$

  2. Let $S = LL^T$ where $L^T$ denotes the transpose of $L$.

This procedure allows me to ensure that $S$ is symmetric and positive definite. It also provides a lower triangular matrix $L$ so that $S = LL^T$, which I believe is required to generate values from the distribution.

Using the guidelines on Wikipedia, I should be able to generate values of $X$ using a $N$-dimensional uniform as follows:

  • $X = \mu + L * \Phi^{-1}(U)$

According to the MATLAB function however, this is typically done as:

  • $X = \mu + L^T * \Phi^{-1}(U)$

Where $\Phi^{-1}$ is the inverse CDF of a $N$-dimensional, separable, normal distribution, and the only difference between both methods is simply whether to use $L$ or $L^T$.

Is MATLAB or Wikipedia the way to go? Or are both wrong?

Best Answer

If $X \sim \mathcal{N}(0,I)$ is a column vector of standard normal RV's, then if you set $Y = L X$, the covariance of $Y$ is $L L^T$.

I think the problem you're having may arise from the fact that matlab's mvnrnd function returns row vectors as samples, even if you specify the mean as a column vector. e.g.,

 > size(mvnrnd(ones(10,1),eye(10))  
 > ans =
 >      1    10

And note that transforming a row vector gives you the opposite formula. if $X$ is a row vector, then $Z = X L^T$ is also a row vector, so $Z^T = L X^T$ is a column vector, and the covariance of $Z^T$ can be written $E[Z^TZ] = LL^T$.

Based on what you wrote though, the Wikipedia formula is correct: if $\Phi^{-1}(U)$ were a row vector returned by matlab, you can't left-multiply it by $L^T$. (But right-multiplying by $L^T$ would give you a sample with the same covariance of $LL^T$).