MATLAB: Some insight on mvnrnd function

MATLABmvnrndrandom number generatorstatistics

Hi, I am trying to simulate a simple bivariate VAR(1) process with Gaussian errors and I use mvnrnd function to draw from a multivariate normal with mean [0;0] and variance matrix Σ.
I was creating my errors outside of my for loop, but I realised the two approaches below give completely different results:
mu = [0;0]; sgm = [1 0.7; 0.7 1]; iter = 100;
rng(1)
E = mvnrnd(mu, sgm, iter);
E_e = zeros(iter,2);
rng(1)
for i = 1:iter
E_e(i,:) = mvnrnd(mu, sgm);
end
Can you give me some insight on why these two are different – I have some ideas but cannot formalise my thoughts.

Best Answer

For once, someone asks a good question, that says they saw something they did not understand, and want to understand what happened. :)
(Don't feel bad, I had to think for a second myself about it.) So what did happen? The answer is to change what you did, to a simpler problem. You had:
mu = [0;0]; sgm = [1 0.7; 0.7 1];
But let me changes sgm to an identity.
sgm = eye(2);
rng(1)
mvnrnd(mu, sgm,3)
ans =
-0.64901 -1.1096
1.1812 -0.84555
-0.75845 -0.57266
So 3 sets of normal events, as rows of that matrix. A standard normal, so every one comes from a normal N(0,1) distribution.
But now lets generate them one set at a time. Look carefully at the sequence.
>> rng(1)
>> mvnrnd(mu, sgm)
ans =
-0.64901 1.1812
>> mvnrnd(mu, sgm)
ans =
-0.75845 -1.1096
>> mvnrnd(mu, sgm)
ans =
-0.84555 -0.57266
Do you see that MATLAB generates the same set of numbers, but in a different sequence.
>> rng(1)
>> mvnrnd(0,1,6)
ans =
-0.64901
1.1812
-0.75845
-1.1096
-0.84555
-0.57266
The same 6 numbers. Now, think about how MATLAB stores numbers in an array. It goes down the columns.
>> reshape(1:6,[3,2])
ans =
1 4
2 5
3 6
So, when you generate the sets one pair at a time, it gives you the same sequence, but ONE pair at a time. Consider the difference between these next two calls, then look back at what you saw when you generated them one pair at a time.
>> rng(1)
>> reshape(mvnrnd(0,1,6),[3,2])
ans =
-0.64901 -1.1096
1.1812 -0.84555
-0.75845 -0.57266
>> rng(1)
>> reshape(mvnrnd(0,1,6),[2,3])'
ans =
-0.64901 1.1812
-0.75845 -1.1096
-0.84555 -0.57266
All well and good, but what did you do now? You changed sgm. This gets into how you generate a set of normal deviates with a non-identity covariance matrix. So how is that done? You generate a set of iid N(0,1) deviates, then you transform them with a matrix multiply. So effectively, mvnrnd(mu,sgm,N) calls randn, generating an array of the right size. Then it multiplies that Nx2 matrix with a 2x2 matrix derived from the supplied covariance matrix. (The transformation is derived from a Cholesky factorization of your covariance matrix. I can go into more depth there, but I think it may only confuse things, rather than add information.)
Now does it make more sense? When you generate them one pair at a time, you generate them in a different sequence.