MATLAB: How to use mvnrnd to generate a normal distribution within +- 1 sigma rather than the whole range (+- 3 sigma)

1 sigmaMATLABmvnrndnormal distribution

Does anyone could help with the problem, how to use mvnrnd to generate a normal distribution within +- 1 sigma rather than the whole range (+- 3 sigma), any such option? Thank you.

Best Answer

It is not normal if you restrict the range. Period. So the result of any such operation will NOT be normally distributed. PERIOD. Must I say it again? If I say so three times, it must be true. Normal distributions extend to infinity in both directions.
Note that +/-3 sigma is NOT the entire range. If you think it is, then you are wrong. Period.
Sigh. Yes, I know that someone will try to explain how one might do so anyway. There are two simple tricks to do this.
1. Rejection. Sample from a normal distribution, then throw away any samples that are outside of the desired range. The result will have the desired properties. So to get n samples out the end, you will need to take more than n samples, originally since you lose some. It will still not be normal, but then you probably know that by now. If the rejection rate is relatively low, then you need not do too much oversampling, and this will be fairly efficient.
2. Use of the inverse normal. This allows you to do sampling with no rejection needed. Somewhat more mental effort is needed, but not a lot.
Lets assume that we want to generate 1000 points, in the range [-1,1].
How large of a set must we oversample to do rejection sampling?
diff(normcdf([-1 1]))
ans =
0.68269
So we will toss out roughly 31% of the samples as exceeding +/- 1 sigma. ROUGHLY. That means we need to sample ROUGHLY
n = 1000;
n*1/0.68269
ans =
1464.8
So something around 1500 samples will be needed. I could give a better estimate using a binomial distribution with p=0.68269.
Checking wikipedia, since my brain sometimes gets these things wrong ... The mean of a binomial is n*p, the variance is n*p*(1-p). This should give me an estimate...
Nhat = ceil(n/p)
Nhat =
1465
Nhat*p + [-3 3]*sqrt(Nhat*p*(1-p))
ans =
946.7 1053.6
So, if I sample 1465 points, then I would expect to see somewhere between 950 and 1050 points that fall inside the interval [-1,1]. Therefore 1600 pre-rejection points will almost always leave me safe.
mu = 0;
sigma = 1;
X0 = normrnd(mu,sigma,[1,1600]);
X0((X0 < -1) | (X0 > 1)) = []; % rejection step
[min(X0),max(X0)]
ans =
-0.99422 0.98885
numel(X0)
ans =
1088
Xrej = X0(1:n); % keep the first 1000 samples
So rejection is trivial to do.
Option 2, using the inverse normal is easy enough too, but it requires you to understand these methods.
mu = 0;
sigma = 1;
n = 1000;
normallimits = [-1 1];
unilimits = normcdf(normallimits);
unisample = rand(1,n)*diff(unilimits) + unilimits(1);
Xtrans = norminv(unisample,mu,sigma);
So in the scheme for option 2, I sampled randomly from a UNIFORM distribution, then pushed them through an inverse normal to get the final set. No rejection was needed at all.
In both cases, the samples follow what would be technically called a truncated normal distribution. Had I taken a significantly larger sample, the result would look fairly smoothly like a truncated normal. With only 1000 points, the histogram won't look like much of anything.
Related Question