MATLAB: Generate normally distributed sample from data

Hi,
I have an array with many (>800000) rows. From one of its columns, I want to select 51 values to build a new array of 51 normally distributed values. The values range from 0 to 10.
How can I do that?
Thanks,
Andrea

Best Answer

I need to be careful not to start a discussion about how one actually defines a normal distribution, but assuming you don't need a perfectly exact definition of normally distributed data, you can use the Anderson-Darling test. The idea is to randomly sample 51 points from your array and then check whether they are normal or not. To make it more robust, you can simply keep the sample with the highest p-value:
rng(33) % Fix the seed for reproducibility
ArraySize = 80000;
A = rand(ArraySize,1); % not normal
A(500:1000) = randn(501,1); % normal
Founded = 0;
MaxIter = 1000;
Maxp = 0;
Ite = 1;
while ~Founded && Ite<MaxIter
    SampledIndex = randperm(ArraySize,51); % Sample from your array
    Asampled = A(SampledIndex);
    [h,p] = adtest(Asampled); % Check if normal
    % You can in principle uncomment the next line to stop at the first sample
    % that passes the test; I believe, however, that keeping the max p is more robust
    %Founded = ~h; % h = 0 means the null hypothesis of normality cannot be rejected
    if p>Maxp % Save the sample with the highest p-value so far
        BestAsoFar = Asampled;
        Maxp = p;
    end
    Ite = Ite+1;
end
histogram(BestAsoFar)
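
The same search could be applied to one column of a larger matrix, as in the original question. The following is only a minimal sketch under assumptions: the matrix M, the column index col, and the values of nPick and nTrials are made up for illustration, and the stand-in data are uniform on 0 to 10 because the real data are not shown. It needs adtest from the Statistics and Machine Learning Toolbox.

```matlab
% Sketch: keep the 51-point subsample of one column with the highest
% Anderson-Darling p-value. M, col, nPick and nTrials are hypothetical.
rng(33)                                  % reproducible example
M = 10 * rand(800000, 3);                % stand-in data in the range 0 to 10
col = 2;                                 % column to sample from (assumption)
nPick = 51;                              % sample size asked for in the question
nTrials = 200;                           % number of random subsets to try

x = M(:, col);                           % the column of interest
bestP = -Inf;                            % highest p-value seen so far
bestSample = [];                         % subsample that produced bestP

for t = 1:nTrials
    idx = randperm(numel(x), nPick);     % nPick distinct row indices
    candidate = x(idx);
    [~, p] = adtest(candidate);          % Anderson-Darling test for normality
    if p > bestP                         % keep the best candidate so far
        bestP = p;
        bestSample = candidate;
    end
end

fprintf('Best p-value over %d trials: %.4f\n', nTrials, bestP);
histogram(bestSample)
```

Note that a high p-value here only means the test could not reject normality for that subsample. Because the best result is picked out of many trials, the final p-value should not be read as a regular significance level.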