Hi,
I am used to the fact that replacing a for-loop with vectorization in MATLAB yields a speed improvement (or at least does not degrade speed). I was trying to optimize code that relies on normcdf computation and found that vectorization actually makes things worse – almost a 2x slow-down.
Here is the code I use to verify that (simplified version for the purpose of highlighting an issue, outputs are not stored):
a = -50:0.01:50;
total_cnt = 10000;
a_mod = repmat(a, 1, total_cnt);

fprintf(1, 'Calculating single long array\n');
tic;
normcdf(a_mod, 0, 1);
toc;

fprintf(1, 'Calculating short array with for loop\n');
tic;
for k = 1 : total_cnt
    normcdf(k*a, 0, 1);
end
toc;
In both cases I calculate normcdf for exactly the same number of values (~10^8). In the for-loop I even change the underlying values on every iteration (multiplying by k) to rule out any caching of previously computed results (I am not sure whether MATLAB does this), so each iteration works on a brand-new set of values. Here is an example output (2015 MacBook Pro, MATLAB 2018a):
Calculating single long array
Elapsed time is 3.396627 seconds.
Calculating short array with for loop
Elapsed time is 1.450725 seconds.
So the for-loop is faster than the single vectorized call – is there a reason for this? I did not have time to experiment much, but when I tried reducing the size of a by 10x (a = -50:0.1:50), the vectorized version became slightly faster (0.35 s vs 0.44 s). I wonder if there is some size threshold I am hitting?
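For what it's worth, tic/toc on a single run can also pick up first-call warm-up effects, so a timeit-based version of the same comparison (timeit averages several runs; the local function here is just my own wrapper, not part of the original code) might give more stable numbers:

```matlab
% Compare one large vectorized normcdf call against a loop of
% smaller calls, timed with timeit instead of tic/toc.
a = -50:0.01:50;
total_cnt = 10000;
a_mod = repmat(a, 1, total_cnt);

f_vec  = @() normcdf(a_mod, 0, 1);        % single long array
f_loop = @() loop_version(a, total_cnt);  % many short arrays

fprintf('vectorized: %.3f s\n', timeit(f_vec));
fprintf('for-loop:   %.3f s\n', timeit(f_loop));

function loop_version(a, total_cnt)
    % Scale by k so every iteration sees fresh values,
    % matching the original experiment.
    for k = 1:total_cnt
        normcdf(k*a, 0, 1);
    end
end
```

(Local functions in scripts require R2016b or later, which 2018a satisfies.)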