MATLAB: How to vectorize a specific for-loop

for looptutorialvectorization

I am trying to vectorize the for-loop hereafter. Would you have any hint? Thank you
for i = 1 : numel(text)-k+1 % "text" is a string
pattern(i,:) = text(i:i+k-1);
end

Best Answer

Before you start a vectorization, care for a pre-allocation:
n = numel(str) - k + 1;
pattern = repmat(' ', n, k); % <-- added
for ii = 1:n
pattern(ii,:) = str(ii:ii+k-1); % [EDITED, was: pattern(k, :)]
end
For a string with 10'000 characters and k=6 this 8 times faster already. But you can get much faster with a loop:
n = numel(str) - k + 1;
pattern = repmat(' ', n, k);
for ii = 1:k % <-- k instead of n
pattern(:, ii) = str(ii:ii+n-1); % <-- (:, ii) instead of (ii, :)
end
Now the values are copied to a continuous block of memory, which is much faster. In addition the loop is much shorter assumed that k << n.
[EDITED] For your amusement a vectorized version:
index = bsxfun(@plus, (1:numel(str) - k + 1).', 0:k-1);
pattern = str(index);
[EDITED 2] And if we are on the way, a C-MEX for completeness:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
mwSize k, n, f;
mxChar *in, *out;
mwSize dims[2];
// Get inputs:
in = (mxChar *) mxGetData(prhs[0]);
k = (mwSize) mxGetScalar(prhs[1]);
// Create outputs:
n = mxGetNumberOfElements(prhs[0]) - k + 1;
dims[0] = n;
dims[1] = k;
plhs[0] = mxCreateCharArray(2, dims);
out = (mxChar *) mxGetData(plhs[0]);
// Copy data:
f = out + n * k;
while (out < f) {
memcpy(out, in++, n * sizeof(mxChar));
out += n;
}
}
Timings for Matlab 2015b, Win64, mean of 100 iterations:
str = repmat(' ', 1, 10000);
k = 6;
Original: 0.134 sec
Pre-allocation: 0.0178 sec
Vectorized: 0.000806 sec
Continuous: 0.000348 sec
C-MEX: 0.0000529 sec
Now use the profiler to find out, if this piece of code is still the bottleneck of the program. If this piece of code takes 1% of the processing time only, accelerating it by a factor of 2 let the total program run only 0.5% faster. Therefore it is most likely a waste of time.
[EDITED 2] Conclusion:
  • The continuous copying is the fastest M-version I can imagine.
  • Vectorizing wastes time with creating the large index. Do not overestimate vectorization, when you do not operate on complete matrices.
  • The C-MEX is 7 times faster than the continuous copy version.
  • Avoiding the creation of the large redundant array would be much more efficient. Better use str(ii:ii+k-1) in the code instead of the fluffy pattern matrix.