Do you really want to vectorize the loop? Or is it enough to increase the speed remarkably? For the latter, use the profiler to find the bottleneck at first. Pre-allocate cp and use a leaner version of polyval:
function YourFcn
k = 1:0.5:10;
a = 1:0.5:5;
n1 = numel(k);
nump = kron(k,ones(1,length(a)))';
n2 = numel(a);
denp = repmat([ones(n2,1), a.', zeros(n2,1)] ,n1, 1);
w = [0.1,0.5,0.8,1,2,8,15,50,100];
delay = 0;
[rn,cn] = size(nump);
[rd,cd] = size(denp);
[rw,cw] = size(w);
[rdel,cdel] = size(delay);
if rw > 1
w = w(:)';
end
s = 1i * w;
mx = max(rn,rd);
cp = zeros(mx, numel(s));
for h = 1:mx
q = (rn>1)*h + (rn==1);
r = (rd>1)*h + (rd==1);
d = (rdel>1)*h + (rdel==1);
upper = mypolyval(nump(q,:), s);
lower = mypolyval(denp(r,:), s);
cp(h, :) = (upper ./ lower) .* exp(-s * delay(d));
end
end
function y = mypolyval(p,x)
nc = length(p);
y = zeros(size(x));
y(:) = p(1);
for i = 2:nc
y = x .* y + p(i);
end
end
This reduced the runtime on my computer from 0.46 to 0.21 seconds for 100 calls.
Use the profile with showing the time for the built-in functions:
profile on -detail builtin
This reveals the time spent in the expensive exp() function. Avoid the repeated calculation of the same exp output and inline the Horner scheme:
s = 1i * w;
mx = max(rn, rd);
cp = zeros(mx, numel(s));
if rdel == 1
C = exp(-s * delay(1));
end
nc1 = size(nump, 2);
nc2 = size(denp, 2);
upper = zeros(size(s));
lower = zeros(size(s));
for h=1:mx
q = (rn > 1) * h + (rn == 1);
r = (rd > 1) * h + (rd == 1);
if rdel > 1
C = exp(-s * delay(h));
end
upper(:) = nump(q, 1);
for i = 2:nc1
upper = s .* upper + nump(q, i);
end
lower(:) = denp(r, 1);
for i = 2:nc2
lower = s .* lower + denp(r, i);
end
cp(h,:) = (upper./lower) .* C;
end
Now it runs in 0.104 seconds, a speedup of factor 4.5 without vectorizing.
Best Answer