MATLAB: Speeding up the fminbnd function


Hi!
I have a function of T as,
CV_inf = @(T) T.*(4/(n+1)) - integral(@(t) psitildasq(t),(-T),(T),'ArrayValued',true);
where
psitildasq = @(t) (1/n^2)*(sum(cos(x*t))).^2 + (1/n^2)*(sum(sin(x*t))).^2;
x is a (1×n) vector of inputs.
I want to find the global minimum as well as the first (smallest) local minimum of this function.
To find the global minimum I use
[Tinf,Tinf_val,Tinf_exitflag,output] = fminsearch(CV_inf,0);
and to find the local minimum I follow this method,
[Tloc1,Tloc1_val,Tloc1_exitflag] = fminbnd(CV_inf,0,Tinf);
g = Tinf - Tloc1;
if g <= 0.0001
    Tloc = Tinf;
    Tloc_val = Tloc1_val;
    Tloc_exitflag = Tloc1_exitflag;
else
    while g > 0.0001
        [Tloc,Tloc_val,Tloc_exitflag] = fminbnd(CV_inf,0,Tloc1);
        g = Tloc1 - Tloc;
        Tloc1 = Tloc;
    end
end
But when I have a large dataset, like n = 10000, the code seems to run forever. Is there any way to speed up the process?
Thanks.

Best Answer

If you factor the 1/n^2 out of the integral, then what is left can be integrated into a closed form that has a predictable pattern. You can calculate based upon that pattern and in so doing save all of the integrations.
The indefinite integral of
sum(cos(x[k]*t), k = 1 .. n)^2 + sum(sin(x[k]*t), k = 1 .. n)^2
is (n*t + 2 * R)
where R is
sum( sum( sin((x[p]-x[q])*t)/(x[p]-x[q]), q = p+1..n), p = 1..n-1 )
that is, sin() of the pairwise difference in x times t, divided by that same difference in x, summed over the pairs where the first index of the subtraction is less than the second index.
Another way of writing 2*R would be
sum( sum( sin((x[p] - x[q])*t) / (x[p] - x[q]), q = 1 .. n), p = 1 .. n)
with the p = q terms excluded, since those are 0/0.
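Where that pattern comes from: expanding the squares and using cos(a)*cos(b) + sin(a)*sin(b) = cos(a-b) turns the integrand into the sum over all p, q of cos((x[p]-x[q])*t), and each of those terms integrates with respect to t to sin((x[p]-x[q])*t)/(x[p]-x[q]) (or to t itself when x[p] = x[q]). A quick numeric check of that identity, with a small made-up x and t:
n = 5;
x = randn(1, n);                              % small made-up data, just for the check
t = 0.9;
lhs = sum(cos(x*t))^2 + sum(sin(x*t))^2;
rhs = sum(sum(cos(bsxfun(@minus, x(:), x(:).') * t)));
abs(lhs - rhs)                                % should be at rounding-error level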
Notice that the differences can be calculated ahead of time. Using the full form, since it is easier to write out:
D = bsxfun(@minus, x(:), x(:).');
and then for any one t, sum(sum(sin(D*t)./D)). Optimize slightly by going to a vector form, calculated ahead of time:
D = reshape( bsxfun(@minus, x(:), x(:).'), [], 1); %column vector

and for any one t, sum(sin(D*t)./D)
See below for a correction
With regards to the definite integration over -T to +T, notice that since sin(-x) = -sin(x), evaluating the sum at -T is going to be the negative of the sum evaluated at T. In just the same way, the leading term n*t at t = -T would be the negative of n*t at t = T. The indefinite integral is therefore an odd function, antisymmetric around 0, and so integrating over -T to +T is going to give 2 * the indefinite integral evaluated at +T. The definite integral is therefore going to be
2*n*T + 2 * (the sum from above, evaluated at t=T)
With these optimizations, execution should be comparatively fast provided that you can store all of D in memory.
Remember the constant multiplier of 1/n^2 that got moved out of the integral.
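As a quick check that the closed form reproduces the original numeric integral, here is a sketch with a small made-up x (distinct values, so the equal-value case in the note below does not arise); Du is just an illustrative name for the p < q differences:
n = 6;
x = randn(1, n);
psitildasq = @(t) (1/n^2)*(sum(cos(x*t))).^2 + (1/n^2)*(sum(sin(x*t))).^2;
T = 1.7;
numeric = integral(@(t) psitildasq(t), -T, T, 'ArrayValued', true);
D = bsxfun(@minus, x(:), x(:).');
Du = D(triu(true(n), 1));                    % the p < q differences (the R sum)
closed = (2/n^2) * (n*T + 2*sum(sin(Du*T)./Du));
abs(numeric - closed)                        % should be tiny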
Note: the term sin((x[p]-x[q])*t)/(x[p]-x[q]) becomes 0/0 whenever x[p] = x[q], which always happens for the p = q terms and can also happen if two of the x values are identical. That value effectively has to be treated as being t (since sin(d*t)/d -> t as d -> 0) in order to get the calculations to work out properly. Indeed, this is the source of the n*t term, since there are n places where the pairwise variables are equal. This points out a problem in my writing D in terms of bsxfun: it is going to generate 0 entries. This leads to a slight reformulation: continue to have
D = reshape( bsxfun(@minus, x(:), x(:).'), [], 1); %column vector
but now omit the n*t leading term, and instead use
Dr = D(D ~= 0);
num_zero_pairs = n^2 - length(Dr);
and then the indefinite integral is
num_zero_pairs * t + sum(sin(Dr *t) ./ Dr)
and the definite over -T to +T would be
2 * (num_zero_pairs * T + sum(sin(Dr*T) ./ Dr))
Again, remember the 1/n^2 multiplier.
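Putting the pieces together, a sketch of what the sped-up objective might look like (CV_fast and intpsi are just illustrative names; x and n are assumed to exist as in the question):
D = reshape(bsxfun(@minus, x(:), x(:).'), [], 1);   % all pairwise differences, n^2 of them
Dr = D(D ~= 0);                                     % drop the 0/0 pairs
num_zero_pairs = n^2 - length(Dr);                  % each of these contributes t to the integral
clear D                                             % free the large intermediate if memory is tight

% definite integral of psitildasq over [-T, T], in closed form (1/n^2 included)
intpsi = @(T) (2/n^2) * (num_zero_pairs*T + sum(sin(Dr*T) ./ Dr));

% the original objective with the numeric integral replaced by the closed form
CV_fast = @(T) T.*(4/(n+1)) - intpsi(T);

[Tinf,Tinf_val,Tinf_exitflag,output] = fminsearch(CV_fast,0);
The same CV_fast can then be used in the fminbnd refinement loop in place of CV_inf.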
For n = 10000, length(D) would be n^2 = 100000000 entries, about 800 megabytes of doubles (roughly half that if you keep only the p < q differences and double that part of the sum). Do-able. I guess it would be a fair question as to whether this exact integration would end up being more expensive than the numeric integration. It might make it easier to reason about CV_inf, though.
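If the 800 megabytes is a concern, a possible variant (again just a sketch) is to keep only the p < q differences and double that part of the sum, which roughly halves what has to be stored long-term; note the full difference matrix is still built temporarily here:
Dfull = bsxfun(@minus, x(:), x(:).');   % temporary n-by-n matrix of differences
Du = Dfull(triu(true(n), 1));           % only the p < q differences, n*(n-1)/2 of them
Du = Du(Du ~= 0);                       % drop pairs where two x values are identical
num_zero_pairs = n^2 - 2*length(Du);    % the diagonal plus any duplicate pairs
clear Dfull
intpsi = @(T) (2/n^2) * (num_zero_pairs*T + 2*sum(sin(Du*T) ./ Du));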