MATLAB: Genetic Algorithm gives worse results than Back-propagation when training NNs

genetic algorithm, neural network, neural network with genetic algorithm, nn-ga

I have an NN with 192 inputs and 48 outputs that I use for electricity forecasting. It has only one hidden layer with a single neuron. Previously I trained it with back-propagation. Now I want better results, so I am training it with a GA, but the results with the GA are worse than with BP (only rarely are they better). I have tried different parameter arrangements (code is attached), but I still cannot find the reason. I also checked different numbers of training sets (10, 15, 20, 30) and different numbers of hidden neurons, but when I increase them the results get even worse. Please, can someone help me with this?
Regards,
Dara
-------------------------------Code---------------------------------
% Read the data once; re-reading the same spreadsheets every iteration is unnecessary.
p  = xlsread('Set.xlsx')';
t  = xlsread('Target.xlsx')';
IN = xlsread('input.xlsx')';
c  = xlsread('Compare.xlsx')';

for i = 1:17
    inputs  = p(:, i+20:i+27);
    targets = t(:, i+20:i+27);
    in = IN(:, i);
    C  = c(:, i);

    [I, N] = size(inputs);
    [O, N] = size(targets);
    H  = 1;                         % number of hidden neurons
    Nw = (I+1)*H + (H+1)*O;         % total number of weights and biases

    net = feedforwardnet(H);
    net = configure(net, inputs, targets);

    % Objective for the GA: MSE of the network for a candidate weight vector x
    h = @(x) mse_test(x, net, inputs, targets);

    ga_opts = gaoptimset('TolFun', 1e-20, ...    % note: 1e(-20) is a syntax error
                         'Display', 'iter', ...
                         'Generations', 2500, ...
                         'PopulationSize', 200, ...
                         'MutationFcn', @mutationgaussian, ...
                         'CrossoverFcn', @crossoverscattered, ...
                         'UseParallel', true);
    [x_ga_opt, err_ga] = ga(h, Nw, [], [], [], [], [], [], [], ga_opts);

    net = setwb(net, x_ga_opt');    % load the optimised weights into the network
    out = net(in)                   % forecast for this column (left unsuppressed on purpose)

    Sheet    = 1;
    filename = 'Results.xlsx';
    xlRange  = ['A', num2str(i)];
    xlswrite(filename, x_ga_opt, Sheet, xlRange);
    % No manual i = i + 1 here: the for statement already advances i.
end
-------------------------------Objective Function---------------------------------
function mse_calc = mse_test(x, net, inputs, targets)
% Objective for ga: load candidate weights x into the network and return the MSE.
net = setwb(net, x');
y = net(inputs);
e = targets - y;
mse_calc = mse(e);
end
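A small sanity check worth adding to the loop (a suggested addition, not in the original code): the GA's search dimension Nw should match the number of elements setwb expects, which you can confirm with the toolbox's getwb:

% Sanity check (suggested addition): the GA search dimension must equal the
% network's total number of weights and biases.
assert(numel(getwb(net)) == Nw, ...
    'Nw (%d) does not match getwb(net) (%d elements)', Nw, numel(getwb(net)));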

Best Answer

I think we should expect worse results with GA than with backprop. Back-propagation training is a gradient-based minimization algorithm: wherever the weight vector is, it can compute the direction of steepest descent and take a step downhill in that direction. The GA does not know the gradient; it relies on random mutations around the current weight vectors and random recombinations of population members, which can easily miss the descent direction. Run it long enough and it should eventually find its way downhill, but it is less efficient than gradient-based optimization.
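For comparison, the gradient-based route with the toolbox looks roughly like this (a minimal sketch using the default Levenberg-Marquardt trainer; your original back-propagation setup may have used a different training function):

% Gradient-based baseline: train the same architecture with backprop (Levenberg-Marquardt).
net_bp = feedforwardnet(1);                  % same single hidden neuron as in the question
net_bp = configure(net_bp, inputs, targets);
net_bp.trainFcn = 'trainlm';                 % gradient-based; 'trainscg' is a lighter alternative
net_bp = train(net_bp, inputs, targets);     % each step follows the MSE gradient
mse_bp = mse(targets - net_bp(inputs))       % compare against err_ga from the GA run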
The exception would be an error surface with many local minima, which can trap gradient descent; a GA is better at escaping those. But some recent papers on deep neural networks suggest that in neural networks saddle points are more prevalent than true local minima, which might explain why gradient descent does so well here.
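If you do suspect local minima and still want the GA in the loop, one option worth trying (a sketch, not tested on your data) is ga's 'HybridFcn' option, which hands the best GA individual to a gradient-based solver such as fminunc for local refinement:

% Sketch: coarse global search with GA, then local gradient-based refinement.
% Reuses h, Nw and net from the code in the question.
ga_opts = gaoptimset('Generations', 500, ...
                     'PopulationSize', 200, ...
                     'HybridFcn', @fminunc, ...   % refine the best individual after GA stops
                     'UseParallel', true);
[x_hyb, err_hyb] = ga(h, Nw, [], [], [], [], [], [], [], ga_opts);
net = setwb(net, x_hyb');                         % network with the refined weights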