Here is a simple comparison of four training functions (trainlm, trainbfg, trainrp, traingda) on the simplefit_dataset used in the fitnet and feedforwardnet documentation.
close all, clear all, clc
[ x, t ] = simplefit_dataset;
vart = var(t,0)
figure, subplot(211), hold on
plot(x), plot(t,'r')
subplot(212), plot( x,t ,'r')
% t vs x shows 4 local extrema. Therefore choose
H=4
net1 = fitnet(H); net1.trainFcn = 'trainlm';
net2 = fitnet(H); net2.trainFcn = 'trainbfg';
net3 = fitnet(H); net3.trainFcn = 'trainrp';
net4 = fitnet(H); net4.trainFcn = 'traingda';
rng(0), [ net1 tr1 y1 e1 ] = train(net1,x,t);
rng(0), [ net2 tr2 y2 e2 ] = train(net2,x,t);
rng(0), [ net3 tr3 y3 e3 ] = train(net3,x,t);
rng(0), [ net4 tr4 y4 e4 ] = train(net4,x,t);
NMSE1 = mse(e1)/vart
NMSE2 = mse(e2)/vart
NMSE3 = mse(e3)/vart
NMSE4 = mse(e4)/vart
Although it is a nice example, it doesn't prove much because, in addition to default parameter values (e.g., mu and min_grad), the rankings depend on
1. The underlying function t = f(x)
2. Random data division
3. Random initial weights
For a serious comparison with nontrivial data, you would need at least tens of repetitions.
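A minimal sketch of such a repetition loop (the trial count and the summary statistic are illustrative choices, not from the original example; it reuses H, x, t, and vart from above):

trainFcns = { 'trainlm', 'trainbfg', 'trainrp', 'traingda' };
Ntrials = 30;                              % illustrative; more is better
NMSE = zeros(Ntrials, numel(trainFcns));
for j = 1:numel(trainFcns)
    for i = 1:Ntrials
        rng(i)                             % new data division and initial weights
        net = fitnet(H);
        net.trainFcn = trainFcns{j};
        [ net, tr, y, e ] = train(net,x,t);
        NMSE(i,j) = mse(e)/vart;
    end
end
medNMSE = median(NMSE)                     % compare medians; also check the spread

Because train both divides the data and initializes the weights from the current random stream, reseeding with rng(i) before each call is enough to randomize both per trial.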
Hope this helps.
Greg