MATLAB: Neural network back propagation problem

back propagation, neural networks, overfitting

I'm using 2 inputs and a single output; then I apply the same network structure to 3 inputs and two outputs. However, the network outputs are not close to the target values. What is wrong with this network, or do I need to change to another type of structure?
clear all; clc;
% load data
% p = [0 0 1 1; 0 1 0 1]; t = [0 1 1 0];
p = [0 0 0 0 1 1 1 1; 0 0 1 1 0 0 1 1; 0 1 0 1 0 1 0 1];
t = [0 1 0 0 0 1 1 1; 0 1 0 0 1 1 0 0];
net = newff(p,t,[15 15],{'logsig','logsig'},'traingd');
net.performFcn = 'mse';   % note: the performance function is net.performFcn, not net.trainParam.perf
net.trainParam.epochs = 100;
net.trainParam.goal = 0;
net.trainParam.lr = 0.9;
net.trainParam.mc = 0.95; % momentum is ignored by traingd; it is used by traingdm/traingdx
net.trainParam.min_grad = 0;
[net,tr] = train(net,p,t);
y = sim(net,p)
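Since the targets are binary, one way to judge "how near" the outputs are is to round them to 0/1 and count disagreements with the targets. A minimal sketch (assumes `net`, `p`, and `t` from the script above):

```matlab
y = sim(net, p);             % raw logsig outputs in (0,1)
yc = round(y);               % harden outputs to 0/1 class labels
nerr = sum(yc(:) ~= t(:))    % number of misclassified target entries
pcterr = 100*nerr/numel(t)   % percent error over all output entries
```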

Best Answer

% Ntrn/Nval/Ntest = 7/0/1
close all, clear all, clc
tic
ptrn = [0 0 0 0 1 1 1 ; 0 0 1 1 0 0 1 ; 0 1 0 1 0 1 0 ]
ttrn = [0 1 0 0 0 1 1 ; 0 1 0 0 1 1 0 ]
ptst = [ 1; 1; 1 ]
ttst = [ 1; 0 ]
[I Ntrn] = size(ptrn) % [ 3 7 ]
[O Ntrn] = size(ttrn) % [ 2 7 ]
Ntrneq = prod(size(ttrn)) % 14 training equations
MSEtrn00 = mean(var(ttrn',1)) % 0.2449 (reference MSE of a constant-output model)
[I Ntst] = size(ptst) % [ 3 1 ]
% Nw = (I+1)*H+(H+1)*O = O +(I+O+1)*H < Ntrneq
Hub = -1 + ceil( (Ntrneq-O) / (I+O+1)) % 1
Nwub = O+(I+O+1)*Hub % 8 < 14
Hmax = 3
dH=1
Hmin =0
Ntrials = 20
MSEgoal = 0.01*MSEtrn00 % 2.4e-3 => R2trn >= 0.99
MinGrad = MSEgoal/10 % 2.4e-4
rng(0)
j=0
for h = Hmin:dH:Hmax
j=j+1
if h==0
net = newff(ptrn,ttrn,[]);
Nw = (I+1)*O
else
net = newff(ptrn,ttrn,h);
Nw = (I+1)*h+(h+1)*O
end
Ndof = Ntrneq-Nw
net.divideFcn = 'dividetrain';
net.trainParam.goal = MSEgoal;
net.trainParam.min_grad = MinGrad;
for i = 1:Ntrials
h = h % echo current number of hidden nodes
ntrial = i % echo current trial index
net = configure(net,ptrn,ttrn);
[ net tr Ytrn ] = train(net,ptrn,ttrn);
ytrn = round(Ytrn)
MSEtrn = mse(ttrn-ytrn)
R2trn(i,j) = 1-MSEtrn/MSEtrn00;
Ytst = net(ptst)
ytst1(i,j) = round(Ytst(1));
ytst2(i,j) = round(Ytst(2));
end
end
H = Hmin:dH:Hmax
R2trn = R2trn
ytst1 = ytst1
ytst2 = ytst2
toc % 26 sec
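The upper bound Hub used above comes from requiring the number of weights, Nw = O + (I+O+1)*H, to stay below the number of training equations Ntrneq. For this data set the arithmetic can be checked directly:

```matlab
I = 3; O = 2; Ntrneq = 14;                % sizes from the code above
Hub = -1 + ceil( (Ntrneq-O)/(I+O+1) )     % = -1 + ceil(12/6) = 1
Nwub = O + (I+O+1)*Hub                    % = 2 + 6*1 = 8 < 14
```

With H = 2, Nw = 2 + 6*2 = 14 already equals Ntrneq, which is why H = 2 and 3 can memorize the training set.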
% Training Summary:
% 1. R2trn > 0.71 only if the net is overfit (H = 2, 3)
% 2. When R2trn > 0.71, R^2 = 1 (MEMORIZATION)
% 3. R2trn = 1 50% of the time when H = 2 and 90% of the time when H = 3
% 4. When H = 0 (linear), max(R2trn) = 0.71, 25% of the time
% 5. When H = 1, max(R2trn) = 0.42, 60% of the time
% Generalization Summary:
% 1. ytst(1) vs ttst(1) = 1:
%    for H = 0:3 the corresponding numbers of errors (out of Ntrials = 20) are
%    [ 0 5 10 13 ]
% 2. ytst(2) vs ttst(2) = 0:
%    for H = 0:3 the corresponding numbers of errors are
%    [ 11 17 14 12 ]