MATLAB: Feedforward Network and Backpropagation

Tags: backpropagation, binary output, Deep Learning Toolbox, feedforward neural network, tutorial

Hi,
I'm very new to MATLAB and neural networks. I've done a fair amount of reading (the neural network FAQ, the MATLAB user guide, LeCun, Hagan, various others) and feel like I have some grasp of the concepts; now I'm trying to get the practical side down. I am working through a neural net example/tutorial very similar to the Cancer Detection MATLAB example (<http://www.mathworks.co.uk/help/nnet/examples/cancer-detection.html?prodcode=NN&language=en>). In my case I am trying to achieve binary classification on a 16-feature set, and am evaluating the effect of varying the number of nodes in the single hidden layer on training and generalisation. For reference below, x (double) is my feature-set variable and t is my target vector (binary); the training sample size is 200 and the test sample size is approximately 3700.
My questions are: 1) I'm using the patternnet default 'tansig' in both the hidden and output layers, with 'mapminmax' and 'trainlm'. I'm interpreting the output by thresholding at y >= 0.5. The MATLAB user guide suggests using 'logsig' for output constrained to [0 1]. Should I change the output layer transfer function to 'logsig' or not? I've read some conflicting suggestions with regard to doing this, and 'softmax' is sometimes suggested, but it can't be used for training without configuring your own derivative function (which I don't feel confident doing).
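For reference, what I'm doing now is roughly this (just a sketch; H is the hidden layer size I'm varying):

net = patternnet(H); % default 'tansig' in hidden and output layers
net.trainFcn = 'trainlm'; % 'mapminmax' input processing is the default
[net, tr] = train(net, x, t);
y = net(x);
class = y >= 0.5; % my thresholding of the output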
2) The tutorial provides a training and a test dataset, directing the use of the full training set in training (i.e. dividetrain) and at the same time directing that training be stopped once the network achieves x% success in classifying patterns. a) Is this an achievable goal without a validation set, or are these conflicting directions? b) If achievable, how do I set trainParam.goal to evaluate at x% success? Webcrawling has led me to the answer of setting performFcn = 'mse' and trainParam.goal = (1 - x%)*var(t); does this make sense (it seems to rely on mse = var(err))? A rough sketch of what I mean is below. c) Assuming my intuition above is correct, is there an automated way of applying cross-validation to a neural net in MATLAB, or will I effectively have to program in a loop? d) Is there any point to this, or would a simple dividerand(200, 0.8, 0.2, 0.0) achieve the same thing?
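In code, the stopping-goal idea I mean is something like this (a sketch; pct is just a placeholder for the x% target):

net.divideFcn = 'dividetrain'; % use the full training set, as directed
net.performFcn = 'mse';
pct = 0.95; % placeholder for x%
net.trainParam.goal = (1 - pct)*var(t); % relies on mse = var(err)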
3) Is there an automated way in the NN toolbox of establishing the optimum number of nodes in the hidden layer?
Thanks in advance for any and all help

Best Answer

Use the default trn/val/tst ratio 0.7/0.15/0.15 or choose 0.6 <= Ntrn/N <= 0.7 with Ntst = Nval.
[ I N ] = size(x) % [ 16 3900 ]
[ O N ] = size(t) % [ 1 3900 ]
Ntst = round(0.15*N) % 585
Nval = Ntst % 585
Ntrn = N-Nval-Ntst % 2730
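A minimal sketch of imposing those ratios (assuming patternnet with the default 'dividerand'):

net = patternnet(H) % H = No. of hidden nodes
net.divideParam.trainRatio = Ntrn/N % 0.7
net.divideParam.valRatio = Nval/N % 0.15
net.divideParam.testRatio = Ntst/N % 0.15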
% My questions are: 1) I'm using the patternnet default 'tansig' in both the hidden and output layers with 'mapminmax' and 'trainlm'. I'm interpreting the output by thresholding at y >= 0.5.
UGH.
1. That doesn't make any sense since the range of tansig is (-1,1).
2. The advice in help patternnet, doc patternnet, and type patternnet is conflicting.
3. Use { 'tansig' 'logsig' } with mapstd or mapminmax (the default) inputs.
4. Use 'trainscg' for unipolar binary targets {0,1}
5. class = 1 + round(net(x)). This can be modified if you have unequal
priors and/or unequal misclassification costs.
6. Softmax can be used for more than 2 classes; MATLAB now has the
derivative for softmax.
7. Use a val set with round(0.15*N) <= Nval = Ntst <= round(0.2*N)
8. Use MSEgoal = max( 0, 0.01*Ndof*MSE00a/Ntrneq ), as in the sketch after this list, where
a. MSE00a = mean(var(t'))
b. Ntrneq = round(0.7*prod(size(t))) % ~ No. of training equations
c. Ndof = Ntrneq-Nw % No. of estimation degrees of freedom (see Wikipedia)
d. Nw = (I+1)*H+(H+1)*O % No. of unknown weights to estimate
9. Stopping on any misclassification rate cannot be done unless
a. Either the training is broken up into a loop of a few epochs at a time,
with breaks to check the classification rate,
b. Or patternnet is modified.
c. It's not worth the time (a) or effort (b).
d. If you disagree, please send me a copy of your code.
10. Dividetrain is only useful if Ntrn >> Nw and the generalization error
is estimated using the DOF-adjusted value Ntrneq*MSE/Ndof.
Unfortunately, MATLAB does not allow Nval = 0 with Ntst > 0. The closest
fudge that I can think of is Nval = 1 (ratio = 1/N), max_fail = inf.
11. I find a good value for the number of hidden nodes, H, by using an
outer loop over j = Hmin:dH:Hmax and an inner loop over random weight
initializations i = 1:Ntrials, with Ntrials ~ 10 and
Hmax <= Hub = -1+ceil( (Ntrneq-O)/(I+O+1) ). See the loop sketch after this list.
12. Nw > Ntrneq and Ndof < 0 when H > Hub.
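Here is a sketch pulling points 3-8 together (it reuses I, O, and Ntrn from above; H = 10 and the 0.01 factor are example choices, not gospel):

H = 10; % example hidden node count
net = patternnet(H, 'trainscg'); % point 4
net.layers{2}.transferFcn = 'logsig'; % point 3
Nw = (I+1)*H+(H+1)*O; % point 8d
Ntrneq = Ntrn*O; % point 8b: No. of training equations
Ndof = Ntrneq-Nw; % point 8c
MSE00a = mean(var(t')); % point 8a
net.trainParam.goal = max(0, 0.01*Ndof*MSE00a/Ntrneq); % point 8
[net, tr] = train(net, x, t);
class = 1+round(net(x)); % point 5

And a sketch of the double loop in point 11 (Hmin, dH, and Ntrials are your choices; this keeps the net with the lowest validation MSE):

Ntrials = 10;
Hub = -1+ceil((Ntrneq-O)/(I+O+1));
Hmin = 1; dH = 1; Hmax = Hub; % example search grid
bestvperf = Inf;
for h = Hmin:dH:Hmax
    for i = 1:Ntrials
        net = patternnet(h, 'trainscg');
        net.layers{2}.transferFcn = 'logsig';
        rng(i) % reproducible random weight initializations
        net = configure(net, x, t); % reinitializes the weights
        [net, tr] = train(net, x, t);
        if tr.best_vperf < bestvperf % compare validation MSE
            bestvperf = tr.best_vperf;
            bestnet = net;
        end
    end
end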
Hope this helps.
Thank you for formally accepting my answer
Greg
PS I have many examples in comp.ai.neural-nets and comp.soft-sys.matlab.