The best approach for regression is to start with FITNET, accepting as many defaults as possible. The default I-H-O node topology contains Nw = (I+1)*H+(H+1)*O unknown weights. Ntrn training examples yield Ntrneq = Ntrn*O training equations with Ntrndof = Ntrneq-Nw training degrees of freedom. The average variance of the training target examples is MSEtrn00 = mean(var(target')). For Ntrndof > 0, obtaining a mean-square error lower than MSEtrngoal = 0.01*Ntrndof*MSEtrn00/Ntrneq yields a normalized, DOF-adjusted MSE of NMSEtrna <= 0.01 and a corresponding adjusted training R-squared of R2trna = 1-NMSEtrna >= 0.99. That is interpreted as successfully modeling at least 99% of the variance in the target.
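The quantities above can be computed directly from the data matrices. A minimal sketch (the variable names input, target, and the candidate value H = 10 are illustrative assumptions, not part of any toolbox API):

```matlab
% Assumed layout: input is I x Ntrn, target is O x Ntrn (column = one example)
[I, Ntrn] = size(input);
[O, ~]    = size(target);
H = 10;                                 % hypothetical candidate hidden-node count

Nw      = (I+1)*H + (H+1)*O;            % number of unknown weights
Ntrneq  = Ntrn*O;                       % number of training equations
Ntrndof = Ntrneq - Nw;                  % training degrees of freedom

MSEtrn00   = mean(var(target'));        % average variance of the target rows
MSEtrngoal = 0.01*Ntrndof*MSEtrn00/Ntrneq;   % goal for R2trna >= 0.99 (needs Ntrndof > 0)
```

Note that var(target') takes the variance down each column of target', i.e. across the Ntrn examples of each output row, and the mean then averages over the O outputs.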
The training objective is to minimize H subject to the constraint R2trna >= 0.99. This is usually achieved by trial and error over a double for loop: an outer loop over hidden node candidate values h = Hmin:dH:Hmax and an inner loop over i = 1:Ntrials random weight initializations. I have posted many, many examples. Search the NEWSGROUP and ANSWERS using
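The double-loop search can be sketched as follows. This is only an outline, not a definitive implementation: Hmin, dH, Hmax, and Ntrials are values you choose yourself, the data layout assumptions from above are reused, and tr.best_perf is taken here as the training-set MSE at the stopping epoch.

```matlab
rng('default')                              % initialize the RNG so runs can be duplicated
Hvec = Hmin:dH:Hmax;                        % candidate hidden-node counts (your choice)
R2trna = zeros(Ntrials, numel(Hvec));

for j = 1:numel(Hvec)
    h       = Hvec(j);
    Nw      = (I+1)*h + (h+1)*O;            % weights for this candidate topology
    Ntrndof = Ntrneq - Nw;                  % DOF for this candidate topology
    for i = 1:Ntrials                       % Ntrials random weight initializations
        net = fitnet(h);                    % accept the remaining defaults
        [net, tr] = train(net, input, target);
        MSEtrn    = tr.best_perf;           % training MSE at the best epoch
        % DOF-adjusted normalized MSE: SSEtrn/Ntrndof scaled by MSEtrn00
        NMSEtrna    = Ntrneq*MSEtrn/(Ntrndof*MSEtrn00);
        R2trna(i,j) = 1 - NMSEtrna;
    end
end
% Smallest h whose best trial achieves R2trna >= 0.99 wins
```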
If Ntrneq < ~2*Nw, the net is overfit; validation stopping and/or regularization should be used to mitigate overtraining.
The best approach to avoid overtraining is to use BOTH validation set stopping AND regularization.
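One way to combine the two in a sketch: FITNET's defaults (TRAINLM with DIVIDERAND) already give you validation stopping, and setting the regularization parameter of the MSE performance function adds a weight-decay term. The value 0.1 below is an arbitrary example, not a recommendation:

```matlab
net = fitnet(h);                          % defaults: trainlm + dividerand
                                          % => validation stopping is already active
net.performParam.regularization = 0.1;    % example value; adds weight penalty to mse
[net, tr] = train(net, input, target);
```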
HOWEVER, FOR SOME STRANGE REASON, using validation stopping with TRAINBR is NOT AVAILABLE in the NN TOOLBOX!
Your choice of TRAINBR instead of FITNET is not wrong. However, you have made numerous errors, especially by not accepting as many defaults as possible.
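If you do stay with TRAINBR, the minimal form is to hand it to FITNET and otherwise accept defaults. Since TRAINBR cannot use a validation set, it is common to assign all data to training; this is a sketch, not your exact setup:

```matlab
net = fitnet(h, 'trainbr');               % Bayesian regularization training
net.divideFcn = 'dividetrain';            % no validation stopping with trainbr,
                                          % so use all data for training
[net, tr] = train(net, input, target);
```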
Why not just use the syntax in
with the double loop approach?
Don't forget to initialize the RNG before the first loop so that you can duplicate results.
Hope this helps.
Thank you for formally accepting my answer
Greg