Good morning,
I am trying to learn how to use the neural network to fit functions. I did read a little bit into the subject but I am still not sure how to find the right architecture (the number of neuron in a hidden layer. I use networks with 1 hidden layer and my training algorithm are 'trainlm' and 'trainbr'. Currently I am aware of 4 problems that can occur:
+ Algorithm reaches a local minimum: the best training performance (tr.best_perf) is too large?
+ Overfitting: the best validation performance (tr.best_vperf) is much larger than the best training performance (tr.best_perf)?
+ Underfitting: the best validation performance (tr.best_vperf), the best training performance (tr.best_perf), the best test performance (tr.best_tperf) are in the similar size but they are still too large.
+ Extrapolating: the best test error (tr.best_tperf) is much larger than the two other ones.
Currently, I wrote a loop that examine networks with 1 neuron to 50 neurons. Each network (e.g. a network with 20 neurons) is trained for 10 times and the one with the lowest training performance (tr.best_perf) is chosen in order to avoid the local minimum. Afterwards, I store tr.best_tperf, tr.best_vperf and tr.best_perf of that network in a array. Finally I compare those 50 networks to each other and take the one with the lowest error, with error = max([tr.best_tperf, tr.best_vperf, tr.best_perf]).
The other way to go would be to train each network (e.g. a network with 20 neurons) for 10 times and choose the lowest error, with error = max([tr.best_tperf, tr.best_vperf, tr.best_perf]). Then I store this error for each network in a vector. Finally, I choose the network with the lowest element of that vector.
Can someone tell me which way is the correct way? I really appreciate any help you can provide.
Best Answer