MATLAB: How to find the right neural network architecture

bayesian regularisation, Deep Learning Toolbox, levenberg-marquardt algorithm, neural network

Good morning,

I am trying to learn how to use neural networks to fit functions. I have read a little about the subject, but I am still not sure how to find the right architecture (the number of neurons in the hidden layer). I use networks with 1 hidden layer, and my training algorithms are 'trainlm' and 'trainbr'. Currently I am aware of 4 problems that can occur:

+ Algorithm reaches a local minimum: the best training performance (tr.best_perf) is too large?

+ Overfitting: the best validation performance (tr.best_vperf) is much larger than the best training performance (tr.best_perf)?

+ Underfitting: the best validation performance (tr.best_vperf), the best training performance (tr.best_perf) and the best test performance (tr.best_tperf) are of similar size, but they are all still too large.

+ Extrapolating: the best test error (tr.best_tperf) is much larger than the two other ones.

Currently, I have a loop that examines networks with 1 to 50 neurons. Each network (e.g. a network with 20 neurons) is trained 10 times, and the run with the lowest training performance (tr.best_perf) is chosen in order to avoid local minima. Afterwards, I store tr.best_tperf, tr.best_vperf and tr.best_perf of that network in an array. Finally, I compare those 50 networks to each other and take the one with the lowest error, with error = max([tr.best_tperf, tr.best_vperf, tr.best_perf]).

The other way would be to train each network (e.g. a network with 20 neurons) 10 times and keep the run with the lowest error, with error = max([tr.best_tperf, tr.best_vperf, tr.best_perf]). Then I store this error for each network size in a vector. Finally, I choose the network with the smallest element of that vector.

Can someone tell me which way is the correct way? I really appreciate any help you can provide.

1. Outer loop over # of hidden nodes Hmin:dH:Hmax with Hmax <= Hub, the upper bound for not having more unknown weights, Nw, than training equations, Ntrneq.

2. Inner loop over Ntrials >= 10 different random distributions of initial weights.
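As a rough illustration of the bound Hub, here is a minimal sketch that assumes a single hidden layer with bias terms, so that Nw = (I+1)*H + (H+1)*O for I inputs, H hidden nodes and O outputs. The sizes I, O and Ntrn below are made up for the example and are not taken from the question.

```matlab
% Hypothetical sizes, chosen only for illustration
I    = 1;      % input dimension
O    = 1;      % output dimension
Ntrn = 70;     % number of training examples (e.g. 70% of N = 100)

Ntrneq = Ntrn*O;                            % number of training equations

% Nw = (I+1)*H + (H+1)*O = H*(I+O+1) + O, so Nw <= Ntrneq gives:
Hub = floor((Ntrneq - O)/(I + O + 1));      % largest H with Nw <= Ntrneq

H  = Hub;
Nw = (I + 1)*H + (H + 1)*O;                 % unknown weights when H = Hub
fprintf('Hub = %d, Nw = %d, Ntrneq = %d\n', Hub, Nw, Ntrneq);
```

With these sizes, any Hmax above Hub would give the net more unknown weights than training equations.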

Best Answer

Search the NEWSREADER and ANSWERS using

   fitnet Hmin Hmax Ntrials

I minimize the number of hidden nodes subject to an upper bound on the training subset MSE:

 MSEtrn <= 0.01*mean(var(targettrn',1))
 MSEtrn <= 0.01*var(targettrn,1)  for 1-dim

This yields a training subset Rsquaretrn of 0.99 or more.
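To see why the bound corresponds to Rsquaretrn >= 0.99, here is a small sketch using the usual definition Rsquare = 1 - MSE/MSE00, where MSE00 is the MSE of a constant model that always outputs the target mean. The target and output vectors are invented for illustration, and the names MSE00trn and outputtrn are my own.

```matlab
% Hypothetical 1-dim training targets and net outputs, for illustration only
targettrn = [0 1 2 3 4];
outputtrn = [0.1 0.9 2.1 2.9 4.0];

MSE00trn   = mean(var(targettrn', 1));        % MSE of the constant (mean) model
MSEgoal    = 0.01*MSE00trn;                   % upper bound from the answer
MSEtrn     = mean((targettrn - outputtrn).^2);
Rsquaretrn = 1 - MSEtrn/MSE00trn;             % fraction of target variance modeled

fprintf('MSEtrn = %g, MSEgoal = %g, Rsquaretrn = %g\n', MSEtrn, MSEgoal, Rsquaretrn);
```

Any net with MSEtrn <= MSEgoal has MSEtrn/MSE00trn <= 0.01 and therefore Rsquaretrn >= 0.99. In the toolbox, one plausible way to apply the bound is to set net.trainParam.goal = MSEgoal before training.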

Many of the posts don't have the training subset subscript trn and/or may have used t instead of target. So, there are probably a jillion variations posted including

 MSEgoal = 0.01*vart1

The best way I have found to obtain relatively unbiased results is to use 2 loops: an outer loop over the number of hidden nodes and an inner loop over random initial weights.

Nets are initially ranked by their validation subset performance. Then unbiased estimates of performance are obtained from the test subset performance.

However, I usually rank the nets by their combined nontraining validation AND test subset performance.
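The two-loop search with ranking by validation performance might look roughly like the sketch below. It is not the code from the posts referred to here: the data, the loop limits and the variable names vperf, tperf, Hbest and testEstimate are assumptions for illustration, and it needs the Deep Learning Toolbox to run.

```matlab
% Sketch only; requires Deep Learning Toolbox. Data and limits are hypothetical.
rng(0);                                   % reproducible data split and initial weights
x = linspace(-1, 1, 200);                 % hypothetical inputs
t = sin(2*pi*x) + 0.05*randn(size(x));    % hypothetical noisy targets

Hmin = 1; dH = 1; Hmax = 10;              % Hmax should not exceed Hub
Ntrials = 10;
Hvec = Hmin:dH:Hmax;

vperf = zeros(Ntrials, numel(Hvec));      % best validation MSE per trial
tperf = zeros(Ntrials, numel(Hvec));      % matching test MSE per trial

for j = 1:numel(Hvec)                     % outer loop: number of hidden nodes
    for i = 1:Ntrials                     % inner loop: random initial weights
        net = fitnet(Hvec(j), 'trainlm');
        net.trainParam.showWindow = false;
        [net, tr] = train(net, x, t);     % new net, so new random weights and split
        vperf(i, j) = tr.best_vperf;
        tperf(i, j) = tr.best_tperf;
    end
end

[~, k] = min(vperf(:));                   % rank candidates by validation performance
[ibest, jbest] = ind2sub(size(vperf), k);
Hbest = Hvec(jbest);                      % selected number of hidden nodes
testEstimate = tperf(ibest, jbest);       % test MSE of the selected net
```

Because the test subset plays no part in choosing the net, testEstimate is not biased by the selection. Note that 'trainbr' does not use a validation subset by default, so this ranking applies as written only to 'trainlm'.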

Again, I have jillions of examples posted in the NEWSREADER and ANSWERS. The best search words are probably

        Hmin Hmax Ntrials

Hope this helps.

Thank you for formally accepting my answer

Greg
