Solved – Heuristic for choosing neural network size (number of hidden units/layers)

neural networks

Is there a good heuristic for setting the size of a feedforward neural network?

I am using one for closed-set speaker recognition. The input is 24 MFCC coefficients (a cepstral audio representation) and the output layer is a softmax over 331 speaker classes. The network classifies 20 ms of speech at a time and then aggregates the scores over a whole audio file to predict the speaker identity.
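For context, this is roughly how the per-frame outputs are turned into an utterance-level decision. A minimal sketch: summing log-probabilities over frames is an assumption about the aggregation rule, and the random inputs just stand in for real softmax outputs.

```python
import numpy as np

def aggregate_utterance_scores(frame_probs):
    """Combine per-frame softmax outputs into one speaker decision.

    frame_probs: array of shape (num_frames, num_speakers), each row a
    softmax distribution over the 331 speakers for one 20 ms frame.
    Summing log-probabilities over frames and taking the argmax gives
    the predicted speaker (assumed aggregation rule).
    """
    log_probs = np.log(frame_probs + 1e-12)   # avoid log(0)
    utterance_scores = log_probs.sum(axis=0)  # shape: (num_speakers,)
    return int(np.argmax(utterance_scores))

# Example with placeholder scores for a 50-frame file and 331 speakers
rng = np.random.default_rng(0)
fake_probs = rng.dirichlet(np.ones(331), size=50)
print(aggregate_utterance_scores(fake_probs))
```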

Right now I am using a single hidden layer with 1000 units and it works quite well, but is there a way to know how many hidden units I can add before the model becomes too complex?
Is there a way of knowing whether having more than one hidden layer is really useful?

Right now I have [24 in, 1000, 331 out]. Could I go to [24 in, 10000, 331 out]? Or would [24 in, 1000, 1000, 331 out] be better?
Is there a heuristic to build some intuition about this?

Also, how many training samples per parameter are usually needed to fit the network properly? My [24 in, 1000, 331 out] network has 64002 parameters and I am training it with 936280 audio frames as examples, so roughly 14 training examples per parameter. Is there a heuristic for knowing when this ratio becomes too small, or what ratio is ideal?
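For reference, a quick way to compare the candidate sizes is to count weights and biases directly. This is a sketch for a plain fully connected network; the exact count depends on how the network is actually parameterized, so treat the printed figures as illustrative only.

```python
def count_dense_params(layer_sizes):
    """Weights + biases of a fully connected feedforward network.

    layer_sizes: e.g. [24, 1000, 331] for 24 inputs, 1000 hidden units,
    331 outputs. Each layer contributes n_in * n_out weights plus n_out biases.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

for sizes in ([24, 1000, 331], [24, 10000, 331], [24, 1000, 1000, 331]):
    params = count_dense_params(sizes)
    ratio = 936280 / params  # training frames per parameter
    print(sizes, params, round(ratio, 1))
```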

Best Answer

In principle, there is no limit on the number of hidden layers in an artificial neural network. Such networks can be trained using "stacking" or other techniques from the deep learning literature. Yes, you could have 1000 layers, though I don't know that you'd get much benefit: in deep learning I've more typically seen somewhere between 1 and 20 hidden layers, not 1000. In practice, the number of layers is chosen on pragmatic grounds, i.e., whatever gives good accuracy with reasonable training time and without overfitting.
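If you want something more systematic than a rule of thumb, the usual approach is to treat width and depth as hyperparameters and compare candidate architectures on a held-out validation set. Here is a minimal sketch with Keras; the layer sizes, optimizer, and epoch count are placeholder choices, and x_train/y_train stand in for your MFCC frames and integer speaker labels.

```python
import numpy as np
from tensorflow import keras

# Placeholder data: replace with your MFCC frames and speaker labels.
x_train = np.random.randn(10000, 24).astype("float32")
y_train = np.random.randint(0, 331, size=10000)

# Candidate hidden-layer configurations to compare.
candidates = {
    "1x1000": [1000],
    "1x2000": [2000],
    "2x1000": [1000, 1000],
}

results = {}
for name, hidden in candidates.items():
    model = keras.Sequential(
        [keras.layers.Dense(h, activation="relu") for h in hidden]
        + [keras.layers.Dense(331, activation="softmax")]
    )
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(x_train, y_train, validation_split=0.1,
                        epochs=5, batch_size=256, verbose=0)
    results[name] = max(history.history["val_accuracy"])

print(results)  # pick the architecture with the best validation accuracy
```

Overfitting then shows up directly as a gap between training and validation accuracy, which answers the examples-per-parameter question for your particular data more reliably than any fixed ratio.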
