Solved – When is a second hidden layer needed in feed-forward neural networks?

approximation, neural networks

I'm using a feed-forward neural network to approximate a function with 24 inputs and 3 outputs. Most of the literature suggests that a single-hidden-layer neural network with a sufficient number of hidden neurons will provide a good approximation for most problems, and that adding a second or third layer yields little benefit.

However, I have optimized both a single-layer and a multi-layer neural network, and the multi-layer network performs much better. For the single-layer network I performed a sweep from 1 to 80 hidden neurons, retraining the network each time and plotting the performance; after about 30 neurons the performance converged (a sketch of that kind of sweep is shown below). For the multi-layer network I used a genetic algorithm to select the number of neurons in the first and second layers, which gave much better performance.
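For illustration, here is a minimal sketch of the single-layer sweep described above, assuming scikit-learn's MLPRegressor; the random placeholder data stands in for the real 24-input / 3-output dataset:

```python
# Minimal sketch of a hidden-neuron sweep for a single-hidden-layer network.
# X and y are random placeholders for the actual 24-input / 3-output data.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 24))          # placeholder inputs
y = rng.normal(size=(1000, 3))           # placeholder targets

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for n in range(1, 81):                   # sweep 1..80 hidden neurons
    net = MLPRegressor(hidden_layer_sizes=(n,), max_iter=2000, random_state=0)
    net.fit(X_tr, y_tr)
    scores[n] = net.score(X_te, y_te)    # held-out R^2 for this width

best = max(scores, key=scores.get)
print(f"best width: {best}, R^2 = {scores[best]:.3f}")
```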

What I would like to know is why this happened, given that most of the literature suggests one layer is enough. Is there a particular type of problem that requires more than one layer? Does this suggest that the function being approximated is discontinuous, not well defined, jagged (not smooth), or some mix of the above? Or does it suggest something else, or nothing at all? I know that a multi-layer neural network can classify data that is not linearly separable when used for classification, but here I'm using the network for function approximation.

Best Answer

From a theoretical point of view, you can approximate almost any continuous function with a single-hidden-layer neural network; this is the universal approximation theorem.
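For reference, the informal statement (Cybenko 1989; Hornik 1991) is that for any continuous target $f$ on a compact set $K$ and any tolerance $\varepsilon > 0$ there exist a finite hidden-layer width $N$, weights $w_i, b_i, v_i$, and a sigmoidal activation $\sigma$ such that

$$\sup_{x \in K}\left| f(x) - \sum_{i=1}^{N} v_i\,\sigma(w_i^{\top} x + b_i) \right| < \varepsilon.$$

The theorem only guarantees that such a network exists; it says nothing about how large $N$ must be or whether training will actually find it.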

There are, however, examples of functions that a two-hidden-layer network can approximate with a finite number of nodes, but that a one-hidden-layer network can only approximate with an infinite number of neurons.
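As a rough empirical illustration of this depth-versus-width trade-off (not one of the constructed examples from the literature), here is a minimal sketch comparing a one- and a two-hidden-layer MLPRegressor at a roughly matched parameter count on a synthetic oscillatory target; the data, target function, and layer sizes are all illustrative assumptions:

```python
# Rough comparison of depth vs. width at a similar parameter budget.
# The target function and layer sizes are illustrative choices only.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(2000, 24))
y = np.sin(5 * X @ rng.normal(size=(24, 3)))   # smooth but oscillatory 3-output target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

shallow = MLPRegressor(hidden_layer_sizes=(60,), max_iter=3000, random_state=1)
deep = MLPRegressor(hidden_layer_sizes=(30, 30), max_iter=3000, random_state=1)

for name, net in [("1 hidden layer", shallow), ("2 hidden layers", deep)]:
    net.fit(X_tr, y_tr)
    print(name, "test R^2:", round(net.score(X_te, y_te), 3))
```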

Try increasing the number of nodes in the single-layer network, or try training your single-layer network with another algorithm such as PSO (particle swarm optimization): gradient-based training can easily get stuck in a local minimum.
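PSO is not something scikit-learn provides out of the box; a simpler, common way to reduce the impact of bad local minima is to train with multiple random restarts and keep the model with the best validation score. A minimal sketch, with an arbitrary restart count and the helper name train_with_restarts chosen here purely for illustration:

```python
# Multiple-restart training: keep the best of several randomly initialized runs.
# The restart count and default hidden size are arbitrary illustrative choices.
from sklearn.neural_network import MLPRegressor

def train_with_restarts(X_tr, y_tr, X_val, y_val, hidden=(30,), n_restarts=10):
    best_net, best_score = None, -float("inf")
    for seed in range(n_restarts):
        net = MLPRegressor(hidden_layer_sizes=hidden, max_iter=2000, random_state=seed)
        net.fit(X_tr, y_tr)
        score = net.score(X_val, y_val)      # validation R^2 for this restart
        if score > best_score:
            best_net, best_score = net, score
    return best_net, best_score
```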
