Solved – Dropout effectiveness on small neural networks

dropout, neural networks

I implemented dropout in my neural network and tried to train it to approximate the function f(x) = sin(x). With plain backpropagation and no dropout regularization, it needed fewer than 10 iterations to reach a very small error.

However, when I enabled dropout, training got stuck at a fairly high MSE (0.05) and stopped improving. I used two hidden units.
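
For reference, here is a simplified sketch of the kind of setup I mean (not my actual code; the architecture, learning rate, and iteration count are illustrative): a 1-2-1 tanh network fit to sin(x), with inverted dropout on the hidden layer that can be switched on or off.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(use_dropout, p_keep=0.5, n_hidden=2, lr=0.1, epochs=5000):
    # Training data: x in [-pi, pi], target y = sin(x).
    x = np.linspace(-np.pi, np.pi, 64).reshape(-1, 1)
    y = np.sin(x)

    # Weights of a 1 -> n_hidden -> 1 network with a tanh hidden layer.
    W1 = rng.normal(0.0, 0.5, (1, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1)); b2 = np.zeros(1)

    for _ in range(epochs):
        # Forward pass.
        a = np.tanh(x @ W1 + b1)             # hidden activations
        h = a
        if use_dropout:
            # Inverted dropout: zero each unit with prob 1 - p_keep and
            # rescale survivors so the expected activation is unchanged.
            mask = (rng.random(a.shape) < p_keep) / p_keep
            h = a * mask
        y_hat = h @ W2 + b2

        # Backward pass for the mean squared error.
        err = 2.0 * (y_hat - y) / len(x)
        dW2 = h.T @ err
        db2 = err.sum(axis=0)
        dh = err @ W2.T
        if use_dropout:
            dh = dh * mask                    # gradient flows only through kept units
        dz1 = dh * (1.0 - a ** 2)             # tanh'(z) = 1 - tanh(z)^2
        dW1 = x.T @ dz1
        db1 = dz1.sum(axis=0)

        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    # Evaluate with the full network (no dropout at test time).
    y_hat = np.tanh(x @ W1 + b1) @ W2 + b2
    return np.mean((y_hat - y) ** 2)

print("MSE without dropout:", train(use_dropout=False))
print("MSE with dropout   :", train(use_dropout=True))
```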

Aside from that, are there any examples of dropout regularization on small neural networks, or any papers describing its effectiveness on small networks? That way I can check whether my implementation is correct.

Best Answer

Dropout removes units from the network entirely, each with a certain probability, on every training pass. If you're approximating a sine, then your $x$ is one-dimensional and your $f(x)$ is also one-dimensional, correct? So whenever one of your two hidden units is dropped, you're trying to approximate a sine with a single instance of the nonlinearity you've chosen, which can only ever be so good.
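
To make that concrete, here is a tiny sketch (the drop probability of 0.5 is an assumption, not something from your post) of how often a two-unit hidden layer actually survives intact on a given forward pass:

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden, p_drop, trials = 2, 0.5, 10_000

# For each simulated forward pass, count how many of the two hidden
# units escape being dropped.
kept_counts = (rng.random((trials, n_hidden)) >= p_drop).sum(axis=1)
for k in range(n_hidden + 1):
    print(f"{k} hidden unit(s) active: {np.mean(kept_counts == k):.1%} of passes")
```

With two units and a drop probability of 0.5, roughly half of the passes see only one unit and about a quarter see none at all, so much of the training signal is spent fitting a sine with one (or zero) nonlinearities.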

Generally speaking, dropout helps most in situations where you can overfit. The larger the network, the more expressive it is; so one would expect dropout to be less useful for a very small network, which is already hard to overfit.