Solved – Why does accuracy gradually increase and then suddenly drop with dropout

conv-neural-network, cross-entropy, dropout, machine-learning, tensorflow

I am building an image classification network in TensorFlow (several convolutional layers followed by fully connected layers, then a softmax cross-entropy loss, optimized with Adam at a learning rate of 1e-4).
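
Roughly this kind of setup (a minimal tf.keras sketch; the input shape, layer sizes, and number of classes are placeholders, not my actual values):

```python
import tensorflow as tf

num_classes = 10  # placeholder

model = tf.keras.Sequential([
    # Several convolutional layers
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    # Fully connected layers, with dropout at a rate below 0.25
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(num_classes),  # logits
])

# Softmax cross-entropy on the logits, optimized with Adam at learning rate 1e-4
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```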

Without dropout, I can get pretty good performance, although the loss stays quite high even as the in-sample error approaches zero.

With dropout (a dropout rate below 0.25), the accuracy gradually increases and the loss gradually decreases at first. Then the accuracy suddenly drops to 1/(number of classes), and the loss settles near a small constant.

Does anyone know why this might happen, and how I can solve this issue? Thank you.

Best Answer

When you increase dropout beyond a certain threshold, the model can no longer fit the data properly. Intuitively, a higher dropout rate injects more variance into the affected layers, which also degrades training. Dropout, like every other form of regularization, reduces model capacity; if you reduce the capacity too much, you are bound to get bad results.

The solution is to avoid such a high dropout rate. If you must keep it, lowering the learning rate and using higher momentum may help.
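
In tf.keras terms, for example (the specific values here are illustrative only, not tuned recommendations):

```python
import tensorflow as tf

# Option 1: keep Adam but lower the learning rate below the original 1e-4.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)

# Option 2: plain SGD with a high momentum term.
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9)
```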

Furthermore, be careful about where you apply dropout. It is usually ineffective in the convolutional layers, and it is very harmful to use right before the softmax layer.
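
One way to follow this in tf.keras, sketched with placeholder layer sizes: keep dropout out of the convolutional blocks, apply it only between the fully connected layers, and leave the final (pre-softmax) layer untouched.

```python
import tensorflow as tf

classifier_head = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.25),   # dropout between fully connected layers only
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),       # logits; no dropout immediately before softmax
])
```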
