Solved – Overfitting issues with convolutional neural network classifier

gradient descent, neural networks, optimization, overfitting, tensorflow

I'm working on a classifier that uses a convolutional neural network. As part of this, I'm using the AdamOptimizer for gradient descent.

When I examine the training and test results, I'm seeing significant overfitting, likely because the available data is concentrated in a few classes.

How can I reduce overfitting here? I've had a few ideas, but I'm not sure whether they're sound or how they would work:

  • Change the training data to make it more evenly distributed across classes (a weighting-based alternative is sketched after this list)
  • Modify the loss function in some way
  • Change the hyperparameters (learning rate?) for the Adam optimizer
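
For the first two ideas, one common alternative to collecting or resampling data is to weight each class's contribution to the loss so that under-represented classes count more. A minimal sketch below, using the Keras API; the label array and the weighting formula are illustrative assumptions, not taken from the question:

```python
import numpy as np

# Hypothetical integer class labels for the training set (illustrative only).
y_train = np.array([0, 0, 0, 0, 1, 2, 2])

# Weight each class inversely to its frequency so that rare classes
# contribute more to the loss than over-represented ones.
counts = np.bincount(y_train)
class_weight = {i: len(y_train) / (len(counts) * c) for i, c in enumerate(counts)}

# Keras accepts this dictionary directly when training:
# model.fit(x_train, y_train, class_weight=class_weight, epochs=10)
```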

Best Answer

Have you tried adding dropout regularization? Dropout randomly deactivates a fraction of the neurons on each training iteration, which makes the network less dependent on the activations of any specific neurons and therefore less prone to memorizing details of the images in the training set.

It's explained in more detail here: https://www.tensorflow.org/tutorials/layers. In TensorFlow it's just a layer that you can insert between two arbitrary layers.
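
For concreteness, here's a minimal sketch of a small CNN with a dropout layer using the tf.keras API; the layer sizes, dropout rate, input shape, and 10-class output are assumptions for illustration, not taken from your setup:

```python
import tensorflow as tf

# Small CNN with dropout before the output layer (all sizes are illustrative).
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    # Dropout: randomly zeroes 40% of the activations on each training step;
    # it is automatically disabled at inference time.
    tf.keras.layers.Dropout(0.4),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

A common starting point is a dropout rate around 0.3–0.5 on the fully connected layers, tuned against the gap between training and validation accuracy.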