Solved – Overfitting in object detection

Tags: conv-neural-network, machine-learning, object-detection, overfitting

I've heard different definitions of overfitting. One of them is: "if the training error continues to decrease but the testing error begins to increase at some point, it is overfitting." Another: "if the training mAP is much higher than the testing mAP, it is overfitting." I've even seen a definition saying "test score < train score is overfitting"; I'm assuming score refers to accuracy.

Which one of them actually applies specifically to object detection?

I encountered a situation where both my train mAP and test mAP increase (the test mAP very slightly; it sometimes fluctuates a little, but there is no downward trend). However, the difference between the two can grow by quite a bit, i.e. the train mAP is increasing at a faster rate than the test mAP. I've sometimes seen a difference as big as around 20%.

Should I train for more epochs until I begin to see a downward trend in the test mAP? Will there always be such a trend? Could I adjust any parameters, such as the learning rate, to make this trend appear sooner?

*Note: mAP values are averaged across 5-fold CV.

Sample of my mAP values:

[Figure: mAP vs. iterations]

Best Answer

The common definition from Wikipedia is:

In statistics and machine learning, one of the most common tasks is to fit a "model" to a set of training data, so as to be able to make reliable predictions on general untrained data. In overfitting, a statistical model describes random error or noise instead of the underlying relationship.

So, overfitting is simply sensitivity to random noise. A good video on what it is: https://www.youtube.com/watch?v=u73PU6Qwl1I.
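As a toy illustration (my own sketch, not part of the original advice; all numbers here are arbitrary), fitting polynomials of increasing degree to noisy data shows exactly this: the flexible model starts chasing the noise, so training error keeps falling while test error rises:

    import numpy as np

    rng = np.random.default_rng(0)

    # Noisy samples of a simple underlying relationship: y = sin(x) + noise
    x_train = rng.uniform(0, 3, 20)
    y_train = np.sin(x_train) + rng.normal(0, 0.2, x_train.size)
    x_test = rng.uniform(0, 3, 200)
    y_test = np.sin(x_test) + rng.normal(0, 0.2, x_test.size)

    for degree in (1, 3, 9):
        coeffs = np.polyfit(x_train, y_train, degree)
        train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
        test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
        # As degree grows, train MSE keeps shrinking while test MSE grows:
        # the polynomial starts describing the noise, i.e. overfitting.
        print(f"degree={degree}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")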

When talking about object detection, it's best to judge overfitting from the number (portion) of misdetected examples; MSE is also popular. But mean average precision or, say, an F2-score is fine too; the curve itself may just look somewhat different.
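For instance, given per-object match labels (how you match detections to ground truth is up to your pipeline; `y_true`/`y_pred` below are made-up placeholders), scikit-learn's `fbeta_score` gives the F2-score directly:

    import numpy as np
    from sklearn.metrics import fbeta_score

    # Hypothetical per-object labels after matching detections to ground truth:
    # 1 = object present/detected, 0 = absent/missed.
    y_true = np.array([1, 1, 1, 0, 1, 0, 1, 1])
    y_pred = np.array([1, 0, 1, 0, 1, 1, 1, 0])

    # Portion of ground-truth objects the detector missed.
    misdetected = np.mean((y_true == 1) & (y_pred == 0))

    # F2 weighs recall more heavily than precision (beta=2).
    f2 = fbeta_score(y_true, y_pred, beta=2)

    print(f"misdetection portion: {misdetected:.2f}, F2-score: {f2:.2f}")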


About the plot, I'd refer you to the Deep Learning book (Goodfellow et al.) for more detail. There is a technique called early stopping, meant to prevent overfitting: you should worry only once the validation error starts to increase, and your mAP is currently sitting around 56. The usual early-stopping recommendations apply to your case.

So, you're not overfitting much yet.

A better approach is to save intermediate weights every so often, so you can roll back to the state where the CV error is lowest.
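A minimal sketch of that combination, patience-based early stopping plus checkpointing the best weights (framework-agnostic: `train_one_epoch`, `evaluate_map`, and the `get_weights`/`set_weights` methods are placeholders for your own training code):

    import copy

    def train_with_early_stopping(model, max_epochs=200, patience=10):
        """Keep the weights from the epoch with the best validation mAP and
        stop once it hasn't improved for `patience` epochs in a row."""
        best_map = float("-inf")
        best_weights = None
        stale_epochs = 0

        for epoch in range(max_epochs):
            train_one_epoch(model)         # placeholder: your training step
            val_map = evaluate_map(model)  # placeholder: validation/CV mAP

            if val_map > best_map:
                best_map = val_map
                best_weights = copy.deepcopy(model.get_weights())  # checkpoint
                stale_epochs = 0
            else:
                stale_epochs += 1
                if stale_epochs >= patience:
                    break  # validation stopped improving: early stop

        model.set_weights(best_weights)    # roll back to the best state seen
        return best_map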


The real curve might look like this (especially when you're using cross-validation):

[Figure: training vs. validation error curves under cross-validation]

Source: http://www.byclb.com/TR/Tutorials/neural_networks/ch10_1.htm


If you increase the learning rate (bigger steps), learning can get faster, but depending on your data the model might skip over local/global minima, so it could end up taking even more time. The overall recommendation is to try different rates, e.g. 0.1, 0.01, 0.001.

Conversely, decreasing the learning rate (smaller steps) might help your model settle into better optima. If you use SGD, you can speed the process up considerably by applying momentum.
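In PyTorch, for example, momentum is a one-line change to the optimizer, and the learning-rate sweep is just a loop (the `model` and the surrounding training loop are assumed to exist elsewhere):

    import torch

    # Only the optimizer setup is shown; `model` and the training loop
    # are assumed to be defined elsewhere.
    for lr in (0.1, 0.01, 0.001):
        optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        # ... train for a few epochs with this optimizer and compare
        # validation mAP across the three runs ...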

Hyperparameters you might also try changing to make training faster (see the sketch after this list):

  • fewer neurons per hidden layer (less width), more hidden layers (more depth)
  • for a CNN: reduce the size of the receptive field (smaller kernels)
  • for a CNN: apply pooling

Keep in mind that all of this comes at the risk of losing important details about the input.
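As an illustrative PyTorch sketch (every layer size here is made up), small kernels, pooling, and a narrow hidden layer keep both the per-layer receptive field and the parameter count modest:

    import torch.nn as nn

    # Illustrative only: 3x3 kernels (small receptive field per layer),
    # pooling to halve spatial resolution, and a narrow hidden layer.
    # Assumes 3x32x32 inputs and 10 output classes.
    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),   # 32x32 -> 16x16
        nn.Conv2d(16, 32, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),   # 16x16 -> 8x8
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, 64),  # narrow hidden layer
        nn.ReLU(),
        nn.Linear(64, 10),
    )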