Solved – Why does cross entropy loss for validation dataset deteriorate far more than validation accuracy when a CNN is overfitting

Tags: conv-neural-network, cross-entropy, neural-networks, overfitting

I have noticed that the cross-entropy loss on the validation dataset deteriorates after a certain number of epochs when training CNNs or MLPs. This is, of course, a sign that the network is overfitting. But why don't we see a corresponding fall in the validation accuracy? Many times I have noticed that the validation accuracy keeps increasing or remains constant. Sometimes the cross-entropy loss even deteriorates to a value higher than it was at the start of training, despite no significant deterioration in validation accuracy.

Best Answer

This can happen because classification accuracy is a discretized result. Consider a binary classification (e.g. human vs. animal), i.e. an output layer with one neuron, and a decision threshold of, say, 0.5. At some point in training, your model might separate the validation set very distinctly, which is a good sign: for humans it outputs values around 0-0.1, and for animals values around 0.9-1. Suppose it starts overfitting afterwards: those predictions can drift toward 0.5 from each side, increasing your validation loss while your validation accuracy stays the same. The accuracy may even keep increasing for a while, even as the model's confidence on individual cases decreases. In this scenario, the validation loss ending up higher than its initial value is not impossible, though it is a rather special case.
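A small numeric sketch of the argument above, with made-up prediction values for illustration: the same four validation examples are classified correctly in both scenarios, yet the mean cross-entropy is several times larger once the predictions drift toward the 0.5 threshold.

```python
import math

def binary_cross_entropy(p, y):
    # loss for one example: y is the true label (0 or 1), p the predicted P(y=1)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def mean_loss(preds, labels):
    return sum(binary_cross_entropy(p, y) for p, y in zip(preds, labels)) / len(labels)

def accuracy(preds, labels, threshold=0.5):
    # discretized metric: only which side of the threshold a prediction falls on matters
    return sum((p > threshold) == bool(y) for p, y in zip(preds, labels)) / len(labels)

labels = [0, 0, 1, 1]                    # 0 = human, 1 = animal (hypothetical data)
confident = [0.05, 0.10, 0.90, 0.95]     # well-separated predictions early in training
drifting = [0.40, 0.45, 0.55, 0.60]      # same examples after overfitting sets in

print(accuracy(confident, labels), mean_loss(confident, labels))  # 1.0, ~0.078
print(accuracy(drifting, labels), mean_loss(drifting, labels))    # 1.0, ~0.554
```

Accuracy is identical (every prediction is still on the correct side of 0.5), but the loss grows roughly sevenfold, which is exactly the divergence between the two validation curves described in the question.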