This is a good question that unfortunately went unanswered for a long time. It seems a partial answer was given here a couple of months after you asked, arguing essentially that correlation is useful when the outputs are very noisy, and MSE otherwise. First of all, let's look at the formulas for both.
$$MSE(y,\hat{y}) = \frac{1}{n} \sum_{i=1}^n(y_i - \hat{y_i})^2$$
$$R(y, \hat{y}) = \frac{\sum_{i=1}^n (y_i - \bar{y})(\hat{y_i} - \hat{\bar{y}})}{\sqrt{\sum_{i=1}^n (y_i - \bar{y})^2} \sqrt{\sum_{i=1}^n (\hat{y_i} - \hat{\bar{y}})^2}}$$
A few things to note: in the case of linear regression we know that $\hat{\bar{y}} = \bar{y}$ because of the unbiasedness of the regressor (with an intercept, the residuals sum to zero in-sample), so the formula simplifies a little, but in general we can't make this assumption about ML algorithms. More broadly, it is interesting to think of the scatter plot in $\mathbb{R}^2$ of the pairs $\{(y_i, \hat{y_i})\}$: correlation tells us how strong the linear relationship between the two is in this plot, while MSE tells us how far, on average, the points fall from the diagonal $y = \hat{y}$. Looking at the counterexamples on the Wikipedia page for correlation, you can see there are many relationships between the two that won't be captured.
I think correlation generally tells you similar things to $R^2$, but with directionality, so correlation is somewhat more descriptive in that sense. Under another interpretation, $R^2$ doesn't rely on a linearity assumption and simply tells us the percentage of variation in $y$ that is explained by our model. In other words, it compares the model's predictions to the naive baseline of guessing the mean for every point. The formula for $R^2$ is:
$$R^2(y,\hat{y}) = 1 - \frac{\sum_{i=1}^n (y_i-\hat{y_i})^2}{\sum_{i=1}^n (y_i-\bar{y})^2}$$
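To make the comparison with the naive mean baseline concrete, here is a small sketch (the numbers are made up for illustration) computing $R^2$ directly from the formula and checking it against scikit-learn's `r2_score`:

```python
import numpy as np
from sklearn.metrics import r2_score

y = np.array([3.0, 1.0, 4.0, 1.5, 5.0])       # true values
y_hat = np.array([2.5, 1.2, 3.8, 2.0, 4.6])   # model predictions

sse = np.sum((y - y_hat) ** 2)        # error of the model
sst = np.sum((y - y.mean()) ** 2)     # error of always guessing the mean
r2_manual = 1 - sse / sst

print(r2_manual, r2_score(y, y_hat))  # the two values agree
```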
So how does $R$ compare to $R^2$? It turns out that $R$ is more robust to rescaling one of its inputs: $R^2$ is homogeneous of degree 0 only in both inputs jointly (scaling $y$ and $\hat{y}$ together leaves it unchanged), whereas $R$ is homogeneous of degree 0 in either input separately. It's a little less clear what this implies for machine learning, but it might mean that the model class for $\hat{y}$ can be a bit more flexible under correlation. That said, under some additional assumptions the two measures are equal, and you can read more about it here: http://www.win-vector.com/blog/2011/11/correlation-and-r-squared/.
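Here is a quick numerical check of that claim, as a sketch using SciPy's `pearsonr` and scikit-learn's `r2_score` (the synthetic data is made up for illustration):

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
y = rng.normal(size=100)                     # true values
y_hat = y + rng.normal(scale=0.3, size=100)  # noisy predictions

# Rescaling the predictions alone leaves R unchanged but changes R^2.
for scale in [1.0, 2.0, 10.0]:
    r, _ = pearsonr(y, scale * y_hat)
    r2 = r2_score(y, scale * y_hat)
    print(f"scale={scale}: R={r:.3f}, R^2={r2:.3f}")
```

Rescaling $\hat{y}$ alone leaves $R$ untouched, while $R^2$ degrades and can even go negative.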
Finally, one last important thing to note is that $R$ and $R^2$ do not measure goodness of fit around the $y=\hat{y}$ line. It is possible (although odd) for a predictor to be linearly shifted away from the $y=\hat{y}$ line while still having a correlation of one (and hence, under the squared-correlation definition, an $R^2$ of one), yet the predictions would still be "bad". In this case MSE would be more informative than $R^2$ for picking the better predictor, but perhaps this is more of a pathological case than a genuine problem with using $R$ and $R^2$ as metrics.
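To see this pathological case concretely (again a sketch with SciPy and scikit-learn):

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import mean_squared_error

y = np.linspace(0.0, 1.0, 50)
y_hat = y + 5.0  # perfectly correlated with y, but shifted off the y = y_hat line

r, _ = pearsonr(y, y_hat)
mse = mean_squared_error(y, y_hat)
print(r, mse)    # r is (essentially) 1.0, while the MSE is 25.0
```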
You could do A, but it is not recommended. The steps most often employed are described on p. 245 of the text (p. 264/764 in the PDF):
https://web.stanford.edu/~hastie/Papers/ESLII.pdf
An important caveat: the book recommends that

"Often a 'one-standard error' rule is used with cross-validation, in which we choose the most parsimonious model whose error is no more than one standard error above the error of the best model."

I have also seen papers where the minimum-error model was chosen. Neither method is right or wrong; both are heuristics. The one-standard-error rule they describe is motivated by the bias-variance tradeoff.
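As a sketch of how that rule plays out in code (the `cv_mean` and `cv_se` arrays are hypothetical cross-validation results, indexed from the most parsimonious model to the most complex):

```python
import numpy as np

# Hypothetical CV results: mean error and standard error of the CV
# estimate for each model, ordered from simplest to most complex.
cv_mean = np.array([0.42, 0.31, 0.27, 0.26, 0.265, 0.27])
cv_se   = np.array([0.03, 0.02, 0.02, 0.02, 0.02,  0.02])

best = np.argmin(cv_mean)                # minimum-error model
threshold = cv_mean[best] + cv_se[best]  # one standard error above the best

# The most parsimonious (lowest-index) model within the threshold.
one_se_choice = np.min(np.where(cv_mean <= threshold)[0])

print(best, one_se_choice)  # minimum-error choice vs. one-SE-rule choice
```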
The number of iterations is typically not relevant; the convergence criterion is. Is there a reason you care about the number of iterations?
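For example, a typical training loop stops on a tolerance rather than at a fixed iteration count. A generic sketch (the names `step`, `loss`, `tol`, and `max_iter` are all hypothetical; `max_iter` acts only as a safety cap):

```python
def minimize(step, loss, x0, tol=1e-8, max_iter=10_000):
    """Iterate `step` until the loss stops improving by more than `tol`."""
    x, prev = x0, loss(x0)
    for _ in range(max_iter):  # safety cap, not the real stopping rule
        x = step(x)
        cur = loss(x)
        if abs(prev - cur) < tol:  # the convergence criterion
            break
        prev = cur
    return x
```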
Best Answer
The curve shows symptoms of overfitting. Even though your ROC AUC score increased during training, it is still quite low compared to the score you get on the training data: your training ROC AUC is near 0.85, while for validation it's 0.6. That's a very big difference.
Also, I think that during training the predicted probabilities are getting closer and closer to 0.5 (or maybe to some other constant value; you can check this very easily). That would explain the increase in the binary cross-entropy. The ROC AUC score does not depend on the exact probabilities; the ranking of the predictions is what matters.
Here is one simple example
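A minimal sketch, assuming scikit-learn's `roc_auc_score` and `log_loss`; the exact probability values are only illustrative:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, log_loss

y_true = np.array([0, 0, 0, 1, 1, 1])

# Confident predictions: probabilities far away from 0.5.
confident = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
# Nearly constant predictions: everything close to 0.5,
# but the ordering of the two classes is still perfect.
near_half = np.array([0.48, 0.49, 0.499, 0.501, 0.51, 0.52])

for name, pred in [("confident", confident), ("near 0.5", near_half)]:
    auc = roc_auc_score(y_true, pred)
    ce = log_loss(y_true, pred)
    print(f"{name:10s} ROC AUC: {auc:.2f}  log loss: {ce:.4f}")
```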
That's the output that you should see
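```
confident  ROC AUC: 1.00  log loss: 0.2284
near 0.5   ROC AUC: 1.00  log loss: 0.6728
```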
The second set of predictions looks nearly random, but it has a perfect ROC AUC score, because a 0.5 threshold can perfectly separate the two classes even though the probabilities are very close to each other. The only thing that matters is that, if you order the probabilities (rank them) from lowest to highest, the two classes can be separated with a single threshold.