I classified some medical images. The distribution of the dataset is:
494 Train Abnormal
469 Train Normal
64 Test Abnormal
37 Test Normal
84 Val Abnormal
37 Val Normal
…
My training result (with a ViT) is:
loss: 0.2714 – accuracy: 0.9102 – val_loss: 0.2624 – val_accuracy: 0.9196
and test result is:
              precision    recall  f1-score   support

           0       0.57      0.60      0.58        47
           1       0.46      0.43      0.44        37

    accuracy                           0.52        84
   macro avg       0.51      0.51      0.51        84
weighted avg       0.52      0.52      0.52        84
- So my question is: is the poor test performance caused by imbalanced data, or should I be looking for something else?
I know that ~1,000 images is not much for deep learning, but I have to complete this training with them. I have also implemented data augmentation.
Best Answer
There might be a distribution difference between the training/validation data and the test data, which would explain the gap between a ~0.92 validation accuracy and a 0.52 test accuracy. Another potential cause is heavy tuning on the validation set: if you repeatedly selected hyperparameters or checkpoints based on validation accuracy, the model can effectively overfit to the validation set, and the validation score stops being a reliable estimate of test performance.
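One way to reduce the risk of overfitting to a single validation split is stratified k-fold cross-validation, so that model selection is averaged over several splits instead of one. A minimal sketch with scikit-learn is below; the features, labels, and the logistic-regression stand-in are placeholders (not the asker's ViT pipeline), and the class counts simply mirror the training split sizes from the question:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Placeholder data standing in for extracted image features and labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(963, 16))          # 963 = 494 + 469 training images
y = np.array([0] * 494 + [1] * 469)     # 0 = abnormal, 1 = normal (arbitrary)

# Stratified folds preserve the class ratio in every train/val split,
# so each fold's validation score is measured on the same class balance.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, val_idx in skf.split(X, y):
    clf = LogisticRegression(max_iter=1000)   # stand-in for the real model
    clf.fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[val_idx], clf.predict(X[val_idx])))

# The mean and spread across folds is a more honest estimate than a
# single validation split that the model may have been tuned against.
print(f"CV accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```

If the cross-validated estimate is much lower than the single-split validation accuracy, that points to overfitting on the validation set rather than a train/test distribution shift.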