I have a dataset that I split into 80% training and 20% validation sets (38,140 images for training, 9,520 for validation). The model I am training is a fairly deep (~45-layer) convolutional neural network.
I got the following results in the first epochs of training:
Epoch 1: train loss: 1041.52 - validation loss: 1045.89
Epoch 2: train loss: 750.78 - validation loss: 749.95
Epoch 3: train loss: 425.88 - validation loss: 423.35
Epoch 4: train loss: 320.29 - validation loss: 319.35
Epoch 5: train loss: 305.41 - validation loss: 305.07
As can be seen, after the first epoch the validation loss is slightly lower than the training loss. Is this something I should worry about, or is it an indicator of good convergence and generalization?
Best Answer
In your case the difference is tiny (< 1%), so I am quite sure this is not a problem. The training set may simply contain more difficult images than the validation set, which would give a slightly higher training loss. Another common cause is regularization such as dropout, which is active during training but disabled during validation, inflating the training loss relative to the validation loss.
I would interpret this as good generalization without overfitting, plus a little random variation between the training and validation sets.
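To make the "< 1%" claim concrete, here is a quick sketch that computes the relative train/validation gap per epoch from the numbers you reported (the sign convention is my own: positive means validation is worse than training):

```python
# Losses copied from the question, epochs 1-5.
train = [1041.52, 750.78, 425.88, 320.29, 305.41]
val = [1045.89, 749.95, 423.35, 319.35, 305.07]

for epoch, (t, v) in enumerate(zip(train, val), start=1):
    # Relative gap in percent: positive -> val worse, negative -> val better.
    gap = (v - t) / t * 100
    print(f"Epoch {epoch}: gap = {gap:+.2f}%")
```

Every epoch lands well within ±1%, which is the kind of fluctuation you would expect from random variation between the two splits alone.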
For more possible reasons, you can check this excellent answer.