Solved – Expected test error

Tags: cross-validation, deep-learning, machine-learning, mathematical-statistics, overfitting

According to Deep Learning (Ian Goodfellow, Yoshua Bengio, and Aaron Courville, p. 111, available online):

1. Assume that the train and test sets are identically distributed (the i.i.d. assumption).

2. Then:

"We sample the training set, then use it to choose the parameters to reduce training set error, then sample the test set. Under this process, the expected test error is greater than or equal to the expected value of training error."

Could you provide a mathematical proof of this statement?
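For concreteness, here is one way to write the quoted claim in notation. This formalization is my own reading of the passage, not the book's notation; the loss L, the sample size m, and the error functional err are introduced here for illustration:

```latex
% The fitted parameters minimize error on the sampled training set:
\[
  \hat{\theta}(D_{\text{train}})
  \;=\; \arg\min_{\theta}\;
  \frac{1}{m}\sum_{(x,y)\in D_{\text{train}}} L\bigl(f_{\theta}(x),\, y\bigr)
\]
% The claim: averaging over independent draws of both sets from the
% same data-generating distribution p, the expected test error is at
% least the expected training error.
\[
  \mathbb{E}_{D_{\text{train}},\, D_{\text{test}} \sim p}
  \Bigl[\operatorname{err}\bigl(\hat{\theta}(D_{\text{train}}),\, D_{\text{test}}\bigr)\Bigr]
  \;\ge\;
  \mathbb{E}_{D_{\text{train}} \sim p}
  \Bigl[\operatorname{err}\bigl(\hat{\theta}(D_{\text{train}}),\, D_{\text{train}}\bigr)\Bigr]
\]
```

Note that the inequality is about expectations over repeated sampling, not about any single train/test split.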

Best Answer

There is no such proof; it's just an intuitive statement. A model typically predicts training samples better than test samples, because it learns from the training data, while the test data is something the model has not seen before. It is still possible for the test error to be lower than the training error on a given split, especially when the samples are small.
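That said, the averaged version of the statement is easy to check empirically. Below is a minimal simulation sketch (assuming NumPy and scikit-learn are available; the linear model and the data-generating process are arbitrary illustrative choices, not anything from the book). Over many repetitions, the average test error exceeds the average training error, even though individual runs can go the other way:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n_repeats, n_train, n_test = 2000, 20, 20
train_errs, test_errs = [], []

for _ in range(n_repeats):
    # Sample train and test sets from the same data-generating
    # process (the i.i.d. assumption from the quoted passage).
    X_train = rng.normal(size=(n_train, 5))
    y_train = X_train @ np.ones(5) + rng.normal(size=n_train)
    X_test = rng.normal(size=(n_test, 5))
    y_test = X_test @ np.ones(5) + rng.normal(size=n_test)

    # Fit the parameters to reduce training set error.
    model = LinearRegression().fit(X_train, y_train)
    train_errs.append(mean_squared_error(y_train, model.predict(X_train)))
    test_errs.append(mean_squared_error(y_test, model.predict(X_test)))

print(f"mean train MSE: {np.mean(train_errs):.3f}")
print(f"mean test MSE:  {np.mean(test_errs):.3f}")

# Individual repeats can have test error below training error,
# even though the averages satisfy E[test] >= E[train].
frac = np.mean(np.array(test_errs) < np.array(train_errs))
print(f"fraction of runs with test < train: {frac:.2f}")
```

With these small samples, a noticeable fraction of runs have test error below training error, which is exactly the point: the inequality holds for the expectations, not for every realization.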