Machine Learning – Can Test Error Be Lower Than Training Error in Python?

machine learning, python

Is it possible to have test error lower than training error?

I have a classification problem with 2000 samples: 500 positive and 1500 negative. I split the data into 70% training data and 30% test data.

I ran a random forest with 200 estimators and cv=10. I did this several times, compared the recall and precision scores, and noticed that the scores on my test set are significantly better. Is this possible?
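
A minimal sketch of that setup, assuming scikit-learn (the 70/30 split, class balance, 200 estimators, and cv=10 come from the question; the synthetic dataset merely stands in for the real data):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import precision_score, recall_score

# Illustrative stand-in data: 2000 samples, roughly 25% positive
X, y = make_classification(n_samples=2000, weights=[0.75, 0.25],
                           random_state=0)

# 70% train / 30% test, stratified to preserve the class ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)

# 10-fold cross-validated recall on the training portion
cv_recall = cross_val_score(clf, X_train, y_train, cv=10, scoring="recall")
print("CV recall:      %.3f +/- %.3f" % (cv_recall.mean(), cv_recall.std()))

# Fit on the full training set, then score the held-out test set
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Test precision: %.3f" % precision_score(y_test, y_pred))
print("Test recall:    %.3f" % recall_score(y_test, y_pred))
```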

Best Answer

Totally possible, though it probably means you aren't training as much as you could be. Typically, when you plot test and training error over training time, you get a graph like this:

[Figure: training vs. test error over training time. Credit: Daniel Nee]

The test/train stages can be (very broadly) categorized as follows:

  • At first, the test and training errors are noisy but strongly correlated. This means you haven't yet fit the problem well (you are underfitting).
  • As training goes on, both errors decrease, but the training error decreases more quickly than the test error. This means you are approaching a good level of fit.
  • Eventually the test error starts to increase while the training error continues to decrease. At this point you have officially started to overfit.

There are many ways to deal with overfitting if it becomes a problem, but your goal in picking an algorithm and training it should be to hit the lowest test error, which typically happens somewhere in the second stage.
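
One way to see where you sit on that curve is to plot training and cross-validated error against a capacity parameter. A rough sketch using scikit-learn's validation_curve, with max_depth as the capacity knob (the parameter choice and range here are illustrative, not taken from the question):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import validation_curve

X, y = make_classification(n_samples=2000, weights=[0.75, 0.25],
                           random_state=0)

# Deeper trees fit the training data more strongly, so sweeping
# max_depth traces out the underfit -> good fit -> overfit stages.
depths = np.arange(1, 16)
train_scores, test_scores = validation_curve(
    RandomForestClassifier(n_estimators=200, random_state=0),
    X, y, param_name="max_depth", param_range=depths,
    cv=10, scoring="accuracy")

# Convert mean accuracy per depth into error rates and plot both curves
plt.plot(depths, 1 - train_scores.mean(axis=1), label="training error")
plt.plot(depths, 1 - test_scores.mean(axis=1), label="CV (test) error")
plt.xlabel("max_depth")
plt.ylabel("error rate")
plt.legend()
plt.show()
```

If your test error sits below your training error, you would expect to find yourself on the left side of such a plot, where both curves are still falling.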

If your test accuracy is higher than your train accuracy, you are likely still very far left on the training graph. There are three main options for resolving that problem:

  • Use an algorithm better suited to small datasets (hard to say without knowing more about your problem, but Naive Bayes is usually a good small-data choice; see the sketch after this list).
  • Change your model's hyperparameters so it fits the training set more strongly (for example, increasing the learning rate in a gradient-based model, or letting the trees grow deeper in a random forest).
  • Get more data.
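
A quick sketch of the first two options (the Naive Bayes baseline and the capacity settings below are illustrative starting points, not tuned recommendations):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, weights=[0.75, 0.25],
                           random_state=0)

# Option 1: a simpler model that often does well on small datasets
nb_recall = cross_val_score(GaussianNB(), X, y, cv=10, scoring="recall")

# Option 2: loosen the forest's capacity constraints so each tree can
# fit the training set more strongly (these are the relevant knobs;
# max_depth=None and min_samples_leaf=1 let trees grow fully)
rf = RandomForestClassifier(n_estimators=200, max_depth=None,
                            min_samples_leaf=1, random_state=0)
rf_recall = cross_val_score(rf, X, y, cv=10, scoring="recall")

print("Naive Bayes CV recall:   %.3f" % nb_recall.mean())
print("Random forest CV recall: %.3f" % rf_recall.mean())
```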