- F-measure: Should I sum the quantities (i.e., TP, FP, FN) over the N x K runs and compute F-measure using these sums?
Yes! Calculate one F1 score for each run of cross-validation and average over the N runs. This is also a great opportunity to see how this approach and calculating an F1 score for each fold and averaging over the folds differ from each other; a sketch comparing two of the aggregation strategies follows below.
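Whichever aggregation you pick, it only takes a few lines to compute the alternatives side by side. Below is a minimal sketch, assuming scikit-learn, with placeholder data, model, and fold counts (nothing here is taken from the question); it compares the F1 averaged over folds with the F1 computed from TP/FP/FN pooled over all the folds. Averaging per repeat rather than per fold would be structured the same way.

```python
# Sketch (placeholders only): compare F1 averaged over folds with F1 computed
# from TP/FP/FN pooled over all N x K folds of repeated k-fold CV.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import RepeatedStratifiedKFold

X, y = make_classification(n_samples=500, random_state=0)   # placeholder data
clf = LogisticRegression(max_iter=1000)                     # placeholder model

cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=5, random_state=0)  # K=10, N=5

fold_f1 = []
tp = fp = fn = 0
for train_idx, test_idx in cv.split(X, y):
    clf.fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    fold_f1.append(f1_score(y[test_idx], pred))
    tn_i, fp_i, fn_i, tp_i = confusion_matrix(y[test_idx], pred).ravel()
    tp, fp, fn = tp + tp_i, fp + fp_i, fn + fn_i

print("mean of per-fold F1:", np.mean(fold_f1), "+/-", np.std(fold_f1, ddof=1))
print("F1 from pooled counts:", 2 * tp / (2 * tp + fp + fn))
```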
- For accuracy, should we sum the accuracy for each of the N x K runs and simply take their average for the overall estimate?
Also yes! Good approach. In applications, it is sometimes not about being 100% correct but about applying methods and techniques according to their ease of use.
However, whenever you are reporting the mean, please also report the variance or the standard deviation.
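For example, reporting the mean together with the standard deviation of the per-fold accuracy is only a couple of lines; here is a sketch reusing the placeholder `X`, `y`, and `clf` from the previous snippet.

```python
# Sketch: report both the mean and the standard deviation of per-fold accuracy,
# reusing the placeholder X, y and clf from the previous sketch.
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=5, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print(f"accuracy: {scores.mean():.3f} +/- {scores.std(ddof=1):.3f} over {len(scores)} folds")
```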
I think if the folds are of equal size, then both methods 1 and 2 will give the same mean value (or very similar if the folds are only of approximately the same size).
Personally I would tend to use the first method, because I then also have a sensible estimate of the variance of the performance metric, which means I can assess whether a difference in performance between two models is at least statistically significant.
To expand a bit more on that, say you have two models $M_1$ and $M_2$; you can then use cross-validation to estimate the performance of each model and pick the better one. However, if you split the data into folds in a different way (different random seed), the performance estimates are likely to be different, and could even be different enough for the ranking of the two models to be reversed. So we would like some method to see whether the difference in performance is large compared to the difference caused by the random partitioning of the data to form the folds. We can do that for method 1, as we have a performance estimate from each fold, and those estimates differ only in how the data were partitioned, so their variance measures exactly that variability. For the second method, we only have one performance estimate, so we have no corresponding estimate of the uncertainty due to the sampling of the data to form the folds. This is an excellent reason to prefer method 1 over method 2; the sketch below illustrates the paired comparison it makes possible.
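Here is a minimal sketch of that idea, again with placeholder data and stand-in models for $M_1$ and $M_2$, assuming scikit-learn and SciPy. Note that the naive paired t-test used here ignores the dependence between folds discussed in the next paragraph, so its p-value should be read as optimistic.

```python
# Sketch: method 1 gives a score per fold for each model on identical folds,
# so the models can be compared in a paired fashion. The naive paired t-test
# below ignores the dependence between folds, so its p-value is optimistic.
from scipy.stats import ttest_rel
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

X, y = make_classification(n_samples=500, random_state=0)   # placeholder data
m1 = LogisticRegression(max_iter=1000)                      # stand-in for M_1
m2 = RandomForestClassifier(random_state=0)                 # stand-in for M_2

cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=5, random_state=0)
s1 = cross_val_score(m1, X, y, cv=cv, scoring="accuracy")
s2 = cross_val_score(m2, X, y, cv=cv, scoring="accuracy")

d = s1 - s2                                                  # paired per-fold differences
t_stat, p_value = ttest_rel(s1, s2)
print(f"mean difference {d.mean():+.3f} +/- {d.std(ddof=1):.3f}, naive paired t-test p = {p_value:.3f}")
```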
Note the variance discussed in the previous section is not the variance we would see if we had a completely new training set and new test set. The variance of the cross-validation is likely to be optimistically narrow because the training sets in each fold are not independent, but share some examples (and the test data in some folds are in the training sets of others, so that isn't completely independent either). This lack of independence has an impact on estimating the uncertainty of the performance estimate; see Nadeau and Bengio (2003).
Nadeau, C., Bengio, Y. Inference for the Generalization Error. Machine Learning 52, 239–281 (2003). https://doi.org/10.1023/A:1024068626366
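For reference, the correction they propose is usually summarised as the "corrected resampled t-test": with $J$ paired fold-level differences $d_j$ between two models, and $n_{\text{test}}$ test and $n_{\text{train}}$ training examples per fold, the test statistic is

$$t = \frac{\bar{d}}{\sqrt{\left(\frac{1}{J} + \frac{n_{\text{test}}}{n_{\text{train}}}\right)\hat{\sigma}_d^2}}, \qquad \hat{\sigma}_d^2 = \frac{1}{J-1}\sum_{j=1}^{J}\left(d_j - \bar{d}\right)^2,$$

i.e. the usual $\hat{\sigma}_d^2/J$ variance term is inflated by $n_{\text{test}}/n_{\text{train}}$ to account for the overlapping training sets. This is my summary of that line of work rather than a quote, so do check the paper for the exact form.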
The link to the related question means I can comment on method 3. I think I agree with @FrankHarrell in his answer that this is not an appropriate method. The point of cross-validation is to estimate not the performance of a particular model, but the performance of a method for producing a model. So the performance must be measured over the test folds using a model fitted with that method. If you average the outputs of the models trained in the different folds, you are not evaluating the model that you will deploy in operation, as the deployed model will not have that averaging step.
If you want to do something like that, I would opt for Bagging, where you are explicitly forming a committee or ensemble model (which has the averaging) and you can use the "out-of-bag" estimator to get a good performance estimate for the ensemble.
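For instance, with scikit-learn the out-of-bag estimate comes almost for free; the following is only a sketch with placeholder data and a placeholder base learner.

```python
# Sketch: a bagged ensemble scored with its own out-of-bag predictions, i.e.
# each example is scored only by the members that did not see it in their
# bootstrap sample. Data and base learner are placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

bag = BaggingClassifier(
    DecisionTreeClassifier(),   # placeholder base learner
    n_estimators=200,
    oob_score=True,             # compute the out-of-bag accuracy while fitting
    random_state=0,
)
bag.fit(X, y)
print("out-of-bag accuracy estimate:", bag.oob_score_)
```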
I have to say, I have never really liked repeated k-fold cross-validation. It seems to me that there is more symmetry in just having 100 random 90%/10% training/test splits than in 10 lots of 10-fold cross-validation. However, I don't know of a good statistical reason to prefer one over the other.
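If you want to try both, they are easy to set up side by side; here is a sketch with placeholder data and model, assuming scikit-learn.

```python
# Sketch: the two resampling schemes mentioned above, each producing 100
# performance estimates. Placeholder data and model.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedKFold, ShuffleSplit, cross_val_score

X, y = make_classification(n_samples=500, random_state=0)
clf = LogisticRegression(max_iter=1000)

schemes = {
    "100 random 90/10 splits": ShuffleSplit(n_splits=100, test_size=0.1, random_state=0),
    "10 x 10-fold CV": RepeatedKFold(n_splits=10, n_repeats=10, random_state=0),
}
for name, cv in schemes.items():
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std(ddof=1):.3f}")
```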
Best Answer
There is definitely a problem with selecting a case where the mean AUC is the best. You should instead report how you set up cross-validation, how many times you ran it, and include some summary statistics of the AUCs you obtained (maybe include a histogram, too).
Cross-validation gives you an estimate of how your model would perform if you trained it on one random sample from your distribution (of a size similar to your training folds) and evaluated it on another random sample from your distribution. The variability in AUCs you observe, depending on which examples make it into the training/test sets, shows that your model is somewhat sensitive to your sample. The variance of the AUCs gives you a sense of how sensitive it is.
To show why selecting the case with the best AUC is wrong, consider a situation where your model is extremely sensitive to the training/test split. It sounds like a bad model, right? But given the wide variance, on some sample it will work really, really well, purely by chance. You can then see how reporting just that figure would be really misleading.
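In code, reporting the whole distribution rather than the best fold might look like the following sketch, with placeholder data and model, assuming scikit-learn and matplotlib.

```python
# Sketch: summarise the spread of per-fold AUCs instead of picking the best.
# Placeholder data and model.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

X, y = make_classification(n_samples=500, random_state=0)
clf = LogisticRegression(max_iter=1000)

cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
aucs = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")

print(f"AUC over {len(aucs)} folds: mean {aucs.mean():.3f}, sd {aucs.std(ddof=1):.3f}, "
      f"min {aucs.min():.3f}, max {aucs.max():.3f}")
plt.hist(aucs, bins=20)          # the histogram suggested above
plt.xlabel("fold AUC")
plt.ylabel("count")
plt.show()
```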