Solved – Comparison of machine learning algorithms

machine-learning, mathematical-statistics, sensitivity-specificity, statistical-significance

Suppose I have taken the 8 machine learning algorithms that researchers use most frequently and applied them to 8 publicly available datasets.

I get results like:
Random forest works best on 1 dataset.
SVM performs best on 2 datasets.
How can I conclude which machine learning algorithm performs best overall?

Thanks in advance.

Best Answer

For classification algorithms, this would be a good start: Statistical Comparisons of Classifiers over Multiple Data Sets (Demšar, 2006, Journal of Machine Learning Research).

To summarize this excellent paper: perform a Friedman test to determine whether there is any significant difference between the classifiers. If there is, follow up with an appropriate post-hoc test:

  • to compare all classifiers with each other: Nemenyi test
  • to compare one control classifier with all others: Bonferroni-Dunn test

Both post-hoc tests can be visualized neatly in so-called critical difference diagrams.
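
As a rough illustration of the procedure above, here is a minimal Python sketch of the Friedman test followed by the Nemenyi critical difference. It assumes an 8 x 8 table of accuracies (rows are datasets, columns are algorithms); the numbers are random placeholders, not real results, and the variable names are made up for this example. The constant 3.031 is the Nemenyi critical value for 8 classifiers at alpha = 0.05 as tabulated in the paper; a different number of classifiers or a different alpha needs a different value.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Placeholder accuracies: rows = datasets, columns = algorithms.
# These are random numbers for illustration only; substitute your own results.
rng = np.random.default_rng(0)
n_datasets, n_algorithms = 8, 8
scores = rng.uniform(0.60, 0.95, size=(n_datasets, n_algorithms))

# Friedman test: one argument per algorithm, each holding its scores
# across all datasets.
statistic, p_value = friedmanchisquare(*scores.T)
print(f"Friedman statistic = {statistic:.3f}, p = {p_value:.3f}")

# Rank the algorithms within each dataset (rank 1 = highest accuracy),
# then average the ranks over datasets.
ranks = np.apply_along_axis(rankdata, 1, -scores)
avg_ranks = ranks.mean(axis=0)
print("Average ranks:", np.round(avg_ranks, 2))

# Nemenyi critical difference: CD = q_alpha * sqrt(k (k + 1) / (6 N)).
# q_alpha = 3.031 is the tabulated value for k = 8 at alpha = 0.05 (assumption:
# change it if you compare a different number of algorithms).
q_alpha = 3.031
cd = q_alpha * np.sqrt(n_algorithms * (n_algorithms + 1) / (6.0 * n_datasets))
print(f"Critical difference = {cd:.3f}")

# Two algorithms differ significantly if their average ranks differ by more than CD.
# Only meaningful if the Friedman test rejected the null hypothesis first.
significant_pairs = [
    (i, j)
    for i in range(n_algorithms)
    for j in range(i + 1, n_algorithms)
    if abs(avg_ranks[i] - avg_ranks[j]) > cd
]
print("Significantly different pairs (column indices):", significant_pairs)
```

The average ranks and the critical difference are exactly the quantities a critical difference diagram plots: algorithms are placed on an axis by average rank, and groups whose ranks differ by less than CD are joined by a bar. Note that with only 8 datasets the critical difference is large relative to the range of possible ranks, so the test may well find no significant differences.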
