What I'm doing now is ranking the models within each metric and summing each model's ranks. Whichever model has the lowest sum I consider the best.
Perhaps an example would make this clearer.
There are 5 models: Model A, Model B, Model C, Model D, and Model E. There are 3 evaluation metrics. A rank of 1 is the best.
I rank the models by each eval metric:
model | eval metric #1 rank | eval metric #2 rank | eval metric #3 rank |
---|---|---|---|
Model A | 4 | 3 | 4 |
Model B | 5 | 2 | 2 |
Model C | 1 | 1 | 5 |
Model D | 3 | 4 | 1 |
Model E | 2 | 5 | 3 |
The sum of each model's ranks across the evaluation metrics is:
model | sum of rank |
---|---|
Model A | 11 |
Model B | 9 |
Model C | 7 |
Model D | 8 |
Model E | 10 |
In this example Model C has the lowest sum and would be considered the best model.
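For concreteness, the procedure above can be sketched in a few lines of Python. This is only a minimal illustration using the ranks from the table; the names `ranks`, `rank_sums`, and `best_model` are made up for this example.

```python
# Ranks from the example table: one list per model, one entry per eval metric.
# A rank of 1 is the best.
ranks = {
    "Model A": [4, 3, 4],
    "Model B": [5, 2, 2],
    "Model C": [1, 1, 5],
    "Model D": [3, 4, 1],
    "Model E": [2, 5, 3],
}

# Sum each model's ranks across the metrics.
rank_sums = {model: sum(r) for model, r in ranks.items()}

# The model with the lowest rank sum is considered the best.
best_model = min(rank_sums, key=rank_sums.get)

for model, total in sorted(rank_sums.items(), key=lambda item: item[1]):
    print(f"{model}: {total}")
print("Best:", best_model)
```

Note that `min` returns the first model it encounters if two models tie on the rank sum, so ties would need separate handling.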
Does this process have a name? I'm having trouble searching Google for a better solution.
Best Answer
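You could try using a critical difference diagram to compare ML classifiers. It is built on the classifiers' average ranks, which is close to what you are already doing. The details are in Demšar (2006): https://www.jmlr.org/papers/volume7/demsar06a/demsar06a.pdf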
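As a rough sketch of how this connects to the question: the linked paper works with average ranks and, before drawing the diagram, checks with a Friedman test whether the ranks differ at all. The snippet below computes both for the example table. It rests on a big assumption: the 3 evaluation metrics are treated as the "blocks", whereas the paper ranks classifiers across multiple data sets. With only 3 blocks the test has very little power, so take this as an illustration of the arithmetic rather than a recommended analysis. All variable names are made up for the example.

```python
# Ranks from the question's table: one list per model, one entry per metric.
ranks = {
    "Model A": [4, 3, 4],
    "Model B": [5, 2, 2],
    "Model C": [1, 1, 5],
    "Model D": [3, 4, 1],
    "Model E": [2, 5, 3],
}

k = len(ranks)                          # number of models being compared
n = len(next(iter(ranks.values())))     # number of blocks (ASSUMPTION: metrics)

# Average rank per model; this is the rank sum divided by n, so the
# ordering is the same as with the rank sums in the question.
avg_ranks = {model: sum(r) / n for model, r in ranks.items()}

# Friedman statistic computed from the average ranks:
#   chi2_F = 12n / (k(k+1)) * (sum_j R_j^2 - k(k+1)^2 / 4)
chi2_f = 12 * n / (k * (k + 1)) * (
    sum(r ** 2 for r in avg_ranks.values()) - k * (k + 1) ** 2 / 4
)

print(avg_ranks)
print(f"Friedman statistic: {chi2_f:.3f} (k - 1 = {k - 1} degrees of freedom)")
```

To get a p-value, the statistic is compared against a chi-square distribution with k - 1 degrees of freedom; `scipy.stats.friedmanchisquare` does the whole test in one call if SciPy is available.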