Solved – Comparing multiclass classification algorithms for a particular application

classification, MATLAB, method-comparison, references, supervised learning

I am assessing a number of classification algorithms for a specific application with multiple classes. The algorithms I am considering are listed below (a minimal training sketch follows the list):

  1. Multinomial Logistic Regression (Matlab's 'mnrfit')
  2. Multiclass SVM (K. Crammer and Y. Singer, "On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines," JMLR, 2001). I will use the code provided by the authors, since Matlab's 'svmtrain' only does binary classification.
  3. Neural Networks (Matlab's 'nprtool')
  4. Decision Trees (C4.5 and CART from Matlab's 'classregtree')
  5. k-Nearest Neighbors (Matlab's 'ClassificationKNN')
  6. Naive Bayes Classifier (Matlab's 'NaiveBayes')
  7. Discriminant Analysis (Matlab's 'ClassificationDiscriminant')
  8. Random Forests (Matlab's 'TreeBagger')
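
For reference, here is a minimal sketch (assuming a feature matrix X and an integer label vector y with values 1..K are already loaded, and Xnew holds new samples; these variable names are illustrative) of how a few of these share the same train/predict pattern in Matlab:

    % Fit three of the classifiers above with their stock Matlab interfaces.
    B   = mnrfit(X, y);                                      % 1. multinomial logistic regression
    knn = ClassificationKNN.fit(X, y, 'NumNeighbors', 5);    % 5. k-nearest neighbors
    rf  = TreeBagger(100, X, y, 'Method', 'classification'); % 8. random forest with 100 trees

    % Predicted classes for the new samples Xnew:
    [~, yhatLR] = max(mnrval(B, Xnew), [], 2);   % argmax over per-class probabilities
    yhatKNN     = predict(knn, Xnew);
    yhatRF      = str2double(predict(rf, Xnew)); % TreeBagger returns labels as strings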

I have the following questions:

  1. Have I omitted any "obvious" multiclass classification algorithm that's a must-try? Or are there any binary classifiers that can easily be used for multiclass problems with the one-vs-all method?
  2. Which of these are linear classifiers and which are non-linear classifiers? I know 3, 4 and 5 are non-linear by nature and 2 can be non-linear with the kernel trick.
  3. Which of these are discrete classifiers and which are probabilistic? For example, logistic regression gives a probability for each class, while a decision tree gives exactly one class. On a related note, if I want to use ROC curves for comparison, how do I compare a discrete classifier with a probabilistic one, given that the former gives only a single point on the ROC plot? (See the sketch after this list.)
  4. Which of these are deterministic classifiers and which are stochastic? In other words, which classifiers will yield exactly the same results over multiple runs? This matters to me: for logistic regression a single run is enough, whereas for neural networks I will have to train the network multiple times to avoid biased results.
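
For question 3, a sketch of what the ROC comparison could look like, assuming training data X/y and a held-out set Xtest/ytest exist (variable names are illustrative): the probabilistic classifier traces a full curve via 'perfcurve', while the discrete one contributes a single (FPR, TPR) point for the chosen positive class.

    posClass = 1;                           % class treated as "positive" (one-vs-rest)

    % Probabilistic: multinomial logistic regression -> per-class probabilities
    B = mnrfit(X, y);
    P = mnrval(B, Xtest);                   % n-by-K matrix of class probabilities
    [fpr, tpr] = perfcurve(double(ytest == posClass), P(:, posClass), 1);
    plot(fpr, tpr); hold on

    % Discrete: a classification tree that only returns hard labels
    tree = classregtree(X, y, 'method', 'classification');
    yhat = str2double(eval(tree, Xtest));   % eval returns class names as strings
    tpr1 = sum(yhat == posClass & ytest == posClass) / sum(ytest == posClass);
    fpr1 = sum(yhat == posClass & ytest ~= posClass) / sum(ytest ~= posClass);
    plot(fpr1, tpr1, 'o')                   % the tree is a single point on the ROC plot
    legend('mnrfit (curve)', 'classregtree (point)')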

Finally, are there any good papers that show how to compare various classification algorithms? I found this one, which is quite good.

Best Answer

I am simply copy-pasting the answers I got from Alexandre Passos on Metaoptimize. It would really help if someone here could add more to them.

  1. Any binary classifier can be used for multiclass classification with the one-vs-all reduction or the all-vs-all reduction (see the sketch after this list). This list seems to cover most of the common multiclass algorithms.
  2. Logistic regression and SVMs are linear (though kernelized SVMs are linear only in the kernel-induced feature space). Neural networks, decision trees, and kNN aren't linear. Naive Bayes and discriminant analysis are linear. Random forests aren't linear.
  3. Logistic regression can give you calibrated probabilities. So can many SVM implementations (though this requires slightly different training). Neural networks can do it too if trained with the right loss (softmax). Decision trees and kNN can be probabilistic, though they are not particularly well calibrated. Naive Bayes does not produce well-calibrated probabilities, nor does discriminant analysis. I'm not sure about random forests; it probably depends on the implementation.
  4. All are deterministic except for neural networks and random forests (see the reproducibility sketch below).
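
To make point 1 concrete, a sketch of the one-vs-all reduction. It uses binary logistic regression ('glmfit') as the base classifier only because it returns real-valued scores, which makes the final argmax well defined; the same loop works with any binary learner that exposes scores. X, y (integer class labels), and Xtest are assumed to exist.

    classes = unique(y);
    K = numel(classes);
    scores = zeros(size(Xtest, 1), K);

    for k = 1:K
        yk = double(y == classes(k));              % class k vs. the rest
        bk = glmfit(X, yk, 'binomial', 'link', 'logit');
        scores(:, k) = glmval(bk, Xtest, 'logit'); % score of "is class k" for each sample
    end

    [~, idx] = max(scores, [], 2);                 % the most confident binary model wins
    yhat = classes(idx);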

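And for point 4, the stochastic methods can be made repeatable by fixing the random seed before each training run (X, y, and Xtest assumed to exist):

    rng(42);                                        % fix Matlab's global RNG
    rf1 = TreeBagger(100, X, y, 'Method', 'classification');

    rng(42);                                        % same seed -> same bootstrap/feature draws
    rf2 = TreeBagger(100, X, y, 'Method', 'classification');

    isequal(predict(rf1, Xtest), predict(rf2, Xtest))  % true: identical predictions
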
Why do you want to compare different classification algorithms? Are you trying to decide which one is the best in general, or just for one application?

If the former, it's not worth doing: most such claims are rather sketchy, and there is no method that can give that kind of conclusion. If the latter, it is well accepted that cross-validation, or comparing performance on a fixed test set, gives you unbiased results. For multiclass classification it is not always obvious which metric to use, but accuracy, per-class precision/recall/F1, per-class AUC, and the confusion matrix are commonly used. A minimal evaluation sketch follows.
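
For example, a minimal hold-out evaluation in Matlab, assuming X (features) and y (integer class labels) are loaded; the kNN model is just a placeholder, and any classifier from the question would do:

    cv = cvpartition(y, 'HoldOut', 0.3);            % stratified 70/30 split
    trainIdx = training(cv);
    testIdx  = test(cv);

    mdl  = ClassificationKNN.fit(X(trainIdx, :), y(trainIdx));
    yhat = predict(mdl, X(testIdx, :));

    C = confusionmat(y(testIdx), yhat);             % rows: true class, columns: predicted class
    accuracy  = sum(diag(C)) / sum(C(:));
    precision = diag(C) ./ sum(C, 1)';              % per predicted class
    recall    = diag(C) ./ sum(C, 2);               % per true class
    f1        = 2 * precision .* recall ./ (precision + recall);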