Solved – Transform multiclass classification to binary – benefits

binary-data, classification, multi-class

I have 400 instances which must be categorized into 4 classes. Using WEKA, I tried out a couple of multiclass classifiers like J48 and Random Forests, but never made it above a Kappa of 0.6 and ~65% correctly classified instances (10-fold cross-validation).

Then I thought about transforming the problem into a series of 1-vs-all classifications, each of which usually yields accuracies of ~90%. I would first separate one "single" class from the three merged ones, remove the instances classified as belonging to it, and keep the rest. Then, with only the instances of the remaining 3 classes, I would perform 1-vs-2 and again remove the instances classified as belonging to the single class, ending up with a binary classification problem. As I said, each step gives me about 90% correctly classified instances, but I fear that the 10% incorrectly classified instances add up and propagate through the splitting and dataset reduction process, so that in the end I might end up with the same garbage output I'd get from the original multiclass classification.

What's the stand on this approach? Does it have any benefits at all?
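To put a rough number on the error-propagation worry, here is a minimal back-of-the-envelope sketch in Python. It is not taken from the original post and rests on strong assumptions: the four classes are balanced, and every stage of the cascade decides correctly with the same probability (0.9), independently of the other stages. The function name `cascade_accuracy` is made up for illustration.

```python
def cascade_accuracy(stage_accuracy, n_classes):
    """Expected overall accuracy of a cascade that peels off one class per stage.

    Assumptions (not from the original post): balanced classes, and each
    stage is correct with the same probability, independently of the others.
    """
    n_stages = n_classes - 1
    per_class = []
    for k in range(1, n_classes + 1):
        # Class k is peeled off at stage k; the last two classes are both
        # resolved by the final binary stage, so they need n_stages decisions.
        decisions_needed = min(k, n_stages)
        per_class.append(stage_accuracy ** decisions_needed)
    overall = sum(per_class) / n_classes
    return overall, per_class


overall, per_class = cascade_accuracy(0.9, 4)
print(round(overall, 3))  # 0.792
```

Under these assumptions, an instance of one of the last two classes has to survive three decisions (0.9 × 0.9 × 0.9 ≈ 0.73), so the overall figure drops to roughly 79% even though every stage looks like 90%. Also worth keeping in mind: if the four classes are balanced, always predicting "rest" in the first 1-vs-all stage is already right 75% of the time, so 90% at that stage is less of an improvement than it first appears.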

Best Answer

Translating a multiclass problem into a set of binary ones (using 1-vs-all or 1-vs-1) is typically done when you want to use algorithms that don't actually have a multiclass formulation, such as SVM.
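As a small illustration of what the two decompositions look like, the following Python sketch builds the binary sub-problems from a label list. The class names and the helper functions `one_vs_rest_tasks` and `one_vs_one_tasks` are hypothetical; this only shows the relabeling, not how WEKA or any particular library implements it.

```python
from itertools import combinations


def one_vs_rest_tasks(labels):
    """One binary task per class: 1 for that class, 0 for everything else."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def one_vs_one_tasks(labels):
    """One binary task per pair of classes, using only instances of that pair."""
    classes = sorted(set(labels))
    tasks = {}
    for a, b in combinations(classes, 2):
        # Each entry is (instance index, binary label): 1 for class a, 0 for class b.
        tasks[(a, b)] = [
            (i, 1 if y == a else 0) for i, y in enumerate(labels) if y in (a, b)
        ]
    return tasks


# Hypothetical labels for six instances drawn from four classes.
labels = ["a", "b", "c", "d", "a", "c"]
ovr = one_vs_rest_tasks(labels)
ovo = one_vs_one_tasks(labels)
print(len(ovr), len(ovo))  # 4 6
```

With k classes, 1-vs-all produces k binary problems on the full dataset, while 1-vs-1 produces k(k-1)/2 problems, each on the subset of instances belonging to the two classes involved.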

If you do not plan to change the classification algorithm, you will probably end up with similar results after transforming your problem.

Note that changing the algorithm will not necessarily improve your performance.
