Solved – Understanding the One-Vs-The-Rest classifier

classification, svm

Introduction

I am working on a multiclass classification problem using the One-Vs-The-Rest classifier, and I want to check whether my understanding of the classifier is correct.

The One-Vs-The-Rest strategy consists of fitting one binary classifier per class. For a given class, the positive examples are the observations of that class, and the negative examples are the observations of all the other classes.
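For reference, the plain (non-subsampled) strategy can be sketched with scikit-learn's OneVsRestClassifier on synthetic data (the library and the toy dataset are my choices for illustration, not part of the original setup):

    from sklearn.datasets import make_classification
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.svm import SVC

    # Toy data: 50 observations, 3 classes (synthetic, just for illustration).
    X, y = make_classification(n_samples=50, n_classes=3, n_informative=4,
                               random_state=0)

    # One binary SVM is fitted per class: its own class is positive,
    # everything else is negative.
    ovr = OneVsRestClassifier(SVC(probability=True, random_state=0))
    ovr.fit(X, y)
    print(len(ovr.estimators_))  # -> 3, one binary classifier per class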

For the training step, however, I don't want to use all of the other classes' observations as negative examples, because I want to avoid the problem of imbalanced data.

In order to build the set of negative examples, I select a subset whose size is the size of the set of positive examples plus $k$ additional negative examples. For instance, with $k = 10$, if the set of positive examples has size $10$, then the set of negative examples has size $10 + k = 20$.
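In code, the selection rule I have in mind looks roughly like this (the helper name build_binary_sets and its signature are mine, just for illustration):

    import numpy as np

    def build_binary_sets(X, y, target_class, k, rng):
        """Positive examples: all observations of target_class.
        Negative examples: a random subset of size n_positives + k
        drawn from the other classes (capped at what is available)."""
        pos_idx = np.where(y == target_class)[0]
        neg_pool = np.where(y != target_class)[0]
        n_neg = min(len(pos_idx) + k, len(neg_pool))
        neg_idx = rng.choice(neg_pool, size=n_neg, replace=False)
        idx = np.concatenate([pos_idx, neg_idx])
        labels = (y[idx] == target_class).astype(int)  # 1 = positive, 0 = negative
        return X[idx], labels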


Example

I'll try to use an example here, just to see if my understanding is correct. Suppose I have a set of $50$ observations divided into $3$ classes of sizes $20$, $16$ and $14$. I use the same logic as before for selecting the negative examples; the resulting counts are checked in the short sketch after the list below.

Training ($k = 10$)

  1. Classifier 1: has $20$ positive examples and $30$ negative examples
  2. Classifier 2: has $16$ positive examples and $26$ negative examples
  3. Classifier 3: has $14$ positive examples and $24$ negative examples
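As a quick sanity check on these counts (the class labels $0, 1, 2$ are mine; the question only fixes the class sizes and $k = 10$):

    import numpy as np

    y = np.repeat([0, 1, 2], [20, 16, 14])     # 50 observations, 3 classes
    k = 10
    for c in [0, 1, 2]:
        n_pos = int((y == c).sum())
        available_neg = int((y != c).sum())
        n_neg = min(n_pos + k, available_neg)  # never more than what exists
        print(f"Classifier {c + 1}: {n_pos} positive, {n_neg} negative")
    # Classifier 1: 20 positive, 30 negative
    # Classifier 2: 16 positive, 26 negative
    # Classifier 3: 14 positive, 24 negative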

Testing

My greatest misunderstanding lies in the testing step.

Suppose I have a new instance that I want to classify into one of the previous classes. Each of the three classifiers produces a probability, which is an estimate of how likely the instance is to belong to the positive class rather than the negative class of that classifier.

  • If the instance is classified as a negative example by all $3$ classifiers, does that mean it doesn't belong to any of the classes? Or should we always compare the probability estimates of the positive class, even when they are lower than the probabilities of the negative class in each classifier?

Best Answer

In one-vs-rest we should always compare the probabilities of the positive class. For $m$ classes we obtain $m$ positive-class probability estimates for one test sample. Compare them and assign the sample to the class whose classifier gives the maximum of the $m$ probabilities, even if every individual estimate is below $0.5$.
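As a sketch of this decision rule (binary_clfs is assumed to be a list of fitted binary classifiers exposing predict_proba, one per class, in class order):

    import numpy as np

    def predict_one_vs_rest(binary_clfs, x):
        """Return the index of the class whose binary classifier gives the
        highest positive-class probability for the single sample x."""
        # Column 1 of predict_proba is P(positive) for a binary classifier
        # trained with labels {0 = negative, 1 = positive}.
        scores = [clf.predict_proba(x.reshape(1, -1))[0, 1] for clf in binary_clfs]
        return int(np.argmax(scores))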
