Solved – One-vs-many/One-vs-all – what value to use as probability

classification, libsvm, machine-learning, multi-class, svm

I have constructed SVMs to do a one-vs-many (one-vs-all) approach to classification. Let's say I have 3 classes and I train 3 SVMs in a one-vs-many format. This gives me 3 SVMs, each trained with one of the classes {a, b, c} as the positive class and the remaining data as negative.

When testing a test sample of class a, I may get results looking like:

class a as positive SVM:
a: 0.6 neg: 0.4
class b as positive SVM:
b: 0.1 neg: 0.9
class c as positive SVM:
c: 0.2 neg: 0.8

Clearly the sample does belong to class a. However, I wish to use a probability, and I'm wondering what to use for this. If I use the highest probability, it is not necessarily very high: the class-a SVM could give a probability of 0.2 and the other two 0.0001, and reporting 0.2 as the probability doesn't reflect how decisively class a wins relative to the others. Is there a way to get a probability using the one-vs-all technique that factors this information in? If not, then an acceptance threshold of, e.g., 0.6 for a class may not be met by any score, and nothing may be over the threshold enough to say this sample belongs to any class.
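For illustration, normalizing the three one-vs-rest scores so they sum to 1 is the kind of "relative" probability I have in mind (just a sketch of the idea, not something I've validated):

```python
# Renormalize one-vs-rest scores into a distribution over classes,
# so a small winning score can still dominate if the others are tiny.
# Illustrative only -- not an established calibration method.
def normalize(scores):
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

# With the scores from the example above (a: 0.6, b: 0.1, c: 0.2):
print(normalize({'a': 0.6, 'b': 0.1, 'c': 0.2}))
# With the low-score case: a's 0.2 dominates once the scores are relative.
print(normalize({'a': 0.2, 'b': 0.0001, 'c': 0.0001}))
```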

Best Answer

Here's what I would recommend: Use probability rankings and class proportions in the training sample to determine the class assignments.

You have three (estimated) probabilities: $p_a, p_b,$ and $p_c$. And you have the original class proportions from the training sample: $m_a, m_b,$ and $m_c$, where $m_a$ is the percentage of training records that belong to class $a$ (e.g., 0.6), and so on.

You can start with the smallest class, say $b$, and use $p_b$ to rank order all records from the highest to lowest values. From this rank-ordered list, start assigning records to class $b$ until $m_b$ percent of the records have been assigned to this class. Record the value of $p_b$ at this stage; this value becomes the cut-off point for class $b$.

Now take the next smallest class, say $c$, and use $p_c$ to rank order all records and follow the same steps described in the paragraph above. At the end of this step, you will get a cut-off value for $p_c$, and $m_c$ percent of all records would be assigned to class $c$.

Finally, assign all remaining records to (the largest) class $a$.
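The three steps above could be sketched roughly as follows. The function and variable names are my own, and I assume the estimated per-class probabilities are available as NumPy arrays; this is an illustration of the procedure, not tested production code:

```python
import numpy as np

def fit_cutoffs(P, class_order, proportions):
    """Assign records to classes by probability ranking, smallest class first.

    P           -- dict mapping class label -> array of estimated probabilities,
                   one value per record (e.g. P['b'][i] is p_b for record i)
    class_order -- labels sorted from smallest to largest training proportion,
                   e.g. ['b', 'c', 'a']
    proportions -- dict mapping class label -> training proportion m_k
    """
    n = len(next(iter(P.values())))
    labels = np.empty(n, dtype=object)       # final class assignments
    unassigned = np.ones(n, dtype=bool)
    cutoffs = {}

    # Every class except the largest: rank remaining records by its probability
    # and take the top m_k percent; the last probability taken is the cut-off.
    for label in class_order[:-1]:
        free = np.where(unassigned)[0]
        order = free[np.argsort(-P[label][free])]   # highest p first
        k = int(round(proportions[label] * n))
        chosen = order[:k]
        labels[chosen] = label
        unassigned[chosen] = False
        cutoffs[label] = float(P[label][chosen[-1]]) if k > 0 else 1.0

    # Finally, all remaining records go to the largest class.
    labels[unassigned] = class_order[-1]
    return labels, cutoffs
```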

For future scoring purposes, you can follow these steps but discard the class proportions, letting the probability cut-off values for classes $b$ and $c$ drive the class assignments.
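A scoring helper that uses only the stored cut-offs might look like this (again a hypothetical sketch, checking classes from smallest to largest and falling back to the largest class):

```python
def score(p_new, class_order, cutoffs):
    """Assign one new record using the stored cut-offs.

    p_new       -- dict of estimated probabilities for the record,
                   e.g. {'a': 0.6, 'b': 0.1, 'c': 0.2}
    class_order -- labels from smallest to largest class, e.g. ['b', 'c', 'a']
    cutoffs     -- cut-off probabilities for all but the largest class
    """
    for label in class_order[:-1]:
        if p_new[label] >= cutoffs[label]:
            return label
    return class_order[-1]   # no cut-off met: default to the largest class

# The test sample from the question clears neither cut-off, so it gets class a:
print(score({'a': 0.6, 'b': 0.1, 'c': 0.2}, ['b', 'c', 'a'], {'b': 0.8, 'c': 0.7}))
```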

In order to make sure that this approach yields a reasonable level of accuracy, you can review the classification matrix (and any other measures you are using) on the validation set.