Solved – Comparing SVM Models

classification, ensemble learning, svm

Pardon me; my understanding of SVMs is very limited.

We often hear about ensemble classifiers and related techniques.

Say I have 3 different SVM models trained on the same dataset to predict a label.

A test data point might yield these results:
Model A: Class 1
Model B: Class 2
Model C: Class 3

However, how would I know which class label is the correct one for a given test data point? It is often said that each model assigns a score to its result, and the class label with the highest score is the one chosen.

However, as far as I understand, an SVM outputs either 1 or -1; it is binary, so to speak. Pardon my weak explanation of the question, as I am not quite sure how I should be asking it.

My focus is this: how does an SVM model even assign a score to its prediction?

Best Answer

The output of an SVM classifier is actually determined as $\hat y = \mathrm{sign}(\langle w, x\rangle + b)$, where $x$ is the test point and the model is defined by $w$ (the same kind of object as $x$) and $b\in \mathbb R$. The value $\langle w, x\rangle + b$ can be treated as a confidence score for the prediction; high absolute values correspond to confident predictions. This is sometimes called the "decision function", and can be useful to compare different examples with the same model. With multiple models, however, $w$ might vary substantially in norm (particularly if your models have different regularization parameters $C$), and so the magnitude of the scores can have very different meanings for different models; you can't necessarily compare them across models.
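As a concrete illustration, here's a minimal sketch using scikit-learn (the toy dataset and hyperparameters are made up for illustration): `decision_function` returns the signed value $\langle w, x\rangle + b$, and the prediction is just its sign.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Toy binary dataset, purely for illustration
X, y = make_classification(n_samples=200, random_state=0)
clf = SVC(kernel="linear", C=1.0).fit(X, y)

scores = clf.decision_function(X[:5])  # signed values <w, x> + b
preds = clf.predict(X[:5])             # the sign of each score, as a class label

# The prediction is exactly the sign of the decision value;
# larger |score| means a more confident prediction.
assert np.array_equal(preds, (scores > 0).astype(int))
print(scores)
```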

Platt scaling, as used by libsvm and other tools, converts this score (which can be any real number) into a probability (in $[0, 1]$) of belonging to the positive class. It does this by passing the score through a logistic function, $\sigma(\langle w, x\rangle + b)$ with $\sigma(z) = 1 / (1 + \exp(A z + B))$, where $A$ and $B$ are usually determined by cross-validation on the training set. Being probabilities, these outputs are much more combinable, comparable, and amenable to being thrown into Bayesian models or whatever else you like. They won't be completely reliable, but they're often at least reasonable. Then you can use your favorite ensemble approach to make a final prediction.
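For instance, here's one hedged sketch of the whole pipeline in scikit-learn: `probability=True` makes `SVC` fit Platt's sigmoid via internal cross-validation, and soft voting (averaging the calibrated probabilities) is one simple choice of ensemble. The three model configurations below are arbitrary examples.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Three differently-configured SVMs: their raw decision scores are
# not comparable across models, but their Platt-scaled probabilities are.
models = [
    SVC(kernel="linear", C=0.1, probability=True).fit(X_tr, y_tr),
    SVC(kernel="rbf", C=1.0, probability=True).fit(X_tr, y_tr),
    SVC(kernel="rbf", C=10.0, gamma=0.1, probability=True).fit(X_tr, y_tr),
]

# Soft voting: average P(class | x) over the models, then take the argmax.
avg_proba = np.mean([m.predict_proba(X_te) for m in models], axis=0)
ensemble_pred = avg_proba.argmax(axis=1)
print("ensemble accuracy:", (ensemble_pred == y_te).mean())
```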