Solved – probabilistic output for binary SVM classification

libsvm svm

I'm using libSVM for a binary classification problem. After a test instance is assigned a label (1 or -1), I also want to know how likely it is that the assigned label is correct.

I'm thinking about calculating the distance from the instance to the hyperplane in the feature space. The larger the distance is, the more likely the label assignment is correct.
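
To make the idea concrete, here is a minimal sketch of that distance for a linear decision function f(x) = w·x + b. The linear form is an assumption made for illustration; with a kernel, the weight vector lives in the feature space and its norm would have to be computed from the support vectors. The function name and the weights below are hypothetical and are not part of libSVM.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    decision_value = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return decision_value / norm_w

# Hypothetical weights and test points, for illustration only.
w, b = [3.0, 4.0], -5.0
near = signed_distance(w, b, [1.0, 1.0])    # (3 + 4 - 5) / 5 = 0.4
far = signed_distance(w, b, [-3.0, -4.0])   # (-9 - 16 - 5) / 5 = -6.0
```

The sign gives the predicted label and the magnitude gives the distance. Note that the magnitude is unbounded, so it ranks instances by confidence but is not itself a probability.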

Is my idea correct? Or is there already such an option in libSVM for my purpose?

Best Answer

The usual approach is to use Platt's method, which fits a univariate logistic regression model to the decision values output by the SVM. However, if you want a probabilistic output, it is probably better to go for kernel logistic regression, which estimates the probabilities directly, rather than training a discriminative classifier and post-processing its output.
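
As a rough illustration of the post-processing step, the sketch below fits a sigmoid to SVM decision values in plain Python. It is a simplified stand-in rather than Platt's exact procedure: it uses plain gradient descent instead of the more careful optimizers used in practice, and it parametrizes the sigmoid as sigmoid(a·f + b), which is Platt's 1/(1 + exp(A·f + B)) with a = -A and b = -B. The decision values and labels are made up for illustration; in practice they should come from held-out data or cross-validation, not from the points the SVM was trained on.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_platt(decision_values, labels, lr=0.1, n_iter=5000):
    """Fit P(y=+1 | f) = sigmoid(a*f + b) by gradient descent on the log loss."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    # Smoothed targets suggested by Platt, instead of hard 0/1 targets.
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    targets = [t_pos if y == 1 else t_neg for y in labels]
    a, b = 0.0, 0.0
    n = len(labels)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for f, t in zip(decision_values, targets):
            err = sigmoid(a * f + b) - t
            grad_a += err * f
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_probability(f, a, b):
    """Estimated probability that an instance with decision value f has label +1."""
    return sigmoid(a * f + b)

# Made-up held-out decision values and their true labels (+1 / -1).
decision_values = [-2.1, -1.3, -0.8, -0.2, 0.3, 0.4, 0.9, 1.5, 2.2]
labels = [-1, -1, -1, 1, -1, 1, 1, 1, 1]

a, b = fit_platt(decision_values, labels)
p_far = platt_probability(2.0, a, b)    # far from the hyperplane on the +1 side
p_near = platt_probability(0.1, a, b)   # close to the hyperplane
```

Instances far from the hyperplane get probabilities close to 0 or 1, while those near it get values closer to 0.5. libSVM itself exposes probability estimates through the `-b 1` option of `svm-train` and `svm-predict`, which, as far as I know, are based on a variant of this sigmoid fit.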

Gaussian process classification is another method that may be better suited; see the excellent book by Rasmussen and Williams, Gaussian Processes for Machine Learning, and the equally excellent MATLAB toolbox that goes with it.