Support Vector Machine and Hessian Normal Form

I'm using a Support Vector Machine from Scikit Learn.

Example data:

[[1, 1], 
 [2, 2],  
 [3, 3], ...]

My code:

from sklearn import svm

clf = svm.SVC(kernel='linear')
clf.fit(x, y)

This gives me:

n = clf.coef_[0]       # normal vector of the hyperplane
d = clf.intercept_[0]  # offset ("intercept")

which I took to be $\vec{n}_0$ and $d$ of the Hessian normal form; see:
$$
\vec{x} \cdot \vec{n}_0 - d = 0
$$

But when I plot the separating hyperplane of my SVM, something is wrong: the line has the right slope but the wrong intercept.
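
For reference, a minimal sketch of how I plot the line (2D case, reusing x, y, n, and d from above):

import numpy as np
import matplotlib.pyplot as plt

pts = np.asarray(x)  # training points from above, as an array
xs = np.linspace(pts[:, 0].min(), pts[:, 0].max(), 100)

# Hessian normal form n0*x0 + n1*x1 - d = 0, solved for x1:
ys = (d - n[0] * xs) / n[1]

plt.scatter(pts[:, 0], pts[:, 1], c=y)
plt.plot(xs, ys)
plt.show()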

It seems like the formula has to be:

$$
\vec{x} \cdot \vec{n}_0 + d = 0
$$

What am I missing here?

Best Answer

I believe this is just a sign convention in what Scikit Learn means by the attribute "intercept_". Scikit Learn's implementation is based on libsvm, and I found this handy guide on the libsvm webpage:

https://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf

In equation 1 of that guide they write down the optimization problem for SVMs, which I'll specialize to the linear case since that's what you're working with; I'll also use notation close to yours, writing $\vec{n}$ for the normal (dropping the subscript, since this $\vec{n}$ is not a unit vector) and $d$ for the offset. Supposing you have $l$ instance-label pairs $(\vec{x}_i,y_i)_{i=1}^l$, where $\vec{x}_i \in \mathbb{R}^n$ and $y_i \in \{-1,1\}$, you want to solve

$$\begin{align} \min_{\vec{n}, d, \vec{\xi}} &\frac{1}{2} \vec{n}\cdot \vec{n} + C \sum_{i=1}^l \xi_i,\\ \text{subject to } & y_i(\vec{n}\cdot \vec{x}_i + d) \geq 1-\xi_i,\\ & \xi_i \geq 0 \end{align}$$

The first constraint is the relevant one for your classification. Allowing yourself slack $\xi_i$ for each example, you want the sign of $y_i$ to be the same as the sign of $\vec{n}\cdot \vec{x}_i + d$. So that means you want all the examples of class $1$ to satisfy $\vec{n}\cdot \vec{x}_i + d > 0$, and all examples of class $-1$ to satisfy $\vec{n}\cdot \vec{x}_i + d < 0$.

So, the decision plane is $\vec{n}\cdot \vec{x} + d = 0$, with a plus sign in front of $d$, just as you guessed.
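
You can check this convention directly: for a linear kernel, decision_function computes exactly $\vec{n}\cdot \vec{x} + d$ from coef_ and intercept_. A minimal sketch, with made-up linearly separable toy data:

import numpy as np
from sklearn import svm

# Illustrative toy data: two linearly separable classes
x = np.array([[1.0, 1.0], [2.0, 2.0], [1.0, 3.0],
              [4.0, 1.0], [5.0, 2.0], [6.0, 1.0]])
y = np.array([1, 1, 1, -1, -1, -1])

clf = svm.SVC(kernel='linear')
clf.fit(x, y)

n = clf.coef_[0]        # normal vector of the separating hyperplane
d = clf.intercept_[0]   # the "intercept"

# decision_function evaluates n·x + d, i.e. the plus-sign convention:
assert np.allclose(clf.decision_function(x), x @ n + d)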
