Support Vector Machine – How to Calculate ‘w’ by Hand

classification, machine learning, scikit-learn, svm

I am working through a trivial SVM example to build some intuition for how it works. With only 6 data points, is it possible to calculate the values of $w$ and $b$ by hand? Intuitively, I can get the correct values only up to a scaling factor.

Using Sklearn with a linear kernel the correct values are

  • $w = (\frac{1}{4}, -\frac{1}{4})^T$
  • $b = -\frac{3}{4}$

Intuitively I tried different values:

  • $w = (1, -1)^T$ and $b = -3$, which comes from the straightforward equation of the line $x_2 = x_1 - 3$. This gives the correct decision boundary and geometric margin $2\sqrt{2}$
  • $w = (\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})^T$ and $b = -\frac{3}{\sqrt{2}}$, which ensures that $||w|| = 1$ but doesn't get me much further

Example

[Figure: SVM simple]

Dataset and Sklearn result

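import numpy as np
from sklearn.svm import SVC

# Six training points, three per class
X = np.array([[3, 4], [1, 4], [2, 3], [6, -1], [7, -1], [5, -3]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates a hard-margin SVM
clf = SVC(C=1e5, kernel='linear')
clf.fit(X, y)

# Learned weight vector w and intercept b of the linear decision function
print(clf.coef_, clf.intercept_)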

Best Answer

Solving the SVM problem by inspection

By inspection we can see that the decision boundary is the line $x_2 = x_1 - 3$. Rewriting it in the form $w^T x + b = 0$, that is $x_1 - x_2 - 3 = 0$, gives a first guess of the parameters:

$$ w = [1,-1] \ \ b = -3$$

With these values, the margin width formula $\frac{2}{||w||}$ gives $\frac{2}{\sqrt{2}} = \sqrt{2}$. By inspection, however, the actual width between the two margin lines through the support vectors is $4 \sqrt{2}$, so these values are not scaled correctly.
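The mismatch can be checked numerically. The following is a minimal sketch using the data from the question; the variable names (`w_guess`, `implied_width`, `actual_width`) are illustrative choices, not part of the original answer.

```python
import numpy as np

# Data and labels copied from the question.
X = np.array([[3, 4], [1, 4], [2, 3], [6, -1], [7, -1], [5, -3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

# First guess read off the line x2 = x1 - 3.
w_guess = np.array([1.0, -1.0])
b_guess = -3.0

# Width implied by the canonical SVM scaling: 2 / ||w||.
implied_width = 2.0 / np.linalg.norm(w_guess)

# Actual geometric gap: signed distance of every point to the line,
# then closest positive point minus closest negative point.
signed_dist = (X @ w_guess + b_guess) / np.linalg.norm(w_guess)
actual_width = signed_dist[y == 1].min() - signed_dist[y == -1].max()

print(implied_width)  # about 1.414, i.e. sqrt(2)
print(actual_width)   # about 5.657, i.e. 4 * sqrt(2)
```

The two numbers differ by a factor of 4, which is the rescaling that the rest of the derivation recovers.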

Recall that scaling $w$ and $b$ by a common positive factor $c$ does not change the boundary line, hence we can generalize the equation as

$$ cx_1 - cx_2 - 3c = 0$$ $$ w = [c,-c] \ \ b = -3c$$

Plugging back into the equation for the width we get

\begin{aligned} \frac{2}{||w||} & = 4 \sqrt{2} \\ \frac{2}{\sqrt{2}c} & = 4 \sqrt{2} \\ c & = \frac{2}{\sqrt{2} \cdot 4 \sqrt{2}} = \frac{1}{4} \end{aligned}

Hence the parameters are in fact $$ w = [\frac{1}{4},-\frac{1}{4}] \ \ b = -\frac{3}{4}$$
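As a sanity check, here is a small sketch (assuming the data from the question; the names `functional_margins` and `on_margin` are illustrative) that confirms these parameters satisfy the hard-margin constraints $y^{(i)}(w^T x^{(i)} + b) \geq 1$, with equality on the points lying on the margin.

```python
import numpy as np

# Data and labels copied from the question.
X = np.array([[3, 4], [1, 4], [2, 3], [6, -1], [7, -1], [5, -3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

# Rescaled parameters from the derivation above.
w = np.array([0.25, -0.25])
b = -0.75

# Functional margin y_i * (w . x_i + b) for every training point.
functional_margins = y * (X @ w + b)
print(functional_margins)  # every entry should be at least 1

# Points where the constraint is tight lie exactly on the margin.
on_margin = np.isclose(functional_margins, 1.0)
print(X[on_margin])

# Margin width implied by the rescaled w, about 5.657 (4 * sqrt(2)).
print(2.0 / np.linalg.norm(w))
```

Three points come out on the margin: $(3,4)$, $(2,3)$ and $(6,-1)$.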

To find the values of $\alpha_i$ we can use the following two constraints which come from the dual problem:

$$ w = \sum_i^m \alpha_i y^{(i)} x^{(i)} $$ $$\sum_i^m \alpha_i y^{(i)} = 0 $$

Using the fact that $\alpha_i$ can be nonzero only for points on the margin (3 points in this case: $\alpha_1$ for $(6,-1)$, $\alpha_2$ for $(2,3)$ and $\alpha_3$ for $(3,4)$), we obtain the system of simultaneous linear equations: \begin{aligned} \begin{bmatrix} 6 \alpha_1 - 2 \alpha_2 - 3 \alpha_3 \\ -1 \alpha_1 - 3 \alpha_2 - 4 \alpha_3 \\ 1 \alpha_1 - 1 \alpha_2 - 1 \alpha_3 \end{bmatrix} & = \begin{bmatrix} 1/4 \\ -1/4 \\ 0 \end{bmatrix} \\ \alpha & = \begin{bmatrix} 1/16 \\ 1/16 \\ 0 \end{bmatrix} \end{aligned} The first two rows are the components of $w = \sum_i \alpha_i y^{(i)} x^{(i)}$ and the third row is $\sum_i \alpha_i y^{(i)} = 0$.
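The system can also be solved numerically. This is a minimal sketch that builds the matrix directly from the two dual constraints. The ordering of the margin points ($\alpha_1$ for $(6,-1)$, $\alpha_2$ for $(2,3)$, $\alpha_3$ for $(3,4)$) is inferred from the coefficients in the equations, and the names `sv`, `sv_y` and `A` are illustrative.

```python
import numpy as np

# Points on the margin and their labels, in the order alpha_1, alpha_2, alpha_3
# (ordering inferred from the coefficients of the linear system).
sv = np.array([[6, -1], [2, 3], [3, 4]], dtype=float)
sv_y = np.array([1.0, -1.0, -1.0])

# Weight vector found by hand.
w = np.array([0.25, -0.25])

# Rows 1-2: components of w = sum_i alpha_i * y_i * x_i.
# Row 3:    sum_i alpha_i * y_i = 0.
A = np.vstack([(sv_y[:, None] * sv).T, sv_y])
rhs = np.append(w, 0.0)

alpha = np.linalg.solve(A, rhs)
print(alpha)  # approximately [0.0625, 0.0625, 0.0]
```

Note that $\alpha_3$ comes out as zero even though $(3,4)$ lies on the margin: $w$ can be expressed using $(6,-1)$ and $(2,3)$ alone.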
