Solved – How does a linear SVM classifier work?

linear, prediction, svm

I have been reading about SVMs, in particular linear SVMs, across many questions here. However, one problem I faced is that there seems to be no in-depth explanation of how a linear SVM works in terms of equations.

As I understand it, a linear SVM is actually just one very long equation.

For this case, simply consider a two-class problem with classes A and B.

Suppose my linear SVM were the equation

y - 2x + 7 = 0

In that case, do I assign the point (2,3) to class A or class B? What would be the determining factor? Or am I totally missing the point? What I would like to see is how the scores are computed and which class the data point would fall into. A simple explanation would be to assign -1 to class A and 1 to class B, or vice versa. However, the values usually do not come out to exactly 1 or -1.

I am not a mathematically inclined person, so I do not understand the very confusing and difficult equations presented in the papers.

Best Answer

You seem to have forgotten the simplest aspect of an SVM classifier: you basically look at which side of the separating hyperplane the point is on. The equation you have shown is a little confusing in the sense that $y$ often denotes the label, which of course is not available at prediction time.

This is equivalent to looking at $\operatorname{sign}(\mathbf{w}^T\mathbf{x} + b)$, where $\mathbf{w}$ and the bias $b$ are part of the SVM model. $\mathbf{w}$ is the normal vector to the separating hyperplane.

The dot product of a test vector $\mathbf{x}$ and the normal vector $\mathbf{w}$ is (a scaled version of) the scalar projection of $\mathbf{x}$ onto $\mathbf{w}$. Once the bias $b$ is added, the result $\mathbf{w}^T\mathbf{x} + b$ is (a scaled version of) the signed distance to the separating hyperplane; dividing by $\|\mathbf{w}\|$ gives the signed distance itself.
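To make the distinction between the raw score and the signed distance concrete, here is a minimal sketch in plain Python. It assumes, purely for illustration, the model $\mathbf{w}=[1, -2]$, $b=7$ that corresponds to the equation in the question; the function names `score` and `signed_distance` are hypothetical helpers, not part of any SVM library.

```python
import math

def score(w, b, x):
    """Raw decision value w . x + b (a scaled signed distance)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def signed_distance(w, b, x):
    """Signed Euclidean distance from x to the hyperplane w . x + b = 0."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return score(w, b, x) / norm_w

# Assumed example model taken from the question's equation.
w = [1.0, -2.0]
b = 7.0

on_line = [1.0, 4.0]   # 1 - 2*4 + 7 = 0, so this point lies on the hyperplane
off_line = [3.0, 2.0]  # 3 - 2*2 + 7 = 6, so this point lies on the positive side

print(score(w, b, on_line), signed_distance(w, b, on_line))
print(score(w, b, off_line), signed_distance(w, b, off_line))
```

The raw score is generally not exactly 1 or -1, which is fine: for prediction only its sign is used, and its magnitude only says how far from the hyperplane the point lies.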

Suppose you have a model like the one you described: $\mathbf{w}=[1, -2]$ and $b=7$. To draw the separating hyperplane (which is a line in 2-D), you solve the equation you mentioned (I replaced $y$ by $x_1$ and $x$ by $x_2$):

$$w_1 x_1 + w_2 x_2 + b = x_1 - 2 x_2 + 7 = 0$$

The label for a test instance $[x_1, x_2]$ is then:

$$\begin{cases} +1 & \text{if } x_1 - 2 x_2 + 7 > 0 \\ -1 & \text{otherwise} \end{cases}$$
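Applied to the point (2,3) from the question, this rule can be sketched as follows. The sketch assumes the same variable mapping as above (so $x_1 = y = 3$ and $x_2 = x = 2$) and an arbitrary convention that +1 stands for class B and -1 for class A; the helper names `decision_value` and `predict` are hypothetical.

```python
def decision_value(w, b, x):
    """Compute w . x + b for a linear model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Return +1 if the point is on the positive side of the hyperplane, else -1."""
    return 1 if decision_value(w, b, x) > 0 else -1

w = [1.0, -2.0]  # normal vector from the equation y - 2x + 7 = 0
b = 7.0          # bias

# The question's point (x, y) = (2, 3) becomes [x_1, x_2] = [y, x] = [3, 2].
point = [3.0, 2.0]

value = decision_value(w, b, point)  # 3 - 2*2 + 7 = 6
label = predict(w, b, point)

# Assumed convention: +1 means class B, -1 means class A.
class_name = "B" if label == 1 else "A"
print(value, label, class_name)
```

Under these assumptions the score is 6, which is positive, so the point gets label +1. Which of A or B that corresponds to depends only on how the classes were encoded during training.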
