Solved – What’s the hinge loss in SVM

machine learningsvm

If the data is not separable, we can minimize

$J = \frac{1}{2}\|w\|^2 + C\sum l_{0/1}(y_i (w^T x_i + b) – 1)$

here,

$l_{0/1}(z)=
\begin{cases}
1,& \text{if } z\lt 0\\
0,& \text{otherwise}
\end{cases}
$

In this plot, the green curve the $l_{0/1}$ loss and the blue one is the hinge loss

$l_{hinge}(z) = max(0, 1-z).$

We substitute $l_{0/1}$ loss with $l_{hinge}$ loss

$z = y_i (w^T x_i + b) – 1$

$1-z = 2 – y_i (w^T x_i + b).$

Therefore,

$J = \frac{1}{2}\|w\|^2 + C\sum max(0, 2 – y_i (w^T x_i + b))$

but the book says:

$J = \frac{1}{2}\|w\|^2 + C\sum max(0, 1 – y_i (w^T x_i + b))$

Why is "2" changed to "1"?

The number of miss classified points is

$l_{0/1}(y_i(w^T x_i + b))$

not

$l_{0/1}(y_i(w^T x_i + b)-1)$ Note that the separate plane is in the middle($w^T x - b = 0$), not the "support vector plane"