I have a geometric explanation. Think of the SVM as a maximum margin classifier. In that sense we seek a separating hyperplane that is equidistant from the closest negative and the closest positive examples. This means that the distance from the hyperplane to its closest negative example should be as large as the distance to its closest positive example. Let $w^*$ be known. Then
$$\max_{i: y^{(i)}=-1} w^{*T}x^{(i)}$$
is the worst case over the negative examples: the projection of the negative example closest to the hyperplane. Similarly
$$\min_{i: y^{(i)}=1} w^{*T}x^{(i)}$$
is the worst case over the positive examples. How can we choose the intercept so that the margin at the worst-case examples on both sides is maximal? We take the average of the two:
$$b^* = -\frac{1}{2}\left(\max_{i: y^{(i)}=-1} w^{*T}x^{(i)} + \min_{i: y^{(i)}=1} w^{*T}x^{(i)}\right)$$
About the '-' sign: strictly speaking, $\max_{i: y^{(i)}=-1} w^{*T}x^{(i)}$ is not a distance, because it is negative, while $\min_{i: y^{(i)}=1} w^{*T}x^{(i)}>0$. So in order to shift the hyperplane from the worst negative example toward the worst positive one, we need the '-' sign.
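If it helps, here is a minimal numeric sketch of that rule. The toy data and the fixed direction `w_star` are hypothetical; the point is only that the negative average of the two worst-case projections places the hyperplane midway between them.

```python
import numpy as np

# Hypothetical toy data: rows are examples, labels y in {-1, +1}.
X = np.array([[1.0, 2.0], [2.0, 3.0],   # negative examples
              [4.0, 5.0], [5.0, 5.0]])  # positive examples
y = np.array([-1, -1, 1, 1])
w_star = np.array([1.0, 1.0])           # assume the direction is already known

proj = X @ w_star                       # w*^T x^(i) for every example
worst_neg = proj[y == -1].max()         # negative example closest to the hyperplane
worst_pos = proj[y == 1].min()          # positive example closest to the hyperplane

b_star = -(worst_neg + worst_pos) / 2   # the averaging rule with the '-' sign
print(proj + b_star)                    # the two worst cases land at -2 and +2:
                                        # equidistant from the hyperplane
```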
First, let's calculate the squared norm $||w||^2$, using the substitution $w = \sum_i\alpha_iy_ix_i$:
$$||w||^2 = \sum_i \alpha_iy_i\big(\sum_j\alpha_jy_j\langle x_i,x_j\rangle\big)$$
which evidently can be rearranged to $\sum_i\sum_j\alpha_i\alpha_jy_iy_j\langle x_i,x_j\rangle$.
The $\langle x_i, x_j\rangle$ construct is present because the norm is defined in terms of the inner product - every inner product induces a norm via $||z||^2 = \langle z,z \rangle$ - so when we calculate $||w||^2$ (making the substitution from above) we end up with $\langle x_i,x_j \rangle$ terms. The reason we don't write something like
$$\sum_i \sum_j \langle \alpha_i y_i x_i, \alpha_j y_j x_j \rangle$$
is that the inner product is defined on the $x$'s, and everything else is just a scalar multiplier, which, by bilinearity of the inner product, can be moved outside of $\langle \cdot,\cdot \rangle$.
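As a sanity check, the expansion can be verified numerically. The $\alpha$ values below are arbitrary placeholders, not the multipliers of a trained SVM; the point is only that $\langle w, w\rangle$ with $w = \sum_i\alpha_iy_ix_i$ matches the double sum over the Gram matrix of inner products.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))              # 5 points in R^3
y = rng.choice([-1, 1], size=5)
alpha = rng.uniform(size=5)              # arbitrary, not optimal multipliers

w = (alpha * y) @ X                      # w = sum_i alpha_i y_i x_i
lhs = w @ w                              # ||w||^2 = <w, w>

K = X @ X.T                              # Gram matrix K[i, j] = <x_i, x_j>
coef = alpha * y
rhs = coef @ K @ coef                    # sum_i sum_j alpha_i alpha_j y_i y_j <x_i, x_j>

print(np.isclose(lhs, rhs))              # True
```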
Now, substituting $w$ into $\sum_i\alpha_i[y_i(\langle w, x_i\rangle+b)-1]$ can be done in parts:
$$\sum_i\alpha_i[y_i(\langle w, x_i\rangle+b)-1] = \sum_i\alpha_iy_i\langle w, x_i\rangle + b\sum_i\alpha_iy_i - \sum_i\alpha_i$$
The last term on the r.h.s. is already in its final form, $-\sum_i\alpha_i$, and the middle term equals $0$, because the second constraint is $\sum_i\alpha_iy_i = 0$. Substituting $w = \sum_j\alpha_jy_jx_j$ into the first term gives:
$$\sum_i\alpha_iy_i\langle w, x_i\rangle =\sum_i\alpha_iy_i\langle \sum_j\alpha_jy_jx_j, x_i\rangle = \sum_i\sum_j\alpha_iy_i\alpha_jy_j\langle x_i, x_j \rangle$$
where the last step again uses bilinearity of the inner product. Note that this is exactly the expression we obtained for $||w||^2$ above.
Having gotten this far, we need to remember to a) multiply $||w||^2$ by $1/2$, b) multiply the long second term by $-1$, and c) combine them:
$${1\over 2}\sum_i \sum_j \alpha_iy_i\alpha_jy_j\langle x_i,x_j\rangle - \sum_i \sum_j \alpha_iy_i\alpha_jy_j\langle x_i,x_j\rangle - 0 + \sum_i\alpha_i $$
which evidently reduces to the desired result
$$-{1\over 2}\sum_i \sum_j \alpha_iy_i\alpha_jy_j\langle x_i,x_j\rangle + \sum_i\alpha_i $$
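The whole reduction can also be checked numerically, under the two conditions used above: $w = \sum_i\alpha_iy_ix_i$ and $\sum_i\alpha_iy_i = 0$. The $\alpha$ values and $b$ below are hypothetical; the positive-class $\alpha$'s are rescaled only so that the second condition holds.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))
y = np.array([-1, -1, -1, 1, 1, 1])
alpha = rng.uniform(0.1, 1.0, size=6)

# Rescale the positive-class alphas so that sum_i alpha_i y_i = 0.
alpha[y == 1] *= alpha[y == -1].sum() / alpha[y == 1].sum()
assert np.isclose(alpha @ y, 0.0)

w = (alpha * y) @ X                      # w = sum_i alpha_i y_i x_i
b = 0.37                                 # arbitrary: the b term vanishes anyway

# Lagrangian: (1/2)||w||^2 - sum_i alpha_i [y_i(<w, x_i> + b) - 1]
lagrangian = 0.5 * (w @ w) - np.sum(alpha * (y * (X @ w + b) - 1))

# Dual objective: -(1/2) sum_i sum_j alpha_i alpha_j y_i y_j <x_i, x_j> + sum_i alpha_i
K = X @ X.T
coef = alpha * y
dual = -0.5 * coef @ K @ coef + alpha.sum()

print(np.isclose(lagrangian, dual))      # True
```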
Because $\zeta$ is a vector; in the first link you will see they derive it element by element, more specifically: $$ \frac{\partial L}{\partial \zeta_i} = C - \alpha_i - \beta_i $$
It can perhaps be better understood if we look at the Lagrangian again:
$L(w,b,\zeta,\alpha,\beta) = ... + C\sum_{i=1}^l\zeta_i - \sum_{i=1}^l\alpha_i\zeta_i - \sum_{i=1}^l\beta_i\zeta_i$
$L(w,b,\zeta,\alpha,\beta) = ... + \sum_{i=1}^l(C\zeta_i -\alpha_i\zeta_i - \beta_i\zeta_i) = ... + \sum_{i=1}^l\zeta_i(C - \alpha_i - \beta_i)$
And from here we can see that the derivative with respect to $\zeta_i$ is $0$ if and only if $C - \alpha_i - \beta_i = 0$.
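A finite-difference check of that per-element derivative, with hypothetical values for $C$, $\alpha$, $\beta$, and $\zeta$ (only the $\zeta$-dependent part of the Lagrangian is needed):

```python
import numpy as np

rng = np.random.default_rng(2)
C = 1.5
alpha = rng.uniform(size=4)
beta = rng.uniform(size=4)
zeta = rng.uniform(size=4)

def L_zeta(z):
    # zeta-dependent part: C * sum_i z_i - sum_i alpha_i z_i - sum_i beta_i z_i
    return C * z.sum() - alpha @ z - beta @ z

i, eps = 2, 1e-6
z_plus = zeta.copy()
z_plus[i] += eps
grad_fd = (L_zeta(z_plus) - L_zeta(zeta)) / eps     # numeric d L / d zeta_i

print(np.isclose(grad_fd, C - alpha[i] - beta[i]))  # True
```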