Solved – Support Vector Machine Soft Margin

svm

In the soft margin SVM, can someone please give me the intuition for why a high value of the penalty parameter $C$ causes the SVM to tend towards a hard margin SVM? I am failing to see the logic behind this.

Looking at the objective function of the soft margin SVM, $L_P=\frac{1}{2}\|w\|^2+C \sum_{i=1}^N \xi_i$, I can see that a small value of $C$ (such as one that tends to zero) would mean that we have $L_P=\frac{1}{2}\|w\|^2+0$, which looks the same as the hard margin SVM objective. Unfortunately this is not the case: a high value of $C$ is what corresponds to a hard margin SVM. Can someone please give me a gentle insight into this concept?

Best Answer

$\xi_i$ is like the distance you're going to allow the $i$th point to fall inside the margin. If it is 0, you're not allowing the point inside the margin. If it is positive, you're allowing the point to fall inside the margin somewhat. (And no $\xi_i$ is negative, by definition.)
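For reference, here is the full soft margin primal with its constraints, which is where $\xi_i$ does its work. This assumes the standard formulation with labels $y_i \in \{-1, +1\}$ and a bias term $b$; neither is spelled out in the question.

$$
\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^N \xi_i
\quad \text{subject to} \quad
y_i\,(w^\top x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, N.
$$

Setting every $\xi_i = 0$ recovers the hard margin constraints $y_i(w^\top x_i + b) \ge 1$. Letting $C \to 0$ does not do that: it only makes slack free, so the optimizer can pick large $\xi_i$ and shrink $\|w\|$ without regard to the margin.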

If you allow the $i$th point to fall inside the margin, the objective function is penalized for it, specifically by $C\xi_i$. If $C$ is very large, the optimization will likely drive $\xi_i$ to 0: the decrease in $L_P$ gained by letting the $i$th point fall inside the margin (which lets $\|w\|$ become small, i.e. the margin wider) is overpowered by the increase in $L_P$ from the penalty $C\xi_i$.

So a large $C$ corresponds to a hard margin (and, for linearly separable data, the two coincide in the limit $C \to \infty$).
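The trade-off described above can be checked numerically on a toy problem. The following is a minimal sketch, not a real SVM solver: it assumes a hypothetical 1-D separable data set, and it minimizes the soft margin objective by brute-force grid search over $(w, b)$ rather than by quadratic programming. All names (`X`, `Y`, `slacks`, `objective`, `fit_by_grid`, `summarize`) are made up for illustration.

```python
# Toy 1-D data (hypothetical): linearly separable, with two points close to the boundary.
X = [-3.0, -2.0, -0.25, 0.25, 2.0, 3.0]
Y = [-1, -1, -1, 1, 1, 1]


def slacks(w, b):
    """Slack xi_i = max(0, 1 - y_i * (w * x_i + b)) for each point."""
    return [max(0.0, 1.0 - y * (w * x + b)) for x, y in zip(X, Y)]


def objective(w, b, C):
    """Soft margin primal objective: 0.5 * w^2 + C * sum of slacks."""
    return 0.5 * w * w + C * sum(slacks(w, b))


def fit_by_grid(C):
    """Brute-force minimizer over a coarse (w, b) grid; adequate for a toy only."""
    grid = [(i / 20, j / 20) for i in range(0, 121) for j in range(-20, 21)]
    return min(grid, key=lambda wb: objective(wb[0], wb[1], C))


def summarize(C):
    """Return the grid minimizer (w, b) and the total slack it leaves."""
    w, b = fit_by_grid(C)
    return w, b, sum(slacks(w, b))


for C in (0.1, 1.0, 1000.0):
    w, b, total_slack = summarize(C)
    print(f"C={C:<7} w={w:.2f} b={b:.2f} total slack={total_slack:.3f}")
```

On this data, the run with $C = 1000$ should land on $w = 4$, $b = 0$ with zero total slack, which is the hard margin solution (the closest points sit at $\pm 0.25$, so $w = 4$ puts them exactly on the margin). The smaller values of $C$ settle for a smaller $w$, i.e. a wider margin, and pay for it with positive slack.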

