Solved – Derivation of Support Vector Machine


I understand the derivation behind the support vector machine, but I have a question about the constraint equation.

Why do we have a constraint equation $\geq 1$ if $y_i = 1$ and $\leq -1$ if $y_i = -1$?
Can we use any arbitrary constant instead of 1? If not, what is the rationale behind this particular value?

Any help is highly appreciated.

Best Answer

Yes, you can have any arbitrary, strictly positive constant instead of 1.

Why? First some background.

Math and separating hyperplane:

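A support vector machine attempts to find a separating hyperplane between two sets of points $X$ and $Y$. Writing $\boldsymbol{x}_i$ for the points of $X$ and $\boldsymbol{y}_i$ for the points of $Y$ (so $\boldsymbol{y}_i$ here is a data point, not the label $y_i$ from the question), the condition for a separating hyperplane is: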

$$ \boldsymbol{w} \cdot \boldsymbol{x}_i - b < 0 \quad \quad \boldsymbol{w} \cdot \boldsymbol{y}_i - b > 0 $$

Observe that the inequalities are strict!

Numerical issues and practical solution:

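Numerically, this formulation has practical problems. Optimization routines typically work with non-strict inequalities, but if the inequalities above are simply made non-strict, $\boldsymbol{w} = \boldsymbol{0}, b = 0$ becomes a trivial solution. Keeping them strict is fragile too: standard floating-point arithmetic has finite precision, so a numerical routine may give bizarre answers that sit arbitrarily close to the boundary.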
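The trivial-solution problem can be checked directly. The following is a minimal sketch in plain Python; the toy points, the helper names `dot` and `separates`, and the example hyperplane are all assumptions made up for illustration, not part of the original answer.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def separates(w, b, X, Y, strict=True):
    """Check w.x - b < 0 on X and w.y - b > 0 on Y (<= and >= if not strict)."""
    if strict:
        return (all(dot(w, x) - b < 0 for x in X)
                and all(dot(w, y) - b > 0 for y in Y))
    return (all(dot(w, x) - b <= 0 for x in X)
            and all(dot(w, y) - b >= 0 for y in Y))


# Hypothetical toy data: X lies left of the line x1 = 2, Y lies right of it.
X = [(0.0, 0.0), (1.0, 0.5)]
Y = [(3.0, 1.0), (4.0, 0.0)]

real_plane = ((1.0, 0.0), 2.0)  # w = (1, 0), b = 2
trivial = ((0.0, 0.0), 0.0)     # w = 0, b = 0

print(separates(*real_plane, X, Y, strict=True))  # True
print(separates(*trivial, X, Y, strict=True))     # False
print(separates(*trivial, X, Y, strict=False))    # True: the degenerate solution slips through
```

With strict inequalities only the genuine hyperplane passes; once they are relaxed to non-strict ones, $\boldsymbol{w} = \boldsymbol{0}, b = 0$ passes as well, even though it separates nothing.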

What to do? Let's replace the strict inequalities with non-strict inequalities plus some separation constant $t > 0$:

$$ \boldsymbol{w} \cdot \boldsymbol{x}_i - b \leq -t \quad \quad \boldsymbol{w} \cdot \boldsymbol{y}_i - b \geq t $$

Yay! Numerical optimization can handle this. Also observe that since $\boldsymbol{w}$ and $b$ are choice variables, the scale of $t$ really doesn't matter: if $(\boldsymbol{w}, b)$ satisfies the constraints for some $t > 0$, then $(\boldsymbol{w}/t, b/t)$ satisfies them for $t = 1$ and describes the same hyperplane. So $t$ is totally arbitrary, and we can make it simple for ourselves and choose 1. (You could even choose different positive values for the two inequalities; it doesn't matter.)

$$ \boldsymbol{w} \cdot \boldsymbol{x}_i - b \leq -1 \quad \quad \boldsymbol{w} \cdot \boldsymbol{y}_i - b \geq 1 $$
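The claim that the constant is arbitrary can be illustrated numerically. This is a sketch under stated assumptions: the toy data, the helper names `satisfies_margin` and `rescale`, and the choice of 4 as the alternative constant are hypothetical. It shows that a pair $(\boldsymbol{w}, b)$ feasible for constant 4 becomes feasible for constant 1 after dividing both by 4.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def satisfies_margin(w, b, X, Y, t):
    """Check w.x - b <= -t on X and w.y - b >= t on Y."""
    return (all(dot(w, x) - b <= -t for x in X)
            and all(dot(w, y) - b >= t for y in Y))


def rescale(w, b, t):
    """Divide (w, b) by t > 0; the hyperplane w.x = b is unchanged."""
    return tuple(wi / t for wi in w), b / t


# Hypothetical toy data, separable by the line x1 = 2.
X = [(0.0, 0.0), (1.0, 0.5)]
Y = [(3.0, 1.0), (4.0, 0.0)]

t = 4.0
w_t, b_t = (4.0, 0.0), 8.0          # feasible for separation constant t = 4
w_1, b_1 = rescale(w_t, b_t, t)     # same hyperplane, rescaled

print(satisfies_margin(w_t, b_t, X, Y, t))    # True
print(w_1, b_1)                               # (1.0, 0.0) 2.0
print(satisfies_margin(w_1, b_1, X, Y, 1.0))  # True
```

Both pairs describe the same line $x_1 = 2$; only the scale of $(\boldsymbol{w}, b)$ differs, which is why fixing the constant to 1 loses nothing.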

Other interpretation:

As your text explains, another interpretation of this is that you're fitting two parallel hyperplanes, one touching the $X$ set and one touching the $Y$ set, with some distance between them.
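To make the distance concrete, here is a short worked equation. It relies on the standard point-to-hyperplane distance formula, which the answer above does not state, so treat it as a sketch. For a separation constant $t$, the two parallel hyperplanes are $\boldsymbol{w} \cdot \boldsymbol{z} - b = -t$ and $\boldsymbol{w} \cdot \boldsymbol{z} - b = t$ (with $\boldsymbol{z}$ a generic point, a symbol introduced here), and the distance between them is

$$ d = \frac{|t - (-t)|}{\|\boldsymbol{w}\|} = \frac{2t}{\|\boldsymbol{w}\|} $$

Rescaling to $(\boldsymbol{w}/t, b/t)$ with constant 1 gives $2 / \|\boldsymbol{w}/t\| = 2t / \|\boldsymbol{w}\|$, the same distance. This is another way to see that the choice of constant does not change the geometry.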
