Solved – How to calculate decision boundary from support vectors

machine-learning, svm

I want to obtain the decision boundary of an SVM using OpenCV 2.4.11, but it seems that it does not return the boundary explicitly, only the support vectors.

How can we calculate the decision boundary from the support vectors?

Update:

In any case, it seems that via the OpenCV interface I can't get the Lagrange multipliers $\alpha_i$.

Best Answer

A binary SVM tries to separate subjects belonging to one of two classes based on some features. The class will be denoted as $y_i \in \{-1,+1\}$ and the features as $x_i$; note that $y_i$ is a single number while $x_i$ is a vector. The index $i$ identifies the subject.

The SVM solves a quadratic programming problem and finds, for each subject, a Lagrange multiplier $\alpha_i$; many of the $\alpha_i$ are zero. Note that the $\alpha_i$ are numbers, so for each subject $i$ you have a number $y_i$, a feature vector $x_i$ and a Lagrange multiplier $\alpha_i$ (a number).

You have also chosen a kernel $K(x,y)$ ($x$ and $y$ are vectors) for which you know the functional form, and you have chosen a capacity parameter $C$.
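For concreteness, here is a minimal sketch of two common kernel choices; the function names and the RBF parameter `gamma` are my own, not from the article:

```python
import numpy as np

def linear_kernel(x, y):
    # K(x, y) = <x, y>, the plain inner product
    return np.dot(x, y)

def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2); gamma is a kernel parameter you choose
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(y)) ** 2))
```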

The $x_i$ for which the corresponding $\alpha_i$ are non-zero are the support vectors.

To compute your decision boundary, I refer to this article. Formula (61) of that article shows that the decision boundary has the equation $f(x)=0$, where $f(x)=\sum_i \alpha_i y_i K(x_i, x) + b$. As the $\alpha_i$ are only non-zero for the support vectors, this becomes (SV being the set of support vectors): $f(x)=\sum_{i \in SV} \alpha_i y_i K(s_i, x) + b$ (where I changed $x_i$ to $s_i$, as in formula (61) of the article, to indicate that only support vectors appear).

As you know all the support vectors, you know the (non-zero) $\alpha_i$, the corresponding (number) $y_i$ and the vectors $s_i$, so you can compute this $f(x)$ once you know your kernel $K$ and the constant $b$.
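As a hedged sketch of that evaluation (assuming arrays `support_vectors`, `alphas` and `labels`, one entry per support vector, a kernel function as above, and a scalar `b` that still has to be computed, see below):

```python
def decision_value(x, support_vectors, alphas, labels, b, kernel):
    # f(x) = sum_{i in SV} alpha_i * y_i * K(s_i, x) + b
    return sum(a * y * kernel(s, x)
               for a, y, s in zip(alphas, labels, support_vectors)) + b
```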

So we still have to find $b$: equations (55) and (56) of the article I referred to show that, for an arbitrary $\alpha_i$ with $0 < \alpha_i < C$ ($C$ is a parameter of your SVM), it holds that $y_i ( x_i \cdot w + b) = 1$, where $w$ is given by equation (46) of the article. So $b = \frac{1}{y_i} - x_i \cdot w=\frac{1}{y_i} - \sum_{k \in SV} \alpha_k y_k K(x_i, s_k)$. (The '$\cdot$' is the inner product, which is later replaced by the kernel; see the article I referred to.)

The latter holds for any $\alpha_i$ with $0 < \alpha_i < C$, so if you choose one such $\alpha_i$ (strictly positive and smaller than $C$), you can compute $b$ from the corresponding observation: take an $m$ for which $0 < \alpha_m < C$; to this $m$ there correspond an $x_m$ and a $y_m$. On the other hand, you know all the support vectors $s_k$ (which are the vectors $x_k$ with non-zero $\alpha_k$, see above) together with their corresponding $y_k$ and $\alpha_k$. With all these you can compute $f(x)$, and $f(x)=0$ identifies the decision boundary; the sign of $f(x)$ determines the class.
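A sketch of the computation of $b$, assuming you know the index $m$ of one support vector with $0 < \alpha_m < C$ (array names as in the sketch above):

```python
def intercept(m, support_vectors, alphas, labels, kernel):
    # b = 1/y_m - sum_{k in SV} alpha_k * y_k * K(x_m, s_k)
    x_m, y_m = support_vectors[m], labels[m]
    return 1.0 / y_m - sum(a * y * kernel(x_m, s)
                           for a, y, s in zip(alphas, labels, support_vectors))
```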

So to summarise:

  1. Take the subjects $k$ that correspond to the support vectors, i.e. those for which the Lagrange multiplier $\alpha_k$ is non-zero; for each such subject you have the Lagrange multiplier $\alpha_k$, the class $y_k \in \{-1,+1\}$, and the feature vector $x_k$ (denoted as $s_k$ to make clear that they are support vectors);
  2. Take one subject for which the Lagrange multiplier is strictly smaller than $C$ and strictly positive, say subject $m$, and use it to compute $b$ as $b =\frac{1}{y_m} - \sum_{k \in SV} \alpha_k y_k K(x_m, s_k)$;
  3. The equation that identifies the decision boundary is $f(x)=0$, where $f(x)=\sum_{k \in SV} \alpha_k y_k K(s_k, x) + b$; a toy sketch putting these steps together follows below.
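Putting the pieces together, a toy usage sketch that reuses the `linear_kernel`, `decision_value` and `intercept` functions defined above; the numbers are made up for illustration (they are not real SVM output), only the formulas come from the article:

```python
import numpy as np

# Hypothetical support vectors, multipliers and labels (illustration only)
support_vectors = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -1.0]])
alphas          = np.array([0.3, 0.2, 0.5])   # all non-zero by definition of SV
labels          = np.array([+1, +1, -1])
C               = 1.0
m               = 0                           # index with 0 < alpha_m < C

b = intercept(m, support_vectors, alphas, labels, linear_kernel)
x_new = np.array([0.5, 0.5])
f = decision_value(x_new, support_vectors, alphas, labels, b, linear_kernel)
print("f(x) =", f, "-> class", +1 if f >= 0 else -1)
```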