I need to build a Large Margin Classifier using the Python library cvxopt, which allows me to solve the quadratic program.
I am trying to write a Python function that takes training data and some test data and returns the support vectors and the distance of each test data point from the optimal hyperplane.
I am struggling, however, to understand cvxopt's output and how to use it to find the support vectors and, from there, the distances.
My function so far looks like this:
import numpy as np
from cvxopt import matrix, solvers

def lmc(X, t, Xtest):
    numSamples = len(X)
    # P_ij = t_i * t_j * <x_i, x_j>  (np.inner(X, X) is the Gram matrix X X^T)
    P = matrix(np.outer(t, t) * np.inner(X, X))
    q = matrix(np.ones(numSamples) * -1)           # objective: -1^T alpha
    G = matrix(np.diag(np.ones(numSamples) * -1))  # inequality: -alpha_i <= 0
    h = matrix(np.zeros(numSamples))
    A = matrix(t, (1, numSamples), 'd')            # equality: t^T alpha = 0 (as doubles)
    b = matrix(0.0)
    solution = solvers.qp(P, q, G, h, A, b)
    return solution
The output of this function is a dictionary whose entries include large vectors s, x, and z, and a single value for y. My knowledge of support vector machines is pretty sketchy, so any help would be greatly appreciated.
Best Answer
Using the notation and steps provided by Tristan Fletcher, the general approach to solving the SVM problem is the following. The full article and Python code are also available online.
Rewriting the problem in an appropriate format
Since we will solve this optimization problem using the CVXOPT library in Python, we need to match the solver's API, which, according to the documentation, is of the form:
$$ \begin{aligned} \min \ & \frac{1}{2} x^T P x + q^T x \\ \text{s.t.} \ & \ Gx \leq h \\ & \ Ax = b \end{aligned} $$
with the corresponding API call solvers.qp(P, q, G, h, A, b).
Recall that the dual problem is expressed as:
$$ \max_{\alpha} \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{m} y^{(i)} y^{(j)} \alpha_i \alpha_j \langle x^{(i)}, x^{(j)} \rangle $$
Let $\mathbf{H}$ be the matrix such that $H_{i,j} = y^{(i)} y^{(j)} \langle x^{(i)}, x^{(j)} \rangle$; then the optimization becomes:
$$ \begin{aligned} \max_{\alpha} \ & \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \alpha^T \mathbf{H} \alpha \\ \text{s.t.} \ & \ \alpha_i \geq 0 \\ & \ \sum_{i=1}^{m} \alpha_i y^{(i)} = 0 \end{aligned} $$
We convert the sums into vector form and multiply both the objective and the inequality constraint by $-1$, which turns this into a minimization problem and reverses the inequality:
$$ \begin{aligned} \min_{\alpha} \ & \frac{1}{2} \alpha^T \mathbf{H} \alpha - 1^T \alpha \\ \text{s.t.} \ & \ -\alpha_i \leq 0 \\ & \ y^T \alpha = 0 \end{aligned} $$
We are now ready to convert our numpy arrays into the cvxopt format. Using the same notation as in the documentation, this gives the conversion sketched below.
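As a minimal sketch, assuming $y$ is an $(m,)$ numpy array of $\pm 1$ labels and $\mathbf{H}$ has already been computed as a numpy array (its vectorized construction is shown in the next section):

import numpy as np
from cvxopt import matrix

m = len(y)
P = matrix(H)                                # quadratic term: (1/2) alpha^T H alpha
q = matrix(-np.ones((m, 1)))                 # linear term: -1^T alpha
G = matrix(-np.eye(m))                       # inequality: -alpha_i <= 0
h = matrix(np.zeros(m))
A = matrix(y.astype(float).reshape(1, -1))   # equality: y^T alpha = 0
b = matrix(np.zeros(1))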
Note that in the simple example of $m = 2$, the matrix $G$ and the vector $h$ which define the inequality constraint are
$G = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} $ and $h = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $
Computing the matrix $\mathbf{H}$ in vectorized form
Consider the simple example with two input samples $x^{(1)}, x^{(2)} \in \mathbb{R}^2$, i.e. two-dimensional vectors such as $x^{(1)} = (x_1^{(1)}, x_2^{(1)})^T$:
$$ X = \begin{bmatrix} x_1^{(1)} & x_2^{(1)} \\ x_1^{(2)} & x_2^{(2)} \end{bmatrix} \ \ \ y = \begin{bmatrix} y^{(1)} \\ y^{(2)} \end{bmatrix} $$
We now proceed to creating a new matrix $X'$ in which each input sample $x$ is multiplied by the corresponding output label $y$. This can be done easily in numpy using broadcasting.
$$X' = \begin{bmatrix} x^{(1)}_1 y^{(1)} & x^{(1)}_2y^{(1)} \\ x^{(2)}_1y^{(2)} & x^{(2)}_2y^{(2)} \end{bmatrix} $$
Finally, we take the matrix product of $X'$ and its transpose, giving $H = X' X'^T$:
$$ H = X' X'^T = \begin{bmatrix} x^{(1)}_1 y^{(1)} & x^{(1)}_2 y^{(1)} \\ x^{(2)}_1 y^{(2)} & x^{(2)}_2 y^{(2)} \end{bmatrix} \begin{bmatrix} x^{(1)}_1 y^{(1)} & x^{(2)}_1 y^{(2)} \\ x^{(1)}_2 y^{(1)} & x^{(2)}_2 y^{(2)} \end{bmatrix} $$
$$ H = \begin{bmatrix} x^{(1)}_1 x^{(1)}_1y^{(1)}y^{(1)} + x^{(1)}_2x^{(1)}_2y^{(1)}y^{(1)} & x^{(1)}_1 x^{(2)}_1y^{(1)}y^{(2)} + x^{(1)}_2x^{(2)}_2y^{(1)}y^{(2)} \\ x^{(2)}_1 x^{(1)}_1y^{(2)}y^{(1)} + x^{(2)}_2x^{(1)}_2y^{(2)}y^{(1)} & x^{(2)}_1 x^{(2)}_1y^{(2)}y^{(2)} + x^{(2)}_2x^{(2)}_2y^{(2)}y^{(2)} \end{bmatrix}$$
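In numpy, both steps reduce to two lines via broadcasting. A sketch, assuming X is an $(m, n)$ array of samples and y an $(m,)$ array of labels:

import numpy as np

X_dash = y.reshape(-1, 1) * X   # scale each row x^(i) by its label y^(i)
H = X_dash @ X_dash.T           # H_ij = y^(i) y^(j) <x^(i), x^(j)>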
Implementation in Python
Toy data set
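The original toy data set is not reproduced here; as an illustrative stand-in, two linearly separable Gaussian blobs serve the same purpose (the centers and seed are arbitrary choices):

import numpy as np

np.random.seed(1)
# ten points around (-2, -2) labelled -1, ten points around (2, 2) labelled +1
X = np.vstack((np.random.randn(10, 2) - 2, np.random.randn(10, 2) + 2))
y = np.hstack((-np.ones(10), np.ones(10)))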
CVXOPT solver and resulting $\alpha$
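Combining the toy data with the conversion above, a sketch of the solver call follows. Note that solution['x'] holds the primal variable of the QP, i.e. the optimal $\alpha$, which answers the question about the solver's x, y, s, z output:

import numpy as np
from cvxopt import solvers

solvers.options['show_progress'] = False    # silence the iteration log
solution = solvers.qp(P, q, G, h, A, b)
alphas = np.array(solution['x']).flatten()  # optimal alpha_i for each sample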
Compute $w$ and $b$ parameters
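From the $\alpha$'s, $w = \sum_i \alpha_i y^{(i)} x^{(i)}$, and $b$ can be recovered from any support vector (a point with $\alpha_i > 0$); a small tolerance picks the support vectors numerically, and averaging $b$ over them adds stability:

w = ((alphas * y)[:, None] * X).sum(axis=0)  # w = sum_i alpha_i y_i x_i
S = alphas > 1e-4                            # support vectors: nonzero alpha
b = np.mean(y[S] - X[S] @ w)                 # b = y_s - w . x_s, averaged over s in S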
Result
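With $w$ and $b$ in hand, the quantities the question asks for follow directly: the support vectors are the rows of X with nonzero $\alpha$, and the signed distance of a point from the hyperplane is $(w^T x + b) / \lVert w \rVert$. A sketch, with an illustrative Xtest standing in for the test points from the question:

support_vectors = X[S]                            # points with nonzero alpha
Xtest = np.array([[0.0, 1.0], [3.0, 3.0]])        # illustrative test points
distances = (Xtest @ w + b) / np.linalg.norm(w)   # signed distance to the hyperplane
predictions = np.sign(Xtest @ w + b)              # predicted class labels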