[Math] How to map quadratic programming formulation to dual soft margin SVM

machine learningoptimizationquadratic programming

I am trying to use quadratic programming for SVM and I am confused about how to map SVM formulation to quadratic programming formulation given in CVXOPT (Python package).
This is what CVXOPT gives us

This is what I need to map (dual soft margin support vector machine formulation):

Can someone explain how those two map..?
Thank you

Best Answer

The CVXOPT software solves quadratic programs of the form $$ \begin{array}{rl} \min\ & \frac{1}{2}x^TPx+q^Tx \\ \text{s.t.}\ & Gx\leq h \\ & Ax=b \end{array} $$

The problem you wish to solve is $$ \begin{array}{rl} \max\ & \sum_i\alpha_i-\frac{1}{2}\big\|\sum_i\alpha_iy_ix_i\big\|^2 \\ \text{s.t.} & -\alpha_i\leq0\text{ for all }i \\ & \alpha_i\leq{C}\text{ for all }i \\ & \sum_iy_i\alpha_i=0 \end{array} $$

I'm assuming here $x_i\in\mathbb{R}^d$ are vectors and $y_i\in\mathbb{R}$ are scalars for $i=1,\dots,n$.

Objective Function

Let's look at the expression

$$\bigg\|\sum_i\alpha_iy_ix_i\bigg\|^2.$$

Let's define the vector $z_i=y_ix_i$ for all $i=1,\dots,n$. Define the matrix $Z\in\mathbb{R}^{d\times{n}}$ to be the matrix whose $i^\text{th}$ column is $z_i$. Then the expression above can be written as

$$ \|Z\alpha\|^2=\alpha^TZ^TZ\alpha $$

where $\alpha\in\mathbb{R}^n$ is the vector of $\alpha_i$ variables, and we are using the two norm.

Hence our objective function is

$$\max\ \sum_i\alpha_i-\frac{1}{2}\bigg\|\sum_i\alpha_iy_ix_i\bigg\|^2=\max\ -\frac{1}{2}\alpha^TZ^TZ\alpha+\mathbf{1}^T\alpha=\min\ \frac{1}{2}\alpha^TZ^TZ\alpha-\mathbf{1}^T\alpha$$

where $\mathbf{1}$ is the vector of all ones. This is exactly the form required by CVXOPT: take $P=Z^TZ$ and $q=-\mathbf{1}$.

Inequality Constraints

The constraints $-\alpha_i\leq0$ can be written as

$$ -I\alpha\leq\mathbf{0} $$

where $I$ is the $n\times{n}$ identity matrix, and $\mathbf{0}$ is the zero vector. Similarly, the constraints $\alpha_i\leq{C}$ can be written in matrix form as

$$ I\alpha\leq C\mathbf{1}. $$

If we stack these two matrix systems on top of each other, then our inequality constraints can be summarized as $$ \begin{bmatrix}-I\\I\end{bmatrix}\alpha\leq\begin{bmatrix}\mathbf{0}\\C\mathbf{1}\end{bmatrix}. $$

This is the form required by CVXOPT: take

$$ G=\begin{bmatrix}-I\\I\end{bmatrix},\ h=\begin{bmatrix}\mathbf{0}\\C\mathbf{1}\end{bmatrix}. $$

Equality Constraints

I think you can take it from here.

Once you have all these matrices and vectors created, just pass them to solvers.qp() and you should be good to go.

Best Answer

Related Solutions

[Math] How to compute the offset parameter for an SVM from the dual solution

[Math] How to solve the dual problem of SVM

Related Question