Convert SVM into convex QP

convex optimization, optimization, quadratic programming

I have the following minimization:

$$\min\limits_{\mathbf{x}}\frac{1}{2}\|\mathbf{x}\|^{2}_{2}+\frac{1}{n}\sum_{i=1}^{n}\max(0,1-\mathbf{x}^{T}\mathbf{a}_{i})$$

The problem asks me to show that it is convex, which I did without any problems. It then asks me to rewrite the problem as a convex QP in order to find its Lagrangian and KKT conditions. The last step of the exercise is to show that the dual problem can also be written as a QP, but with box constraints.

Here are the steps I took in order to convert the primal as a QP:

Let us define $p_i=\max(0,1-\mathbf{x}^{T}\mathbf{a}_{i})$. Since the objective minimizes $\frac{1}{n}\sum_{i}p_i$, this definition can be replaced by the constraints:

$$\begin{align}
-p_i&\leq0\\
1-p_i&\leq\mathbf{x}^{T}\mathbf{a}_{i}
\end{align}$$

or, in vector form, where $\mathbf{A}$ is the matrix whose rows are $\mathbf{a}_{i}^{T}$:

$$\begin{align}
-\mathbf{p}&\leq\mathbf{0}\\
-\mathbf{p}-\mathbf{A}\mathbf{x}&\leq-\mathbf{1}
\end{align}$$

And define the vector $\mathbf{z}=\left[\mathbf{x}^{T} \quad \mathbf{p}^{T}\right]^{T}$.

Then we can rewrite the problem as:

$$\begin{align}
&\min\limits_{\mathbf{z}}\frac{1}{2}\mathbf{z}^{T}\mathbf{P}\mathbf{z}+\mathbf{q}^{T}\mathbf{z}\\
&\text{ s.t. }\quad\mathbf{C}\mathbf{z}\leq \mathbf{b}
\end{align}$$

Where

$$\mathbf{P}=\begin{bmatrix}\mathbf{I} & \mathbf{0}\\
\mathbf{0} & \mathbf{0}\end{bmatrix},$$

$$\mathbf{q}=\begin{bmatrix}\mathbf{0}\\
\frac{1}{n}\mathbf{1}\end{bmatrix},$$

$$\mathbf{C}=\begin{bmatrix}\mathbf{0} & -\mathbf{I}\\
-\mathbf{A} & -\mathbf{I}\end{bmatrix},$$

and

$$\mathbf{b}=\begin{bmatrix}\mathbf{0}\\
-\mathbf{1}\end{bmatrix}.$$
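As a sanity check on this block structure, here is a minimal Python sketch using plain lists and made-up toy data. The helper names (`build_qp`, `qp_objective`, `is_feasible`) are hypothetical and chosen only for illustration. It assumes the rows of $\mathbf{A}$ are $\mathbf{a}_i^{T}$ and checks that, for a fixed $\mathbf{x}$, the smallest feasible $\mathbf{p}$ reproduces the original hinge-loss objective.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def hinge_objective(x, A):
    """Original objective: 0.5*||x||^2 + (1/n) * sum_i max(0, 1 - x^T a_i)."""
    n = len(A)
    return 0.5 * dot(x, x) + sum(max(0.0, 1.0 - dot(a, x)) for a in A) / n

def build_qp(A):
    """Assemble P, q, C, b for z = [x; p], with rows of A equal to a_i^T."""
    n, d = len(A), len(A[0])
    m = d + n
    # P = [[I, 0], [0, 0]]: identity on the x block, zero on the p block.
    P = [[1.0 if (i == j and i < d) else 0.0 for j in range(m)] for i in range(m)]
    q = [0.0] * d + [1.0 / n] * n
    # First n rows encode -p <= 0.
    C = [[0.0] * d + [-1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    # Last n rows encode -A x - p <= -1.
    C += [[-v for v in A[i]] + [-1.0 if j == i else 0.0 for j in range(n)]
          for i in range(n)]
    b = [0.0] * n + [-1.0] * n
    return P, q, C, b

def qp_objective(z, P, q):
    return 0.5 * dot(z, matvec(P, z)) + dot(q, z)

def is_feasible(z, C, b, tol=1e-12):
    return all(lhs <= rhs + tol for lhs, rhs in zip(matvec(C, z), b))

# Toy data (an assumption for illustration only).
A = [[1.0, 2.0], [-1.0, 0.5], [0.0, -3.0]]
x = [0.3, -0.2]
P, q, C, b = build_qp(A)

p_tight = [max(0.0, 1.0 - dot(a, x)) for a in A]   # smallest feasible p
z_tight = x + p_tight
z_loose = x + [pi + 0.5 for pi in p_tight]         # feasible, but larger p
z_short = x + [pi - 0.1 for pi in p_tight]         # violates -A x - p <= -1

print(qp_objective(z_tight, P, q), hinge_objective(x, A))
print(is_feasible(z_tight, C, b), is_feasible(z_loose, C, b), is_feasible(z_short, C, b))
```

Any feasible $\mathbf{p}$ larger than the tight one only increases the objective, which is why the minimizer of the QP recovers $p_i=\max(0,1-\mathbf{x}^{T}\mathbf{a}_{i})$.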

The constraints are affine and therefore convex, but $\nabla_{z}^{2}f(\mathbf{z})=\mathbf{P}\succeq0$ is only positive semidefinite. However, for my QP to be convex, I would need $\nabla_{z}^{2}f(\mathbf{z})=\mathbf{P}\succ0$, right?

What did I do wrong? Is there another way to formulate this problem in order to achieve the positive definiteness of my Hessian?

Best Answer

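A twice-differentiable function is convex if and only if its Hessian is positive semidefinite everywhere, and it is strictly convex if its Hessian is positive definite. So $\mathbf{P}\succeq0$ is all that a convex QP requires, and you did nothing wrong. Here $\mathbf{z}^{T}\mathbf{P}\mathbf{z}=\|\mathbf{x}\|_{2}^{2}\geq0$, so your problem is strictly convex in $\mathbf{x}$, but linear (and hence convex, though not strictly) in $\mathbf{p}$.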

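The second block element of $\mathbf{b}$ should be $-\mathbf{1}$, since $1-p_i\leq\mathbf{x}^{T}\mathbf{a}_{i}$ rearranges to $-\mathbf{a}_{i}^{T}\mathbf{x}-p_i\leq-1$.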
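For the remaining steps the question mentions (Lagrangian, KKT conditions, dual), the following is a sketch rather than a full solution. It assumes $\mathbf{A}$ has rows $\mathbf{a}_{i}^{T}$ and introduces multipliers $\boldsymbol{\lambda}$ and $\boldsymbol{\mu}$, names that do not appear in the question. The Lagrangian is

$$L(\mathbf{x},\mathbf{p},\boldsymbol{\lambda},\boldsymbol{\mu})=\frac{1}{2}\|\mathbf{x}\|_{2}^{2}+\frac{1}{n}\mathbf{1}^{T}\mathbf{p}-\boldsymbol{\mu}^{T}\mathbf{p}+\boldsymbol{\lambda}^{T}(\mathbf{1}-\mathbf{p}-\mathbf{A}\mathbf{x}),\qquad \boldsymbol{\lambda}\geq\mathbf{0},\ \boldsymbol{\mu}\geq\mathbf{0}.$$

Stationarity gives

$$\begin{align}
\nabla_{\mathbf{x}}L&=\mathbf{x}-\mathbf{A}^{T}\boldsymbol{\lambda}=\mathbf{0}\\
\nabla_{\mathbf{p}}L&=\frac{1}{n}\mathbf{1}-\boldsymbol{\mu}-\boldsymbol{\lambda}=\mathbf{0}
\end{align}$$

The second condition together with $\boldsymbol{\mu}\geq\mathbf{0}$ yields $\mathbf{0}\leq\boldsymbol{\lambda}\leq\frac{1}{n}\mathbf{1}$. Substituting $\mathbf{x}=\mathbf{A}^{T}\boldsymbol{\lambda}$ back into $L$ removes $\mathbf{p}$ and leaves the dual

$$\begin{align}
&\max\limits_{\boldsymbol{\lambda}}\ \mathbf{1}^{T}\boldsymbol{\lambda}-\frac{1}{2}\boldsymbol{\lambda}^{T}\mathbf{A}\mathbf{A}^{T}\boldsymbol{\lambda}\\
&\text{ s.t. }\quad\mathbf{0}\leq\boldsymbol{\lambda}\leq\frac{1}{n}\mathbf{1}
\end{align}$$

Since $\mathbf{A}\mathbf{A}^{T}\succeq0$, this is again a convex QP (a concave maximization), now with box constraints only. The remaining KKT conditions are primal feasibility and complementary slackness, $\mu_i p_i=0$ and $\lambda_i(1-p_i-\mathbf{a}_{i}^{T}\mathbf{x})=0$ for every $i$.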

Related Solutions

Optimization – Correct Dual to Given Primal Linear Programming Problem

Maybe it's worthwhile to talk through where the dual comes from. This will take a while, but hopefully the dual won't seem so mysterious when we're done.

Suppose we want to use the primal's constraints as a way to find an upper bound on the optimal value of the primal. If we multiply the first constraint by $9$, the second constraint by $1$, and add them together, we get $9(2x_1 - x_2) + 1(x_1 +3 x_2)$ for the left-hand side and $9(1) + 1(9)$ for the right-hand side. Since the first constraint is an equality and the second is an inequality, this implies $$19x_1 - 6x_2 \leq 18.$$ But since $x_1 \geq 0$, it's also true that $5x_1 \leq 19x_1$, and so $$5x_1 - 6x_2 \leq 19x_1 - 6x_2 \leq 18.$$ Therefore, $18$ is an upper bound on the optimal value of the primal problem.

Surely we can do better than that, though. Instead of just guessing $9$ and $1$ as the multipliers, let's let them be variables. Thus we're looking for multipliers $y_1$ and $y_2$ to force $$5x_1 - 6x_2 \leq y_1(2x_1-x_2) + y_2(x_1 + 3x_2) \leq y_1(1) + y_2(9).$$

Now, in order for this pair of inequalities to hold, what has to be true about $y_1$ and $y_2$? Let's take the two inequalities one at a time.

The first inequality: $5x_1 - 6x_2 \leq y_1(2x_1-x_2) + y_2(x_1 + 3x_2)$

We have to track the coefficients of the $x_1$ and $x_2$ variables separately. First, we need the total $x_1$ coefficient on the right-hand side to be at least $5$. Getting exactly $5$ would be great, but since $x_1 \geq 0$, anything larger than $5$ would also satisfy the inequality for $x_1$. Mathematically speaking, this means that we need $2y_1 + y_2 \geq 5$.

On the other hand, to ensure the inequality for the $x_2$ variable we need the total $x_2$ coefficient on the right-hand side to be exactly $-6$. Since $x_2$ could be positive, we can't go lower than $-6$, and since $x_2$ could be negative, we can't go higher than $-6$ (as the negative value for $x_2$ would flip the direction of the inequality). So for the first inequality to work for the $x_2$ variable, we've got to have $-y_1 + 3y_2 = -6$.

The second inequality: $y_1(2x_1-x_2) + y_2(x_1 + 3x_2) \leq y_1(1) + y_2(9)$

Here we have to track the $y_1$ and $y_2$ variables separately. The $y_1$ variable comes from the first constraint, which is an equality constraint. Whether $y_1$ is positive or negative, the equality still holds, so $y_1$ is unrestricted in sign. However, the $y_2$ variable comes from the second constraint, which is a less-than-or-equal-to constraint. Multiplying the second constraint by a negative number would flip its direction and change it to a greater-than-or-equal-to constraint. To keep with our goal of upper-bounding the primal objective, we can't let that happen. So the $y_2$ variable can't be negative, and we must have $y_2 \geq 0$.

Finally, we want to make the right-hand side of the second inequality as small as possible, as we want the tightest upper bound possible on the primal objective. So we want to minimize $y_1 + 9y_2$.

Putting all of these restrictions on $y_1$ and $y_2$ together, we find that the problem of using the primal's constraints to find the best upper bound on the optimal primal objective entails solving the following linear program:

$$\begin{align*} \text{Minimize }\:\:\:\:\: y_1 + 9y_2& \\ \text{subject to }\:\:\:\:\: 2y_1 + y_2& \geq 5 \\ -y_1 + 3y_2& = -6\\ y_2 & \geq 0. \end{align*}$$

And that's the dual.
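The bounding argument can be checked numerically. The sketch below is a sanity check on a grid of sample points, not a proof. The primal is not written out in this answer, so the code assumes the form implied by the discussion: maximize $5x_1 - 6x_2$ subject to $2x_1 - x_2 = 1$, $x_1 + 3x_2 \leq 9$, $x_1 \geq 0$, with $x_2$ free. All function names are hypothetical.

```python
TOL = 1e-9

def primal_objective(x1, x2):
    return 5 * x1 - 6 * x2

def primal_feasible(x1, x2):
    # Assumed primal constraints, inferred from the text above.
    return abs(2 * x1 - x2 - 1) <= TOL and x1 + 3 * x2 <= 9 + TOL and x1 >= -TOL

def dual_objective(y1, y2):
    return y1 + 9 * y2

def dual_feasible(y1, y2):
    return 2 * y1 + y2 >= 5 - TOL and abs(-y1 + 3 * y2 + 6) <= TOL and y2 >= -TOL

# Sample primal points on the equality line x2 = 2*x1 - 1.
primal_points = [(k / 10, 2 * (k / 10) - 1) for k in range(21)]
primal_points = [pt for pt in primal_points if primal_feasible(*pt)]

# Sample dual points on the equality line y1 = 3*y2 + 6.
dual_points = [(3 * (k / 10) + 6, k / 10) for k in range(21)]
dual_points = [pt for pt in dual_points if dual_feasible(*pt)]

# Weak duality: every feasible dual value bounds every feasible primal value.
weak_duality_holds = all(
    primal_objective(*px) <= dual_objective(*dy) + TOL
    for px in primal_points
    for dy in dual_points
)

best_primal = max(primal_objective(*pt) for pt in primal_points)
best_dual = min(dual_objective(*pt) for pt in dual_points)

print(weak_duality_holds)
print(dual_objective(9, 1))     # the guessed multipliers from the text
print(best_primal, best_dual)
```

The multipliers $(9, 1)$ from the first guess are dual feasible and give the bound $18$, while on this grid the best primal and dual values meet at $6$ (at $x = (0, -1)$ and $y = (6, 0)$), which is what strong duality for linear programs would lead one to expect.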

It's probably worth summarizing the implications of this argument for all possible forms of the primal and dual. The following table is taken from p. 214 of Introduction to Operations Research, 8th edition, by Hillier and Lieberman. They refer to this as the SOB method, where SOB stands for Sensible, Odd, or Bizarre, depending on how likely one would find that particular constraint or variable restriction in a maximization or minimization problem.

             Primal Problem                           Dual Problem
             (or Dual Problem)                        (or Primal Problem)

             Maximization                             Minimization

Sensible     <= constraint            paired with     nonnegative variable
Odd          =  constraint            paired with     unconstrained variable
Bizarre      >= constraint            paired with     nonpositive variable

Sensible     nonnegative variable     paired with     >= constraint
Odd          unconstrained variable   paired with     = constraint
Bizarre      nonpositive variable     paired with     <= constraint

[Math] Convert a non-convex QCQP into a convex counterpart

I finally came up with a solution, which seems to be a valid reformulation.

Expand $f_i(x) \ (i=0,1,\cdots,m)$ into the form below (for clarity, I drop the subscript on $f_i(x)$, $P_i$ and $q_i$, so the indices in the sums run over the components of $x$): $$f(x)=\frac{1}{2}\sum_i^{n}P_{ii}x_i^2+\sum_{i\neq j}P_{ij}x_i x_j+ \sum_i^n q_i x_i + r$$

Since $x \geq 0$, we can apply the change of variable $y_i=x_i^2$ (so that $x_i=\sqrt{y_i}$), which gives

$$f(y)=\frac{1}{2}\sum_i^{n}P_{ii}y_i+\sum_{i\neq j}P_{ij}\sqrt{y_i y_j}+ \sum_i^n q_i\sqrt{y_i} + r$$

The first term $\frac{1}{2}\sum_i^{n}P_{ii}y_i$ is linear in $y$. Moreover, $\sqrt{y_i y_j}$ is a geometric mean and hence concave, so the non-positive off-diagonal entries of $P$ make the second term convex. Similarly, $\sqrt{y_i}$ is concave, so the last term $\sum_i^n q_i\sqrt{y_i} + r$ is also convex when the entries of $q$ are non-positive as well. Then $f(y)$ is convex.
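The concavity of the geometric mean used here can be verified directly. As a short check (with $u=y_i>0$ and $v=y_j>0$ as shorthand introduced only for this computation), the Hessian of $g(u,v)=\sqrt{uv}$ is

$$\nabla^{2}g(u,v)=\frac{1}{4\sqrt{uv}}\begin{bmatrix}-\dfrac{v}{u} & 1\\ 1 & -\dfrac{u}{v}\end{bmatrix},$$

and for any direction $(s,t)$ its quadratic form is

$$\begin{bmatrix}s & t\end{bmatrix}\nabla^{2}g(u,v)\begin{bmatrix}s\\ t\end{bmatrix}=-\frac{1}{4\sqrt{uv}}\left(s\sqrt{\frac{v}{u}}-t\sqrt{\frac{u}{v}}\right)^{2}\leq 0.$$

So $\nabla^{2}g\preceq0$, $g$ is concave on the positive orthant, and $P_{ij}\sqrt{y_i y_j}$ is convex whenever $P_{ij}\leq0$.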

The "hint" in the problem is the constraint $x\geq 0$, which indicates the possibility of taking the square root.
