[Math] Dual of a semidefinite program

convex-optimization, convex-analysis, optimization, reference-request, semidefinite-programming

How do I write the dual of the following semidefinite program?

\begin{align}
\max_{\lambda,y_i}~&\lambda \\
\text{s.t.}~~&\sum_{i=1}^{L}y_i\mathbf{C}_i-\lambda\mathbf{I}\succeq 0 \\
&\sum_{i=1}^{L}y_i=1 \\
&y_i\geq 0
\end{align}

EDIT: This is not homework; it comes out of an engineering problem. Note that the standard form of an SDP problem is obtained if one omits the last two constraints. If someone can provide a theoretical process for tackling this problem, that would also be helpful. I mean, how do I start deriving a dual for an SDP problem? Can you point me to some references?
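For concreteness, here is a minimal sketch of the primal as stated, written with cvxpy (a Python modeling layer in the same spirit as the CVX package mentioned in the answer below). The dimensions and the matrices $C_i$ are random placeholders, not data from the original problem; the auxiliary symmetric variable `M` is introduced only to keep the semidefinite constraint on an explicitly symmetric object.

```python
import cvxpy as cp
import numpy as np

# Placeholder data: L random symmetric matrices C_i of size n x n
# (stand-ins for the C_i of the actual engineering problem).
rng = np.random.default_rng(0)
n, L = 4, 3
C = [(A + A.T) / 2 for A in rng.standard_normal((L, n, n))]

lam = cp.Variable()                       # the scalar lambda
y = cp.Variable(L, nonneg=True)           # y_i >= 0
M = cp.Variable((n, n), symmetric=True)   # stands for sum_i y_i C_i - lambda I

constraints = [
    M == sum(y[i] * C[i] for i in range(L)) - lam * np.eye(n),
    M >> 0,                               # the LMI (semidefinite) constraint
    cp.sum(y) == 1,
]
primal = cp.Problem(cp.Maximize(lam), constraints)
primal.solve()
print("primal optimal value:", primal.value)
```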

Best Answer

I'm sure the link @AC_MOSEK offers here will give you the instructions you need. I'm also fond of Optimization by Vector Space Methods by David Luenberger. It's a bit dated in some respects, but its treatment of Lagrange multipliers is particularly well suited for conic and semidefinite programming, in my view.

But what the heck, I'll do it for you right here.

But before I do, would you mind if I ask why you want the dual? If it is purely for the intellectual pursuit, or to help you understand your problem better, then that's great; those are very good reasons. But in my view---and I suspect I disagree with my friends at MOSEK here---it is not something your average optimizer should be doing to help them solve an SDP model. That's a task best left to the solver and/or modeling framework you're using. After all, the conversion process is tedious and error-prone; that makes it ideal for automation. If you make one little mistake, your "dual" will no longer have a useful relationship to the primal; and it won't necessarily be obvious why, until the results just look odd. And what about models that sit "in between" the primal and dual standard forms? It's not always clear which form is the best choice to convert to. So let the system take responsibility for that.

Unfortunately, most bare solvers for SDP out there today do not do the conversion for you. That's a shame, in my opinion. However, SDP-aware modeling frameworks like CVX (disclaimer: mine!), YALMIP, or MOSEK's own Fusion system are capable of it. (Am I right about Fusion, @AC_MOSEK?) CVX doesn't even ask your permission; it just picks whichever form is best for the solver!

Rant over. (Not really a rant, an appeal.) On to the problem.

The first thing to do is form the Lagrangian. Most students of constrained optimization will know what to do for linear equations and inequalities; the challenge is to get it right for the SDP constraint. A Lagrange multiplier for an SDP constraint is itself a positive semidefinite matrix $Z$. The corresponding term in the Lagrangian uses the standard inner product for symmetric matrices, $\langle X,Z \rangle\triangleq \sum_{ij} \bar{X}_{ij} Z_{ij}$ (here $\bar{X}_{ij}$ denotes the complex conjugate, which you can ignore if you're dealing with a real problem, of course). When both $X$ and $Z$ are positive semidefinite, this inner product is nonnegative.
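As a quick numeric sanity check in the real symmetric case (with made-up random matrices), the entrywise inner product equals $\mathop{\textrm{Tr}}(XZ)$ and is indeed nonnegative when both matrices are PSD:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_psd(n):
    """A random real positive semidefinite matrix (A A^T is always PSD)."""
    A = rng.standard_normal((n, n))
    return A @ A.T

X, Z = random_psd(4), random_psd(4)

inner = np.sum(X * Z)                      # <X, Z> = sum_ij X_ij Z_ij
print(np.isclose(inner, np.trace(X @ Z)))  # True: equals Tr(X Z) for symmetric X
print(inner >= 0)                          # True: nonnegative when both are PSD
```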

To prevent confusion, I always assume the primal is a minimization, so I have to negate your objective function to $-\lambda$ here. I hope you will agree this is not a big deal; and we'll reverse that at the end of the process, so we end up with a minimization, like we expect. The Lagrangian is

$$\textstyle L(\lambda,y,Z,v,w) = -\lambda - \langle Z, \sum_i y_i C_i -\lambda I \rangle - \langle v, 1 - \sum_i y_i\rangle - \langle w,y \rangle$$

In addition to our Lagrange multiplier $Z$ for the SDP constraint, we have a scalar multiplier $v\in\mathbb{R}$ for the equation, and $w\in\mathbb{R}^L_+$ for the linear inequalities. (A rule of thumb I use to get the signs right for inequalities: make sure you're subtracting a nonnegative quantity.)
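Here is a small spot check of the mechanism behind that sign rule, reusing the placeholder data `C`, `n`, `L`, `rng`, and `random_psd` from the sketches above: at any primal-feasible point, and for any $Z\succeq 0$, $w\geq 0$, and arbitrary $v$, the Lagrangian only subtracts nonnegative quantities from the (negated) objective $-\lambda$, so it lies below it.

```python
def lagrangian(lam_val, y_val, Z_val, v_val, w_val):
    """L(lambda, y, Z, v, w) for the minimization form with objective -lambda."""
    M_val = sum(y_val[i] * C[i] for i in range(L)) - lam_val * np.eye(n)
    return (-lam_val
            - np.sum(Z_val * M_val)        # <Z, sum_i y_i C_i - lambda I>
            - v_val * (1 - np.sum(y_val))  # v * (1 - sum_i y_i)
            - np.dot(w_val, y_val))        # <w, y>

# A primal-feasible point: uniform y, lambda = smallest eigenvalue of sum_i y_i C_i.
y_val = np.full(L, 1.0 / L)
lam_val = np.linalg.eigvalsh(sum(y_val[i] * C[i] for i in range(L))).min()

# Arbitrary multipliers with Z PSD and w nonnegative.
Z_val, v_val, w_val = random_psd(n), 1.7, np.abs(rng.standard_normal(L))
print(lagrangian(lam_val, y_val, Z_val, v_val, w_val) <= -lam_val)  # True
```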

The dual function is defined as $g(Z,v,w)=\inf_{\lambda,y} L(\lambda,y,Z,v,w)$. If you set the derivatives with respect to $\lambda$ and $y_i$ to zero, you get this: $$-1+\langle Z,I \rangle = 0 \quad\Longrightarrow\quad \mathop{\textrm{Tr}}(Z) = 1$$ $$-\langle Z,C_i\rangle + v - w_i = 0 \quad\Longrightarrow\quad v = \langle Z,C_i \rangle + w_i, \quad i=1,2,\dots,L$$ Note that these conditions don't involve $\lambda$ and $y$ at all. What this means is that the Lagrangian is unbounded below (that is, $g(Z,v,w)=-\infty$) unless these particular conditions are met. When they are met, the only term that does not cancel out is $-v$. So your dual function is $$g(Z,v,w) = \begin{cases} -v & \text{conditions above are met} \\ -\infty & \text{otherwise} \end{cases}$$

The dual problem in its raw form is $$\begin{array}{ll} \text{maximize} & g(Z,v,w) \\ \text{subject to} & Z\succeq 0 \\ & w\geq 0 \end{array}$$ Since we negated our primal problem to begin with, let's correct that here and turn the dual into a minimization: $$\begin{array}{ll} \text{minimize} & -g(Z,v,w) \\ \text{subject to} & Z\succeq 0 \\ & w\geq 0 \end{array}$$

Technically, this is the dual, and we could stop here. In practice, we never stop here, because $g$ is this weird function with implicit domain constraints; it's hard to have an intuitive sense of the dual in this form. Let's make those constraints explicit, leaving us with a linear objective: $$\begin{array}{ll} \text{minimize} & v \\ \text{subject to} & \mathop{\textrm{Tr}}(Z) = 1 \\ & v = \langle Z,C_i \rangle + w_i, ~~ i=1,2,\dots,L \\ & Z \succeq 0 \\ & w\geq 0 \end{array}$$

Now, I can eliminate $w$ rather simply by converting the $L$ equations to inequalities: $$\begin{array}{ll} \text{minimize} & v \\ \text{subject to} & \mathop{\textrm{Tr}}(Z) = 1 \\ & v \geq \langle Z,C_i \rangle, ~~ i=1,2,\dots,L \\ & Z \succeq 0\end{array}$$

If you really want to get ambitious, and you're willing to accept a nonlinear objective, you can eliminate $v$ as well. Convince yourself that this is it: $$\begin{array}{ll} \text{minimize} & \textstyle \max_{i=1,2,\dots,L} \langle Z, C_i \rangle \\ \text{subject to} & \mathop{\textrm{Tr}}(Z) = 1 \\ & Z \succeq 0\end{array}$$ Of course, most solvers require linear objectives and constraints, but this one is in my view more informative. And it's the one I'd enter into CVX, if I were doing it...

EDIT: In fact, your primal problem has a similar nonlinear equivalent: $$\begin{array}{ll} \text{maximize} & \lambda_{\min}(\textstyle \sum_i y_i C_i) \\ \text{subject to} & \textstyle \sum_i y_i= 1 \\ & y\geq 0 \end{array}$$ And of course, $\mathop{\textrm{Tr}}(Z)=\sum_i\lambda_i(Z)$, so both problems have an eigenvalue interpretation.
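Continuing the cvxpy sketch from above (same placeholder `C`, `n`, `L`, and the `primal` problem already solved there), here is one way the final nonlinear dual and the eigenvalue form of the primal might be spot-checked numerically. For this randomly generated data the primal is strictly feasible, so strong duality holds and all three optimal values should agree up to solver tolerance.

```python
# Dual in its final nonlinear form: minimize max_i <Z, C_i>  s.t.  Tr(Z) = 1, Z PSD.
Z = cp.Variable((n, n), PSD=True)
dual = cp.Problem(
    cp.Minimize(cp.max(cp.hstack([cp.trace(C[i] @ Z) for i in range(L)]))),
    [cp.trace(Z) == 1],
)
dual.solve()

# Eigenvalue form of the primal: maximize lambda_min(sum_i y_i C_i) over the simplex.
y2 = cp.Variable(L, nonneg=True)
S = cp.Variable((n, n), symmetric=True)   # stands for sum_i y_i C_i
primal_eig = cp.Problem(
    cp.Maximize(cp.lambda_min(S)),
    [S == sum(y2[i] * C[i] for i in range(L)), cp.sum(y2) == 1],
)
primal_eig.solve()

# All three optimal values should agree up to solver tolerance.
print(primal.value, dual.value, primal_eig.value)
```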
