[Math] Derive the dual of the semidefinite program $\max_Y{\rm Tr}(X^T Y)$ subject to $\begin{pmatrix}I&Y\\ Y^T&I\end{pmatrix}\succeq0$

convex optimizationconvex-analysisduality-theoremsreal-analysissemidefinite-programming

I have a semidefinite program (SDP) here, which is equivalent to the trace norm,

\begin{equation}
\begin{array}{ll}
\max_{Y \in \mathbb{R}^{m \times n}} & \text{trace}(X^T Y) \\
\text{subject to} &
\left[
\begin{array}{cc}
I_m & Y \\
Y^T & I_n
\end{array}
\right]
\succeq 0,
\end{array}
\label{eq:aa:primal}
\end{equation}

Now I want to derive its dual in a form,

\begin{equation}
\begin{array}{ll}
\min_{\substack{W_{1} \in \mathbb{S}^{m}, \\ W_{2} \in \mathbb{S}^{n}}} & \text{trace}(W_{1}) + \text{trace}(W_{2}) \\
\text{subject to} &
\left[
\begin{array}{cc}
W_{1} & (1/2) X \\
(1/2) X^T & W_{2}
\end{array}
\right]
\succeq 0,
\end{array}
\label{eq:aa:dual}
\end{equation}

How am I supposed to introduce the dual variables here?

Best Answer

EDIT: I mistakenly misread the original problem as a minimization, not a maximization, so this is wrong. For one thing, the dual is indeed a minimization. But there will also be some sign differences as well. My apologies to the OP, and feel free to remove your upvotes and/or downvote.

Let me go ahead and write out my suggestion. As I mentioned, the Lagrange multiplier for any SDP constraint $Q \succeq 0$ is a PSD matrix of the same size. Thus the Lagrangian is $$L(Y,Z) = \langle X, Y \rangle - \left\langle Z,\begin{bmatrix} I_m & Y \\ Y^T & I_n \end{bmatrix} \right\rangle$$ where $Z\in\mathcal{S}^{m+n}_+$. (I'm using the definition $\langle A,B \rangle = \mathop{\textrm{Tr}}(A^TB)$ here.) To facilitate simplification, we express $Z$ in a block form: $$Z \triangleq \begin{bmatrix} Z_{11} & Z_{12} \\ Z_{12}^T & Z_{22} \end{bmatrix}$$ $$\begin{aligned} L(Y,Z) &= \langle X, Y \rangle - \left\langle \begin{bmatrix} Z_{11} & Z_{12} \\ Z_{12}^T & Z_{22} \end{bmatrix}, \begin{bmatrix} I_m & Y \\ Y^T & I_n \end{bmatrix} \right\rangle\\ &= \langle X, Y \rangle - \langle Z_{11}, I_m \rangle - \langle Z_{22}, I_n \rangle - 2 \langle Z_{12}, Y \rangle \\ &= \langle X - 2 Z_{12}, Y \rangle - \langle Z_{11}, I_m \rangle - \langle Z_{22}, I_n\rangle \end{aligned}$$ The dual function is therefore $$g(Z) = \inf_Y L(Y,Z) = \begin{cases} - \langle Z_{11}, I_m \rangle - \langle Z_{22}, I_n \rangle & X - 2 Z_{12} = 0 \\ -\infty & X - 2 Z_{12} \neq 0 \end{cases}$$ The $-\infty$ arises from the fact that if $X\neq 2Z_{12}$, that linear expression is unbounded below. So the only way you get a bounded value for $g(Z)$ is if that first term is identically zero. (For nonlinear dual functions, you would do a more traditional minimization; e.g., by differentiating with respect to $Y$.)

Given this dual function, the dual problem is \begin{array}{ll} \text{maximize} & g(Z) \\ \text{subject to} & Z \succeq 0 \end{array} Technically, this is the true Lagrange dual. But in practice we move the implicit domain constraints out of the dual function. \begin{array}{ll} \text{maximize} & - \langle Z_{11}, I_m \rangle - \langle Z_{22}, I_n \rangle \\ \text{subject to} & 2 Z_{12} = X \\ & Z \succeq 0 \end{array} Eliminating $Z_{12}$ we obtain \begin{array}{ll} \text{maximize} & - \langle Z_{11}, I_m \rangle - \langle Z_{22}, I_n \rangle = -\mathop{\textrm{Tr}}(Z_{11}) - \mathop{\textrm{Tr}}(Z_{22}) \\ \text{subject to} & \begin{bmatrix} Z_{11} & (1/2) X \\ (1/2) X^T & Z_{22} \end{bmatrix} \succeq 0 \end{array} Of course, we have $W_1 \rightarrow Z_{11}$ and $W_2 \rightarrow Z_{22}$. And the true dual is a maximization, not a minimization as you have written it, so flip yours to a maximization (negating the objective) and we're done.

Related Question