Using the Schur Complement to Reformulate a Semidefinite Programming Constraint

convex-optimization, linear-algebra, schur-complement, semidefinite-programming

I am currently reading Boyd & Vandenberghe's Convex Optimization and have encountered (at least twice) reformulations of matrix inequalities that confuse me. I will try to state my question clearly and would be very grateful to anyone who can resolve it. I strongly believe that the issue is fundamentally related to the Schur complement.

Consider $A_i \in \mathbf{S}^n_{++}$, $B \in \mathbf{S}^n_+$, $b_i, d \in \mathbb{R}^n$, and $c_i, \lambda_i \in \mathbb{R}$. Why do we have
$$
\begin{bmatrix}
-\lambda_i - d^TA_id - 2b_i^Td - c_i & (A_id+b_i)^TB \\
B(A_id+b_i) & \lambda_iI-BA_iB
\end{bmatrix} \succeq 0 \Leftrightarrow
\begin{bmatrix}
-\lambda_i - c_i + b_i^TA_i^{-1}b_i & 0 & (d+A_i^{-1}b_i)^T \\
0 & \lambda_iI & B \\
d+A_i^{-1}b_i & B & A_i^{-1}
\end{bmatrix} \succeq 0
$$

This equivalence arises on page 415 of the book, where it is a constraint of a semidefinite program for finding the maximum-volume inscribed ellipsoid in an intersection of ellipsoids. I have one observation that I believe is very useful:

the (2,2)-block of the first $2 \times 2$ block matrix above is the Schur complement of the block $A_i^{-1}$ in the submatrix of the second $3 \times 3$ block matrix obtained by deleting its first block row and first block column.
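
To spell the observation out: deleting the first block row and column of the $3 \times 3$ matrix leaves
$$
\begin{bmatrix} \lambda_iI & B \\ B & A_i^{-1} \end{bmatrix},
$$
and the Schur complement of the block $A_i^{-1} \succ 0$ in this submatrix is
$$
\lambda_iI - B\left(A_i^{-1}\right)^{-1}B = \lambda_iI - BA_iB,
$$
which is precisely the (2,2)-block of the $2 \times 2$ matrix.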

As another example, consider $P \in \mathbb{R}^{m \times p}$, $q \in \mathbb{R}^m$, and $t, \lambda \in \mathbb{R}$. Why do we have
$$
\begin{bmatrix}
\lambda I - P^TP & P^Tq \\
q^TP & t
\end{bmatrix} \succeq 0 \Leftrightarrow
\begin{bmatrix}
I & P & q \\
P^T & \lambda I & 0 \\
q^T & 0 & I
\end{bmatrix} \succeq 0
$$

This equivalence arises on page 324 of the book, and is itself a constraint of the robust least-squares problem with the Euclidean norm. When duality is used to formulate the dual problem, the inequality on the left-hand side appears, and the author (and the original paper) seems to suggest that it is equivalent to the inequality on the right-hand side. Again,

the (1,1)-block of the first $2 \times 2$ block matrix is the Schur complement of the block $I$ in the submatrix of the second $3 \times 3$ block matrix obtained by deleting its last block row and last block column.
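
Explicitly: deleting the last block row and column of the $3 \times 3$ matrix leaves
$$
\begin{bmatrix} I & P \\ P^T & \lambda I \end{bmatrix},
$$
and the Schur complement of the (positive definite) block $I$ in this submatrix is
$$
\lambda I - P^T I^{-1} P = \lambda I - P^TP,
$$
which is precisely the (1,1)-block of the $2 \times 2$ matrix.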

I believe a universal trick for manipulating these matrix inequalities with the Schur complement exists here, but it does not seem at all trivial to me. I would be very grateful if anyone could elaborate on this for me!

Best Answer

First Equivalence:

Original constraint:

$$
\begin{bmatrix}
-\lambda_i - d^TA_id - 2b_i^Td - c_i & (A_id+b_i)^TB \\
B(A_id+b_i) & \lambda_iI-BA_iB
\end{bmatrix} \succeq 0
$$

The tool we need is the Schur complement characterization of positive semidefiniteness: if the block $C$ is positive definite, then

$$
\begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \succeq 0
\quad\Longleftrightarrow\quad
A - BC^{-1}B^T \succeq 0.
$$

Equivalence using the Schur complement:

$$
\begin{bmatrix}
-\lambda_i - c_i + b_i^TA_i^{-1}b_i & 0 & (d+A_i^{-1}b_i)^T \\
0 & \lambda_iI & B \\
d+A_i^{-1}b_i & B & A_i^{-1}
\end{bmatrix} \succeq 0
$$

In this $3 \times 3$ block matrix the bottom-right block is $A_i^{-1} \succ 0$, so the matrix is positive semidefinite exactly when the Schur complement of $A_i^{-1}$ in it is. Carrying out that elimination essentially reproduces the original $2 \times 2$ matrix: the (2,2)-block of the Schur complement is
$$
S_i = \lambda_iI - B\left(A_i^{-1}\right)^{-1}B = \lambda_iI - BA_iB,
$$
the (1,1)-block is $-\lambda_i - c_i + b_i^TA_i^{-1}b_i - (d+A_i^{-1}b_i)^TA_i(d+A_i^{-1}b_i) = -\lambda_i - d^TA_id - 2b_i^Td - c_i$, and the off-diagonal block is $-(A_id+b_i)^TB$, which differs from the original only in sign; flipping the sign of the first block row and column is a congruence transformation and does not affect positive semidefiniteness. The key step is exactly your observation: the (2,2)-block of the $2 \times 2$ matrix is the Schur complement of $A_i^{-1}$ in the trailing submatrix of the $3 \times 3$ matrix.
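
If it helps, here is a quick numerical sanity check of the first equivalence. This is only a sketch using NumPy (the names `A`, `B`, `b`, `d`, `c`, `lam` stand in for $A_i, B, b_i, d, c_i, \lambda_i$, and the sampling choices are arbitrary): it builds both block matrices from random data satisfying the stated assumptions and verifies that they are positive semidefinite for exactly the same data.

```python
import numpy as np

rng = np.random.default_rng(0)


def is_psd(M, tol=1e-8):
    # Smallest eigenvalue of the symmetrized matrix, with a small tolerance.
    return float(np.linalg.eigvalsh((M + M.T) / 2).min()) >= -tol


n = 4
for _ in range(1000):
    # Random data satisfying the assumptions: A positive definite, B positive semidefinite.
    G = rng.standard_normal((n, n))
    A = G @ G.T + np.eye(n)            # A_i
    H = rng.standard_normal((n, n))
    B = H @ H.T                        # B
    b = rng.standard_normal(n)         # b_i
    d = rng.standard_normal(n)
    c = rng.standard_normal()          # c_i
    lam = 5.0 * rng.standard_normal()  # lambda_i, either sign

    Ainv = np.linalg.inv(A)
    v = A @ d + b                      # A_i d + b_i
    w = d + Ainv @ b                   # d + A_i^{-1} b_i

    # Left-hand side: the 2x2 block matrix.
    M1 = np.block([
        [np.array([[-lam - d @ A @ d - 2 * b @ d - c]]), (v @ B)[None, :]],
        [(B @ v)[:, None],                               lam * np.eye(n) - B @ A @ B],
    ])

    # Right-hand side: the 3x3 block matrix.
    M2 = np.block([
        [np.array([[-lam - c + b @ Ainv @ b]]), np.zeros((1, n)), w[None, :]],
        [np.zeros((n, 1)),                      lam * np.eye(n),  B],
        [w[:, None],                            B,                Ainv],
    ])

    assert is_psd(M1) == is_psd(M2)

print("PSD status of the two block matrices agreed on every random trial.")
```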

Second Equivalence:

Original constraint:

$$
\begin{bmatrix} \lambda I - P^TP & P^Tq \\ q^TP & t \end{bmatrix} \succeq 0
$$

Equivalence using the Schur complement:

$$
\begin{bmatrix} I & P & q \\ P^T & \lambda I & 0 \\ q^T & 0 & I \end{bmatrix} \succeq 0
$$

Deleting the last block row and column of the $3 \times 3$ matrix leaves $\begin{bmatrix} I & P \\ P^T & \lambda I \end{bmatrix}$, and the Schur complement of the (positive definite) block $I$ in that submatrix is

$$
S = \lambda I - P^TP,
$$

which is exactly the (1,1)-block of the $2 \times 2$ matrix. Once again, a block of the small matrix is a Schur complement taken inside the large matrix, and the two constraints are related by applying the Schur complement characterization above to eliminate (or reintroduce) the positive definite identity blocks.

In both cases, the equivalence is obtained by recognizing the Schur complement of a block inside a larger block matrix. This technique is commonly used to manipulate matrix inequalities in convex optimization, and it simplifies both the analysis and the formulation of such problems.
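
For completeness, the characterization itself, namely that $C \succ 0$ implies $\begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \succeq 0 \Leftrightarrow A - BC^{-1}B^T \succeq 0$, can also be checked numerically in isolation. The sketch below (NumPy again; random data, nothing specific to the book's problems) does exactly that.

```python
import numpy as np

rng = np.random.default_rng(1)


def is_psd(M, tol=1e-8):
    return float(np.linalg.eigvalsh((M + M.T) / 2).min()) >= -tol


p, r = 3, 4
for _ in range(1000):
    # A symmetric with arbitrary inertia, B arbitrary, C positive definite.
    A0 = rng.standard_normal((p, p))
    A = (A0 + A0.T) / 2 + rng.uniform(-2.0, 6.0) * np.eye(p)
    B = rng.standard_normal((p, r))
    C0 = rng.standard_normal((r, r))
    C = C0 @ C0.T + np.eye(r)

    X = np.block([[A, B], [B.T, C]])
    S = A - B @ np.linalg.inv(C) @ B.T   # Schur complement of C in X

    assert is_psd(X) == is_psd(S)

print("[[A, B], [B^T, C]] is PSD exactly when A - B C^{-1} B^T is, on every trial.")
```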
