Approximating a rank-1 solution after applying the semidefinite relaxation method

convex-optimization, matrices, matrix-rank, relaxations, semidefinite-programming

For my problem, I originally want to optimize a vector $\mathbf{w}\in\mathbb{C}^{N}$. After applying the semidefinite relaxation (SDR) method, the variable becomes $\mathbf{W} = \mathbf{w}\mathbf{w}^H$, and once the rank-one constraint is dropped (only $\mathbf{W}\succeq 0$ is enforced) the relaxed problem is convex and can be solved efficiently. How can we then recover a rank-1 solution from the resulting PSD matrix (whose rank is probably > 1) and get back to a feasible solution $\mathbf{w}$?

One way I heard from others is to directly check the rank of the solution. If it happens to be rank 1, then we just take the eigenvector corresponding to the nonzero eigenvalue, scaled by the square root of that eigenvalue, as the solution $\mathbf{w}$ (unique up to a phase).
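For concreteness, here is a minimal sketch of that workflow in Python with CVXPY and NumPy. The problem size, the toy objective $\operatorname{tr}(\mathbf{A}\mathbf{W})$, and the trace constraint are placeholders I am assuming just to have something runnable; only the overall pattern (solve the PSD-relaxed problem, check the numerical rank, extract $\mathbf{w}$ from the dominant eigenpair) is what is described above.

```python
import numpy as np
import cvxpy as cp

# Toy problem data (placeholders, not from the original problem).
N = 8
rng = np.random.default_rng(0)
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
A = (A + A.conj().T) / 2  # Hermitian so that tr(A W) is real

# SDR: optimize over W = w w^H with the rank-one constraint dropped,
# keeping only W >= 0 (PSD) plus whatever convex constraints remain.
W = cp.Variable((N, N), hermitian=True)
constraints = [W >> 0, cp.real(cp.trace(W)) == 1]  # assumed constraint
prob = cp.Problem(cp.Minimize(cp.real(cp.trace(A @ W))), constraints)
prob.solve()

# Check the numerical rank of the solution via its eigenvalues.
W_opt = W.value
eigvals, eigvecs = np.linalg.eigh(W_opt)  # ascending order
numerical_rank = np.sum(eigvals > 1e-6 * eigvals[-1])

if numerical_rank == 1:
    # Numerically rank 1: W = lam * v v^H, so w = sqrt(lam) * v.
    w = np.sqrt(eigvals[-1]) * eigvecs[:, -1]
else:
    print(f"Relaxation returned rank {numerical_rank} > 1; "
          "a rank-one approximation step is needed.")
```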

But if it is not rank-1, what do we usually do to approximate a rank-1 solution from the solved PSD matrix that satisfies the objective and the constraints as much as possible? Meanwhile, I read a paper (reference 1 in this post) that also addresses this problem. In this paper, they cited another paper [20] (reference 2 in this post) and said

[…] the rank of the output of the relaxed solution may be high (>1). We use standard SDP methods to encourage low rank solutions [20] and then approximate a feasible rank one solution from the output of problem $\mathcal{P}_8$ [21]

After reading the cited paper [20], I found that it mainly deals with the rank-minimization problem, and I can barely see how its methods can be used while the objective and constraints of the original problem are still met. As for [21] (reference 3 in this post), it is just the linear algebra book by G. Strang, so I guess this step is something like an eigenvalue decomposition.


References

  1. E. Aryafar and A. Keshavarz-Haddad, "PAFD: Phased Array Full-Duplex," IEEE INFOCOM 2018 – IEEE Conference on Computer Communications, Honolulu, HI, USA, 2018, pp. 261-269.

  2. M. Fazel, H. Hindi and S. Boyd, "Rank minimization and applications in system theory," Proceedings of the 2004 American Control Conference, Boston, MA, USA, 2004, pp. 3273-3278, vol. 4.

  3. G. Strang, Introduction to Linear Algebra. Wellesley, MA: Wellesley-Cambridge Press, 2009.

Best Answer

I know of two ways this is handled in practice, and unfortunately neither of them is super satisfying from a mathematical perspective.

  1. The first option is to take the solution of the SDP relaxation and project it onto the (nonconvex) set of rank-one matrices. This is done (via the Eckart–Young theorem) by keeping only the rank-one matrix corresponding to the dominant eigenvalue/eigenvector pair of the relaxed solution and dropping the rest; a small numerical sketch follows this item. It can be shown that this rank-one matrix is the closest one (in the Frobenius norm) to the original solution of the SDP relaxation. However, as you have pointed out, this does not guarantee that the resulting rank-one matrix still satisfies the hard constraints of the original SDP. It might satisfy your other constraints by a lucky coincidence, but this is not guaranteed. Oftentimes, since the SDP is a relaxation anyway, a lot of articles breeze over this issue of potentially losing feasibility after performing the rank-one projection!
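A minimal sketch of that projection step, assuming `W_opt` is the (possibly high-rank) PSD solution returned by the relaxation, e.g. as computed in the sketch in the question:

```python
import numpy as np

def rank_one_projection(W_opt):
    """Closest rank-one PSD matrix to W_opt in the Frobenius norm
    (Eckart-Young): keep only the dominant eigenpair, drop the rest."""
    eigvals, eigvecs = np.linalg.eigh(W_opt)         # ascending order
    lam_max, v_max = eigvals[-1], eigvecs[:, -1]
    w_hat = np.sqrt(lam_max) * v_max                 # candidate vector w
    W_hat = lam_max * np.outer(v_max, v_max.conj())  # = w_hat w_hat^H
    # Note: W_hat minimizes the Frobenius distance to W_opt over rank-one
    # PSD matrices, but it need not satisfy the other SDP constraints.
    return w_hat, W_hat
```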

One potential remedy is to compute the projection of the SDP-relaxed solution onto the intersection of the set of rank-one matrices with your original constraints (if they are required for your desired solution). However, this is not a straightforward task and requires solving another optimization problem in its own right.
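Written out, that projection is itself another (nonconvex) problem. One hedged way to state it, with $\mathcal{C}$ standing for the original feasible set and $\mathbf{W}^\star$ for the relaxed solution, is

$$
\min_{\mathbf{w}\in\mathbb{C}^{N}} \; \left\| \mathbf{w}\mathbf{w}^H - \mathbf{W}^\star \right\|_F^2 \quad \text{subject to} \quad \mathbf{w}\mathbf{w}^H \in \mathcal{C},
$$

which recovers a feasible rank-one matrix as close as possible to $\mathbf{W}^\star$, but generally has to be attacked with local or heuristic methods.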

  2. The other remedy is, instead of solving an SDP relaxation, to directly parameterize the variables within a rank-one matrix and solve that optimization problem (see the sketch below). By hard-coding the rank-one structure, you can sometimes circumvent the "nonconvexity" issues of the underlying optimization problem, leading to efficient solution techniques.
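As a hedged illustration of this second option (the quadratic objective $\mathbf{w}^H\mathbf{A}\mathbf{w}$, the unit-norm constraint, and the use of a generic local solver are all my own placeholder assumptions, not part of the answer), one can optimize directly over $\mathbf{w}$ by splitting it into real and imaginary parts and handing the nonconvex problem to a local solver:

```python
import numpy as np
from scipy.optimize import minimize

# Toy data (same placeholder objective w^H A w as before).
N = 8
rng = np.random.default_rng(1)
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
A = (A + A.conj().T) / 2

def objective(x):
    # Real parameterization of the complex vector w.
    w = x[:N] + 1j * x[N:]
    return np.real(w.conj() @ A @ w)

def unit_norm(x):
    # Assumed constraint ||w||_2 = 1, standing in for the original ones.
    w = x[:N] + 1j * x[N:]
    return np.linalg.norm(w) - 1.0

x0 = rng.standard_normal(2 * N)
res = minimize(objective, x0, constraints={"type": "eq", "fun": unit_norm})
w_local = res.x[:N] + 1j * res.x[N:]  # locally optimal rank-one candidate
```

In this generic sketch the rank-one structure is built in, so no matrix $\mathbf{W}$ is ever formed, but the problem is nonconvex and only local optimality of $\mathbf{w}$ is guaranteed.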