Optimization – Maximizing Sum of Vector Norms

Tags: oc.optimization-and-control, singular-values

Given matrices $A, B \in \mathbb{R}^{n\times n}$, I would like to solve the following optimization problem,

$$\begin{array}{ll} \underset{v \in \mathbb{R}^n}{\text{maximize}} & \|Av\|_2+\|Bv\|_2\\ \text{subject to} & \|v\|_2 = 1\end{array}$$

I'm hoping to solve this with some sort of convex optimization approach, such as an SDP or SOCP. When $B=0$, the problem reduces to computing the largest singular value of $A$, which is well known to be expressible as an SDP; it can of course also be computed directly via the SVD.
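(As a quick numerical sanity check of the $B = 0$ case, here is a small illustration in NumPy; the example matrix and names are mine:)

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

# Largest singular value via the SVD; the top right singular vector
# attains max_{||v||=1} ||Av||_2.
U, S, Vh = np.linalg.svd(A)
v = Vh[0]  # top right singular vector
assert np.isclose(np.linalg.norm(A @ v), S[0])
```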

I've tried a number of different approaches but can't seem to write the problem in a convex way. I would be okay with an SVD-like solution, that is, an iterative method rather than a convex program, but I would much prefer a convex program, since ultimately I would like to use this as an inner problem within a larger one.


As a note on one attempt that got a decent way, I did manage to establish that the problem is equivalent to

$$\begin{array}{ll} \underset{u, v \in \mathbb{R}^n}{\text{maximize}} & \|A^T u + B^T v\|_2\\ \text{subject to} & \|u\|_2 = \|v\|_2 = 1\end{array}$$
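For the record, one way to see this equivalence uses the variational form of the Euclidean norm, $\|x\|_2 = \max_{\|u\|_2 = 1} u^T x$:

$$\|Ax\|_2 + \|Bx\|_2 = \max_{\|u\|_2 = \|v\|_2 = 1} \left( u^T A x + v^T B x \right) = \max_{\|u\|_2 = \|v\|_2 = 1} \left( A^T u + B^T v \right)^T x,$$

and maximizing over $\|x\|_2 = 1$ last turns the inner product into $\|A^T u + B^T v\|_2$.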

Best Answer

I have no doubt that someone will come up with some brighter idea, but here are my 2 cents anyway.

If you don't aim at something very fast, I would just use the inequality $(a+b)^2\le (ta^2+(1-t)b^2)(t^{-1}+(1-t)^{-1})=F_t(a,b)$ and find $\min_{t\in(0,1)}\max_v F_t(\|Av\|,\|Bv\|)$. The inner maximum is just $(t^{-1}+(1-t)^{-1})$ times the maximal eigenvalue of $tA^*A+(1-t)B^*B$, and the function $F_t$ is convex in $t$, so the maximum over $v$ is also convex in $t$; this makes simple one-dimensional tools for finding the minimum, like bisection, quite efficient.
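To make this concrete, here is a minimal sketch of the one-dimensional minimization (my own code, assuming $A$ and $B$ are given as NumPy arrays; SciPy's bounded scalar minimizer stands in for the bisection mentioned above):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def F_bound(t, A, B):
    # (t^{-1} + (1-t)^{-1}) * lambda_max(t A^T A + (1-t) B^T B),
    # an upper bound on (||Av|| + ||Bv||)^2 over unit vectors v.
    M = t * (A.T @ A) + (1 - t) * (B.T @ B)
    lam_max = np.linalg.eigvalsh(M)[-1]  # eigenvalues in ascending order
    return (1 / t + 1 / (1 - t)) * lam_max

def max_norm_sum(A, B, eps=1e-9):
    # Minimize the bound over t in (0, 1); by the argument above, the
    # square root of the minimum equals max_{||v||=1} ||Av|| + ||Bv||.
    res = minimize_scalar(lambda t: F_bound(t, A, B),
                          bounds=(eps, 1 - eps), method="bounded")
    return np.sqrt(res.fun), res.x  # (optimal value, optimal t)
```

A brute-force search over random unit vectors should then approach the returned value from below.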

The reason it will work is simple: the maximal unit eigenvector $v$ moves continuously with $t$ (when you hit a multiple eigenvalue, you can slowly rotate it from one limiting position to the other while keeping $t$ constant), so at some moment in this process $\|Av\|/\|Bv\|=(1-t)/t$. At that moment the Cauchy–Schwarz inequality becomes an equality, so the sum of the norms attains the upper bound coming from the quadratic relaxation (I assume in this argument that $A,B$ are both non-degenerate). Thus, the minimax we created really equals the maximin.

Note that this all works nicely if you want just the value of the maximum, not the maximizing vector itself. The difficulty with the latter is that the minimum in $t$ may be attained where the maximal eigenvalue is multiple, in which case not every eigenvector is good, only the one for which the ratio of the norms is just right. Fortunately, generically this should not happen, but you should be prepared for this nuisance.
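In the generic case, the maximizer can then be read off from the top eigenvector at the optimal $t$, with the ratio from the argument above used as a safeguard (again a sketch under the same assumptions, continuing the code above):

```python
def recover_maximizer(A, B, t_star, rtol=1e-3):
    # Top eigenvector of t A^T A + (1-t) B^T B at the optimal t.
    M = t_star * (A.T @ A) + (1 - t_star) * (B.T @ B)
    w, V = np.linalg.eigh(M)
    v = V[:, -1]  # eigenvector of the largest eigenvalue
    # If the top eigenvalue is (nearly) multiple, only an eigenvector with
    # ||Av|| / ||Bv|| = (1 - t) / t is a genuine maximizer, so check the ratio.
    ratio = np.linalg.norm(A @ v) / np.linalg.norm(B @ v)
    ok = np.isclose(ratio, (1 - t_star) / t_star, rtol=rtol)
    return v, ok
```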
