Norm $\|x\|$ defined as the sum of the $r$ largest absolute values: how to write $\min \|A x-b\|_{2}^{2}+\|x\|$ as a QP

Tags: convex optimization, nonlinear optimization, quadratic programming

This is a question from Boyd and Vandenberghe's Convex Optimization, Additional Exercise 5.31.

In this problem, $r$ is an integer between 1 and $n$, and $\|x\|$ denotes the norm
$$
\|x\|=\max _{1 \leq i_{1}<\cdots<i_{r} \leq n}\left(\left|x_{i_{1}}\right|+\cdots+\left|x_{i_{r}}\right|\right)
$$

on $\mathbf{R}^{n}$ ($\|x\|$ is the sum of the $r$ largest absolute values of the entries of $x$). For $r=1$, this is the Chebyshev norm $\|x\|_{\infty}=\max _{i}\left|x_{i}\right|$; for $r=n$, it is the 1-norm $\|x\|_{1}=\sum_{k}\left|x_{k}\right|$.

(a) Explain why $\|x\|$ is the optimal value of the optimization problem
$$
\begin{array}{ll}
\operatorname{maximize} & x^{T} y \\
\text { subject to } & \|y\|_{\infty} \leq 1 \\
& \|y\|_{1} \leq r
\end{array}
$$

The variable in this problem is an $n$-vector $y$.
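
(As a quick numerical sanity check of the claim in (a), here is a minimal sketch, assuming numpy and cvxpy are available; the LP's optimal value should match the sum of the $r$ largest absolute entries computed directly.)

```python
# Check (a): the optimal value of  max x^T y  s.t. ||y||_inf <= 1, ||y||_1 <= r
# equals the sum of the r largest absolute entries of x.
import numpy as np
import cvxpy as cp

np.random.seed(0)
n, r = 8, 3
x = np.random.randn(n)

y = cp.Variable(n)
lp = cp.Problem(cp.Maximize(x @ y),
                [cp.norm(y, "inf") <= 1, cp.norm(y, 1) <= r])
lp.solve()

direct = np.sort(np.abs(x))[-r:].sum()   # sum of the r largest |x_i|
print(lp.value, direct)                  # the two values should agree
```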

(b) From part (a), $-\|x\|$ is the optimal value of the convex optimization problem
$$
\begin{array}{ll}
\operatorname{minimize} & f(y) \\
\text { subject to } & \|y\|_{1} \leq r
\end{array}
$$

where $\operatorname{dom} f=\left\{y \mid\|y\|_{\infty} \leq 1\right\}$ and $f(y)=-x^{T} y$ for $y \in \operatorname{dom} f$.
Derive the dual of this problem (exactly as it is stated, i.e., treating the constraint $\|y\|_{\infty} \leq 1$ as an implicit constraint).

(c) Suppose $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^{m}$. Use your result in part (b) to formulate the problem
$$
\text { minimize }\|A x-b\|_{2}^{2}+\|x\|
$$

as a quadratic program.
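
(For testing a candidate QP reformulation of (c), a reference value can be obtained by modeling the problem directly; a minimal sketch, assuming cvxpy is available and using its sum_largest atom rather than the QP form the exercise asks for.)

```python
# Reference value for (c): minimize ||Ax - b||_2^2 + ||x||, with ||x|| modeled
# directly via cvxpy's sum_largest atom (not the requested QP reformulation).
import numpy as np
import cvxpy as cp

np.random.seed(0)
m, n, r = 10, 5, 2
A, b = np.random.randn(m, n), np.random.randn(m)

x = cp.Variable(n)
objective = cp.sum_squares(A @ x - b) + cp.sum_largest(cp.abs(x), r)
ref = cp.Problem(cp.Minimize(objective))
ref.solve()
print(ref.value)   # a correct QP reformulation should reproduce this value
```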

I have worked out (a). For (b), my answer is
$$L(\lambda, y) = -x^Ty+\lambda(\|y\|_1-r)$$
\begin{align}
g(\lambda)&=\inf_{\|y\|_\infty\leq 1}L(\lambda, y)\\
&=-\lambda r + \sum_{i=1}^n \inf_{|y_i|\leq 1}\left(\lambda |y_i|-x_i y_i\right)\\
&=-\lambda r + \sum_{i=1}^n (\lambda-|x_i|)\,\mathbf{1}_{|x_i|\geq \lambda}
\end{align}

Suppose there are $k$ entries of $x$ whose absolute values are larger than $\lambda$; then
$$g(\lambda)=(k-r)\lambda-\|x\|_{(k)},$$
where $\|x\|_{(k)}$ denotes the sum of the $k$ largest absolute values of the entries of $x$. When $k>r$, letting $\lambda\rightarrow \infty$ gives $g(\lambda)\rightarrow \infty$. When $k\leq r$, $\max_{\lambda\geq 0} g(\lambda)$ is equivalent to $-\|x\|_{(k)}$, where $k\leq r$ and the $k$ largest absolute values of $x$ should be larger than $\lambda$. My intuition tells me that the optimal value of the dual problem should just be $-\|x\|$, but I don't know how to prove this cleanly.
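
For reference, using $(\lambda-|x_i|)\,\mathbf{1}_{|x_i|\geq \lambda}=\min(0,\lambda-|x_i|)=-\max(0,|x_i|-\lambda)$, I think the dual problem that (b) asks for can be written explicitly as
$$
\begin{array}{ll}
\operatorname{maximize} & -r\lambda-\displaystyle\sum_{i=1}^n \max\left(0, |x_i|-\lambda\right) \\
\text { subject to } & \lambda \geq 0,
\end{array}
$$
with variable $\lambda\in\mathbf{R}$; what I cannot show is that its optimal value is exactly $-\|x\|$.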

And for (c), I don't know how to use the result from (b) to write the problem as a QP. First, mixing the outer minimization with the inner maximization seems contradictory. Second, there is an $x^T y$ term in the objective; how can I deal with this term?

Best Answer

Partial answer, showing that the optimal value of the dual in (b) is $-\|x\|$:

We have $g(\lambda) = \sum_k m_k(\lambda) - \lambda r$, where $m_k(\lambda) = \min(0, \lambda-|x_k|)$. Note that $g$ is concave and attains a maximum on $[0,\infty)$. At a maximum we must have $0 \in \partial g(\lambda)$, where $\partial g(\lambda)$ is the superdifferential (the set of supergradients). We see that $\partial g(\lambda) = \sum_k \partial m_k(\lambda) - \{r \}$. A little computation gives $\partial m_k(\lambda) = \begin{cases} \{1\},& \lambda < |x_k|, \\ [0,1],& \lambda = |x_k|, \\ \{0\},& \lambda > |x_k|.\end{cases}$

Hence $\partial g(\lambda) = \left[\,n_<(\lambda),\; n_<(\lambda)+n_=(\lambda)\,\right] - \{r \}$, where, for a relation $R$, $I_R(\lambda) = \{ k \mid \lambda \mathrel{R} |x_k| \}$ and $n_R(\lambda) = |I_R(\lambda)|$. So the optimality condition $0 \in \partial g(\lambda)$ reads $n_<(\lambda) \le r \le n_<(\lambda) + n_=(\lambda)$.

Let $J$ be an $(r-n_<(\lambda))$-element subset of $I_=(\lambda)$, and let $I = I_<(\lambda) \cup J$. Since the terms with $k \in J$ vanish ($|x_k| = \lambda$ there) and $|I| = r$, we have $g(\lambda) = \sum_{k \in I} (\lambda - |x_k|) - r \lambda = - \sum_{k \in I} |x_k|$.

Now note that $|x_k| \ge \lambda \ge |x_j|$ for all $k \in I$ and $j \notin I$, and $|I| = r$, so $I$ indexes $r$ largest absolute values of $x$. Hence $g(\lambda ) = - \| x\|$.
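
(Not part of the argument above, but a quick numerical sanity check of the conclusion; a minimal numpy sketch, with randomly generated test data and a grid search over $\lambda$ purely for illustration.)

```python
# Check: max over lambda >= 0 of  g(lambda) = sum_k min(0, lambda - |x_k|) - r*lambda
# equals -||x||, i.e. minus the sum of the r largest |x_k|.
import numpy as np

np.random.seed(0)
n, r = 8, 3
x = np.random.randn(n)
a = np.abs(x)

def g(lam):
    return np.minimum(0.0, lam - a).sum() - r * lam

abs_sorted = np.sort(a)
norm_x = abs_sorted[-r:].sum()   # ||x||: sum of the r largest |x_k|
lam_star = abs_sorted[-r]        # the r-th largest |x_k|, a maximizer per the argument above

grid = np.linspace(0.0, a.max(), 100001)
grid_vals = np.minimum(0.0, grid[:, None] - a[None, :]).sum(axis=1) - r * grid

print(grid_vals.max())           # approximate maximum of g over the grid
print(g(lam_star), -norm_x)      # both should equal -||x|| (up to rounding)
```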
