Compressed sensing and optimality of dual certificate

convex optimization, statistics

I am going through a survey by Emmanuel Candès on compressed sensing [1]. The setup behind compressed sensing can be described as follows:

Consider the optimization problem in $\mathbb{R}^d$:

$$\min \|x\|_1$$
$$\text{subject to: } Ax = y$$
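For concreteness, here is a minimal numerical sketch of this $\ell_1$ problem using cvxpy. The Gaussian $A$, the planted sparse signal, and the dimensions are my own illustrative assumptions, not from [1]:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 100                            # illustrative sizes
A = rng.standard_normal((n, d))
x_true = np.zeros(d)
x_true[:5] = rng.standard_normal(5)       # a planted 5-sparse signal
y = A @ x_true

# min ||x||_1  subject to  Ax = y
x = cp.Variable(d)
prob = cp.Problem(cp.Minimize(cp.norm(x, 1)), [A @ x == y])
prob.solve()
print("recovery error:", np.linalg.norm(x.value - x_true))  # typically near 0
```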

Let $x^*$ be a feasible point for this problem which is sparse, with nonzero entries indexed by a set $T$; we would like to show that $x^*$ is optimal. To do so, [1] introduces an "ansatz" optimization problem at the beginning of Section 4:

$$\min \|v\|_2 $$
$$\text{subject to: } v \perp \text{null}(A) \text{ and } P_Tv = \text{sign}(x^*)$$

where, abusing notation slightly, $T$ also denotes the subspace of vectors with the same support as $x^*$, and $A$ is a linear map such that $\text{null}(A) \cap T = \{0\}$.

$P_Lu$ is the projection of $u$ onto the subspace $L$. $\text{sign}(x)_i = x_i/|x_i|$ if $x_i \neq 0$ and $0$ otherwise.
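As a sanity check, the ansatz problem itself can be solved numerically; below is my own sketch in cvxpy (the $A$, $T$, and sign pattern are made up). The constraint $v \perp \text{null}(A)$ is encoded by writing $v = A^T\lambda$ for a free variable $\lambda$; the printed quantity is the one discussed next.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 40, 100, 5                      # illustrative sizes
A = rng.standard_normal((n, d))
T = np.sort(rng.choice(d, size=k, replace=False))
s = np.sign(rng.standard_normal(k))       # stands in for sign(x*) on T

# v perp null(A)  <=>  v = A^T lam for some lam
lam = cp.Variable(n)
v = A.T @ lam
prob = cp.Problem(cp.Minimize(cp.norm(v, 2)), [A[:, T].T @ lam == s])
prob.solve()

v_val = A.T @ lam.value
off = np.setdiff1d(np.arange(d), T)
print("||P_{T^perp} v||_inf =", np.abs(v_val[off]).max())
```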

[1] mentions that if there exists a feasible $v$ for the ansatz problem which also satisfies $\|P_{T^\perp} v\|_\infty < 1$, then $x^*$ is optimal for the original optimization problem, but I think we don't need a strict inequality here. ($T^\perp$ is the orthogonal complement of $T$, i.e., the vectors supported on indices outside the support of $x^*$.)
We can show this by applying the KKT conditions to check the optimality of $x^*$:

$$\mathcal{L}(x, \lambda) = \|x\|_1 + \lambda^T(Ax - y)$$

$$0 \in \partial \|x\|_1 + A^T\lambda$$
So $x^*$ is optimal if there exists a $u \in \text{range}(A^T)$ (equivalently, $u \perp \text{null}(A)$) that is also an element of the subdifferential of $\|\cdot\|_1$ at $x^*$. These are exactly the constraints of the ansatz problem, together with $\|P_{T^\perp} v\|_\infty \leq 1$ (note the inequality rather than strict inequality).
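For reference, the subdifferential of the $\ell_1$ norm at $x^*$ has the explicit description

$$\partial \|x^*\|_1 = \{u \in \mathbb{R}^d : P_T u = \text{sign}(x^*),\ \|P_{T^\perp} u\|_\infty \leq 1\},$$

which is exactly where the two conditions above come from.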

The paper mentions that this ansatz problem is introduced because it can be studied analytically and "by minimizing its Euclidean norm we hope to make its dual norm small as well". But the dual of the Euclidean norm is the Euclidean norm itself (not the $\ell_1$ or $\ell_\infty$ norm), so I'm unsure what they are really after.

Why is this ansatz problem even introduced? In particular, why minimize the $\ell_2$ norm at all? It seems we just want a feasible point.

EDIT: Reading on, this problem is introduced in order to compute some feasible point in closed form. Such a point is amenable to the analysis that follows, showing that such a certificate exists with high probability.
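To spell out one way to get that closed form (my own sketch, not code from [1]; the Gaussian $A$ and dimensions are assumptions): writing $v = A^T\lambda$ turns the ansatz into an equality-constrained least-squares problem in $\lambda$, whose KKT conditions form a linear system. In numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 40, 100, 5                   # illustrative sizes
A = rng.standard_normal((n, d)) / np.sqrt(n)

# Plant a k-sparse x* supported on T and record its sign pattern.
T = np.sort(rng.choice(d, size=k, replace=False))
x_star = np.zeros(d)
x_star[T] = rng.standard_normal(k)
s = np.sign(x_star[T])

# Ansatz: minimize ||A^T lam||_2 subject to (A^T lam)_T = s.
# Stationarity gives A A^T lam + A_T mu = 0 together with A_T^T lam = s,
# i.e. a symmetric linear (KKT) system in (lam, mu).
A_T = A[:, T]
K = np.block([[A @ A.T, A_T],
              [A_T.T, np.zeros((k, k))]])
rhs = np.concatenate([np.zeros(n), s])
lam = np.linalg.solve(K, rhs)[:n]
v = A.T @ lam                          # minimum-norm feasible point

assert np.allclose(v[T], s)            # P_T v = sign(x*)
off = np.setdiff1d(np.arange(d), T)
print("||P_{T^perp} v||_inf =", np.abs(v[off]).max())   # certificate if < 1
```

The point, as noted above, is that $v$ is now an explicit function of $A$ and $\text{sign}(x^*)$, which is what makes the high-probability analysis tractable.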

[1] Candès, Emmanuel J. "Mathematics of sparsity (and a few other things)." Proceedings of the International Congress of Mathematicians, Seoul, 2014.

Best Answer

Let $v$ be a solution to the ansatz problem in the original post.

It is true that $\|P_{T^\perp} v\|_\infty \leq 1$ is sufficient to show that $x^*$ is a solution to the $\ell^1$ optimization problem. Nonetheless, this may not be satisfactory: there could be other solutions to the problem that have nothing to do with $x^*$.

In contrast, $\|P_{T^\perp} v\|_\infty < 1$ implies that $x^*$ is moreover the unique solution of the $\ell^1$ optimization problem, which fixes the above issue. A proof is given on page 9-38 of this set of notes.
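For completeness, here is a sketch of the standard argument (using the assumption from the question that $\text{null}(A) \cap T = \{0\}$). Let $x = x^* + h$ be any feasible point, so $h \in \text{null}(A)$. The subgradient inequality for the $\ell^1$ norm gives

$$\|x^* + h\|_1 \geq \|x^*\|_1 + \langle \text{sign}(x^*), P_T h \rangle + \|P_{T^\perp} h\|_1.$$

Since $v \perp \text{null}(A)$, we have $0 = \langle v, h \rangle = \langle \text{sign}(x^*), P_T h \rangle + \langle P_{T^\perp} v, P_{T^\perp} h \rangle$, hence $\langle \text{sign}(x^*), P_T h \rangle \geq -\|P_{T^\perp} v\|_\infty \|P_{T^\perp} h\|_1$ and

$$\|x^* + h\|_1 \geq \|x^*\|_1 + \left(1 - \|P_{T^\perp} v\|_\infty\right) \|P_{T^\perp} h\|_1.$$

If $\|P_{T^\perp} v\|_\infty < 1$, any minimizer must have $P_{T^\perp} h = 0$, so $h \in \text{null}(A) \cap T = \{0\}$, proving uniqueness.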
