[Math] Finding convex conjugate of a bounded function

convex optimizationconvex-analysis

The convex conjugate of a function $f:\mathcal{X}\mapsto \mathbb{R}$ is formally defined as $$f^\star\left(y\right)=\sup_{x\in\mathcal{X}}\ \left\langle x,y\right\rangle-f\left(x\right).$$

In cases where $f\left(\cdot\right)$ is bounded, the supremum can become unbounded (e.g., $\mathcal{X}=\mathbb{R}^n$). To avoid this problem, $x$ is further constrained to a bounded set as I've read in textbooks and papers.

My question is that if there's a principle on how to bound $x$, because different constraints result in different convex conjugate functions. For instance, to find convex conjugate of $f\left(x\right)=\left\Vert x\right\Vert_0=\sum_{i=1}^n 1(x_i\neq 0)$ with $\mathcal{X}=\mathbb{R}^n$, it does make a difference to choose $\left\Vert x\right\Vert_{+\infty}\leq 1$ or $\left\Vert x\right\Vert_2 \leq 1$ as the bounding constraint.

Best Answer

Now that I understand the context better I am going to promote my own comments to an answer.

There is no general principle for artificially restricting the domain of a function when computing its convex conjugate. In fact, you should not do so unless you have a compelling and mathematically sound reason in your specific application. The domain of the conjugate function---that is, the set of $y$ for which the supremum is finite---is an important part of the conjugate itself, and should not be discarded. For example, when computing the dual of a convex optimization problem with an objective $f(x)$, the domain information imposes implicit constraints on the dual variables. If you remove that information, the dual problem is incorrect; it no longer provides bounds for the primal problem.

So what's going on in Chapter 5 of Ms. Fazel's thesis, or in the case of the so-called $\ell_0$ norm? Note that the interest is not in the conjugate of these functions, but rather in their convex envelopes. It just so happens that the convex envelope of a function can computed using the conjugate of the conjugate. Unfortunately, the convex envelopes of $f(x)=\mathop{\textbf{card}}(x)$ or $g(X)=\mathop{\textbf{rank}}(X)$ are not very interesting---in fact, I think they are identically zero. On other hand, the modified, extended-valued functions $$\bar{f}(x) = \begin{cases} \mathop{\textbf{card}}(x) & \|x\|_\infty \leq 1 \\ +\infty &\|x\|_\infty > 1 \end{cases}, \qquad \bar{g}(X) = \begin{cases} \mathop{\textbf{rank}}(x) & \|X\|_2 \leq 1 \\ +\infty &\|X\|_2 > 1 \end{cases}$$ have non-trivial convex envelopes.

As for why one would want these convex envelopes, it is because they provide some theoretical justification for why $\bar{f}^{**}(x)=\|x\|_1$ and $\bar{g}^{**}(X)=\|X\|_*$ are effective convex proxies for their non-convex counterparts. As you know, Ms. Fazel's thesis is entitled Matrix Rank Minimization with Applications, and it makes heavy use of trace minimization and nuclear norm minimization to find low-rank matrices that satisfy the modeling conditions. Section 5.1 is devoted to providing justifications for the convex heuristics that she uses throughout the work.

Added to clarify: you have expressed an interest in a comment above in finding the "tightest" convex envelope of a function. It's very important to note that if you truly want the tightest convex envelope for the entire function, you cannot restrict the domain in any way before taking the double conjugate. When you impose a domain restriction, you are computing the convex envelope of a different function. It will no longer serve as a lower bound for the original function. It will, however, be a tighter envelope over the domain you have selected.

You can see this for yourself: look at $f(x)=\mathop{\textbf{card}}(x)$ and $g(x)=\|x\|_1$. Clearly, there are values of $x$ for which $f(x)<g(x)$. So $g$ cannot be the convex envelope for $f$. It is, however, the convex envelope if you restrict the domain of $f$ to $\{x\,|\,\|x\|_\infty \leq 1\}$ (as we did with $\bar{f}$ above).

Related Question