Fenchel Conjugate of the Hinge Loss

analysisconvex optimizationconvex-analysisoptimization

Let $\ell_H: \mathbb R \rightarrow \mathbb R_{\infty}, x\mapsto \max\{0, 1-x\}$, which is the so-called "hinge loss function" (which is important in the context of machine learning and SVM's, but that's not what I want to talk about here).

In an exercise, we're supposed to calculate the so-called "Fenchel conjugate" of this function, which was defined in our lecture notes as follows:

Definition: Let $J: X\rightarrow\mathbb R_{\infty}$ be proper. The convex conjugate (or Fenchel conjugate) $J^{\star}$ is defined as $$J^{\star}(p) = \sup_{x\in X}\left(\left\langle p | x\right\rangle – J(x)\right).$$

One thing missing in the Def. is that $J^{\star}: X^{\star} \rightarrow \mathbb R_{\infty}$, so $J^{\star}$ "operates" on the dual space of $X$.. The solution is supposed to be $$\ell_{H}^{\star}(p) = \begin{cases} p, \quad p\in \left[-1, 0\right], \\ \infty, \ \ \text{otherwise} \end{cases},$$ but I don't obtain this (yet). Here is my approach:

$$\ell^{\star}_{H}(p) = \sup_{x\in \mathbb R}\{ \langle p | x \rangle- \ell_{H}(x) \} = \sup_{x\in \mathbb R}\{ \langle p | x \rangle- \max\{0, 1-x\}\}.$$
To the best of my understanding, $\langle p| x \rangle = p(x)$, since $p\in \mathbb R^{\star}$, and since we can write every element $p$ of $\mathbb R^{\star}$ in the form $p(x) = mx$, where $m\in \mathbb{R}$, we get:

$$\ell^{\star}_{H}(p) = \sup_{x\in \mathbb R}\{ \langle p | x \rangle- \ell_{H}(x) \} = \sup_{x\in \mathbb R}\{ mx – \max\{0, 1-x\}\}.$$

For $x\geq 1$, we have $\max\{0, 1-x\} \geq 0$, and thus $mx – \max\{0, 1-x\} = mx – (1 – x) = (m+1)x – 1$. Thus, for $x\geq 1$, the $\sup$ yields $\infty$, and for $x < 0$, it should yield the same (because then we only have $mx – 0 = mx$).

Apparently, my thinking is wrong, which leads me to the following questions:

(1) What does it (in the solution from above) mean that $p\in [-1, 0]$? After all, $p$ is an element of the dual space $\mathbb R^{\star}$. Maybe it means that $p(x) \in [-1, 0]$?

(2) Why does the Fenchel conjugate yield $p$ for $p\in [-1, 0]$?

Best Answer

Your argument starts right, but then you make a wrong turn. You should start with some fixed $m$ and then check where the function $h(x) = mx - \max(0,1-x)$ has its supremum and when you check, you consider the cases $1-x>0$ and $<0$. If $m>0$, then the function $h$ is unbounded from above (for $x\to\infty$). if $m\leq 0$, then it depends on how negative $m$ is: For $m<-1$ the function $h$ is again unbounded from above (this time for $x\to-\infty$). Finally, check the case $-1\leq m\leq 0$ and note that $h(x)\leq 0$ with $h(0)=0$. All this together shows that the conjugate looks as claimed.

Related Question