Lengendre-Fenchel transform of infimal convolution

Wikipedia states the following interesting fact about the Lengendre-Fenchel transform:

Define the infimal convolution of two functions as
$$
(f\square g)(x) = \inf_y\big\{f(x-y) + g(y)\big\}
$$

Then, with certain caveats listed below, the Legendre-Fenchel transform ${}^*$ has the following property:
$$
(f_1\square\dots\square f_n)^*(x^*) = f_1^*(x^*) + \dots + f_n^*(x^*).
$$

I worked my way through this and it's not hard to see why it's true. I'll give my sketch proof below.

However, Wikipedia states that it's true only if the functions $f_1, \dots, f_n$ are proper convex functions and lower semicontinuous, and states also that the right hand side might not be a proper convex function.

What I want to understand is:

Why these are the exact conditions for the theorem, and what happens if the functions fail to meet them?
What conditions are needed to go the other way? That is, what conditions do functions $f_1, \dots, f_n$ need to satisfy in order to guarantee that $(f_1 + \dots + f_n)^* = f_1^* \square \dots \square f_n^*$?

I don't have access to the book Wikipedia cites, so I'm wondering if
someone can post (or point me toward) a more detailed proof than my
sketch below, which shows exactly where those assumptions are used.

Here's my sketch of the proof, for two functions. We have

$$
(f\square g)^*(x) = \sup_x \Big\{\langle
x^*,x \rangle – \inf_y\big\{ f(x-y) + g(y) \big\} \Big\} \\
= \sup_{x,y}\big\{ \langle
x^*,x \rangle – f(x-y) – g(y) \big\}.
$$
Let $k=x-y$. Then
$$
(f\square g)^*(x) = \sup_{k,y}\big\{ \langle
x^*,k+y \rangle – f(k) – g(y) \big\} \\
= \sup_{k}\big\{ \langle
x^*,k \rangle – f(k) \big\} + \sup_{y}\big\{ \langle
x^*,y \rangle – f(y) \big\}\\
= f^*(x^*) + g^*(x^*).
$$

Best Answer

This theorem has been well proved in Rockafaller's book ``Convex Analysis'' (Theorem 16.4). In his theorem, it does not need $f_1,\dots,f_m$ to be closed (lower-semicontinuous). And the infimal convolution is defined on proper convex functions, thus we need $f_1,\dots,f_m$ to be proper.

In the other way, i.e., to prove $(f_1+\dots+f_m)^*=f_1^* \square \dots \square f_m^*$. We need one more condition: the relative interior of effective domain, $\operatorname{ri}(\operatorname{dom}f_i),i=1,\dots,m$, should have a point in common.

I am very glad to copy his theorem (Theorem 16.4 in his book) here.

Let $f_1,\dots,f_m$ be proper convex functions on $\mathbb{R}^n$, Then \begin{equation} \begin{aligned} \left(f_{1} \square \cdots \square f_{m}\right)^{*}, &=f_{1}^{*}+\cdots+f_{m}^{*} \\\left(\mathrm{cl} f_{1}+\cdots+\mathrm{cl} f_{m}\right)^{*} &=\mathrm{cl}\left(f_{1}^{*} \square \cdots \square f_{m}^{*}\right). \end{aligned} \end{equation} If the sets $\operatorname{ri}(\operatorname{dom}f_i),i=1,\dots,m$, have a point in common, the closure operation can be omitted from the second formula, and \begin{equation*} \left(f_{1}+\cdots+f_{m}\right)^{*}(\left.x^{*}\right) =\inf \left\{f_{1}^{*}\left(x_{1}^{*}\right)+\cdots+f_{m}^{*}\left(x_{m}^{*}\right) | x_{1}^{*}+\cdots+x_{m}^{*}=x^{*}\right\} , \end{equation*} where for each $x^*$ the infimum is attained.

Best Answer

Related Solutions

Fenchel Conjugate of the Hinge Loss

Related Question