Wikipedia states the following interesting fact about the Lengendre-Fenchel transform:
Define the infimal convolution of two functions as
$$
(f\square g)(x) = \inf_y\big\{f(x-y) + g(y)\big\}
$$
Then, with certain caveats listed below, the Legendre-Fenchel transform ${}^*$ has the following property:
$$
(f_1\square\dots\square f_n)^*(x^*) = f_1^*(x^*) + \dots + f_n^*(x^*).
$$
I worked my way through this and it's not hard to see why it's true. I'll give my sketch proof below.
However, Wikipedia states that it's true only if the functions $f_1, \dots, f_n$ are proper convex functions and lower semicontinuous, and states also that the right hand side might not be a proper convex function.
What I want to understand is:
-
Why these are the exact conditions for the theorem, and what happens if the functions fail to meet them?
-
What conditions are needed to go the other way? That is, what conditions do functions $f_1, \dots, f_n$ need to satisfy in order to guarantee that $(f_1 + \dots + f_n)^* = f_1^* \square \dots \square f_n^*$?
I don't have access to the book Wikipedia cites, so I'm wondering if
someone can post (or point me toward) a more detailed proof than my
sketch below, which shows exactly where those assumptions are used.
Here's my sketch of the proof, for two functions. We have
$$
(f\square g)^*(x) = \sup_x \Big\{\langle
x^*,x \rangle – \inf_y\big\{ f(x-y) + g(y) \big\} \Big\} \\
= \sup_{x,y}\big\{ \langle
x^*,x \rangle – f(x-y) – g(y) \big\}.
$$
Let $k=x-y$. Then
$$
(f\square g)^*(x) = \sup_{k,y}\big\{ \langle
x^*,k+y \rangle – f(k) – g(y) \big\} \\
= \sup_{k}\big\{ \langle
x^*,k \rangle – f(k) \big\} + \sup_{y}\big\{ \langle
x^*,y \rangle – f(y) \big\}\\
= f^*(x^*) + g^*(x^*).
$$
Best Answer
This theorem has been well proved in Rockafaller's book ``Convex Analysis'' (Theorem 16.4). In his theorem, it does not need $f_1,\dots,f_m$ to be closed (lower-semicontinuous). And the infimal convolution is defined on proper convex functions, thus we need $f_1,\dots,f_m$ to be proper.
In the other way, i.e., to prove $(f_1+\dots+f_m)^*=f_1^* \square \dots \square f_m^*$. We need one more condition: the relative interior of effective domain, $\operatorname{ri}(\operatorname{dom}f_i),i=1,\dots,m$, should have a point in common.
I am very glad to copy his theorem (Theorem 16.4 in his book) here.