The conjugate function of the infimum of a sum of functions

convex-optimization, convex-analysis, duality-theorems

Given convex functions $f_1, \ldots, f_k$, define $g(x)$ by:

$$g(x) = \inf_{x_1, \ldots, x_k} \left\{ f_1(x_1) + \ldots + f_k(x_k) | x_1 + \ldots + x_k = x \right\}$$

Find $g^*(x)$.


I found this similar question and followed similar steps (you can find them below):


By definition:
$$f^*(y) = \sup_{x \in \operatorname{dom} f} \left(y^T x - f(x)\right)$$

Then
$$g^*(y) = \sup_{x \in \operatorname{dom} g} \left(y^T x - \inf_{x_1, \ldots, x_k} \left\{ f_1(x_1) + \ldots + f_k(x_k) \mid x_1 + \ldots + x_k = x \right\}\right) = \ldots$$

Using the fact that
$$\sup_x \left(-f(x)\right) = -\inf_x f(x)$$

Then
$$\ldots = \sup_{x \in \operatorname{dom} g} \left(y^T x + \sup_{x_1, \ldots, x_k} \left\{ -f_1(x_1) - \ldots - f_k(x_k) \mid x_1 + \ldots + x_k = x \right\}\right) = \ldots$$

$$\ldots = \sup_{x \in \operatorname{dom} g,\; x_1, \ldots, x_k} \left\{ y^T x - f_1(x_1) - \ldots - f_k(x_k) \mid x_1 + \ldots + x_k = x \right\}$$


After these steps some questions arose:

  1. Why did the author remove the constraint $x_1 + \ldots + x_k = x$ in the solution?
  2. Why does
    $$\sup_{x, x_1, \ldots, x_k} \{f(x) \mid x_1 + \ldots + x_k = x \} = \sup_{x_1, \ldots, x_k} \{f(x) \mid x_1 + \ldots + x_k = x \}$$
    hold, and how can it be proved?

Best Answer

I'll just write down the full solution to your problem and explain all the steps in the derivation.

So, let $f_1,\ldots,f_k:\mathcal H \rightarrow (-\infty,+\infty]$ be functions (convex or not) on a Hilbert space $\mathcal H$. Define $g(x) := \inf_{x_1,\ldots,x_k}\left\{\sum_i f_i(x_i) \mid \sum_i x_i = x\right\}. $ The function $g$ is also called the infimal convolution of $f_1,\ldots,f_k$, written $g=\Box_{i=1}^k f_i$.
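
To make the definition concrete, here is a small worked example of my own (not part of the original question): take $k=2$ on $\mathcal H=\mathbb R$ with the quadratics $f_1(x)=\frac{a_1}{2}x^2$ and $f_2(x)=\frac{a_2}{2}x^2$, where $a_1,a_2>0$. Minimizing $\frac{a_1}{2}x_1^2+\frac{a_2}{2}(x-x_1)^2$ over $x_1$ gives $x_1=\frac{a_2}{a_1+a_2}x$, and hence

$$ (f_1\,\Box\, f_2)(x) = \inf_{x_1+x_2=x}\left\{\frac{a_1}{2}x_1^2+\frac{a_2}{2}x_2^2\right\} = \frac{1}{2}\,\frac{a_1a_2}{a_1+a_2}\,x^2. $$

We will return to this example after the main result to sanity-check the formula for $g^*$.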

Now, one computes $$ \begin{split} g^*(x) &:= \sup_y x^Ty - g(y) \overset{(a)}{=} \sup_{y}x^Ty - \inf_{x_1,\ldots,x_k}\left\{\sum_i f_i(x_i) \mid \sum_i x_i = y\right\}\\ &\overset{(b)}{=} \sup_{y}x^Ty + \sup_{x_1,\ldots,x_k}\left\{-\sum_i f_i(x_i) \mid \sum_i x_i = y\right\}\\ & \overset{(c)}{=} \sup_{y,x_1,\ldots,x_k}\left\{x^Ty -\sum_i f_i(x_i) \mid \sum_i x_i = y\right\} \\ &\overset{(d)}{=} \sup_{y,x_1,\ldots,x_k}x^T\sum_i x_i -\sum_i f_i(x_i) \overset{(e)}{=} \sup_{y,x_1,\ldots,x_k}\sum_i (x^Tx_i -f_i(x_i)) \overset{(f)}{=} \sum_i\sup_{x_i}x^Tx_i - f_i(x_i)\\ &\overset{(*)}{=} \sum_i f_i^*(x), \end{split} $$ where

  • (a) is just plugging in the definition of $g(y)$
  • (b) is because $-\inf_x h(x) = \sup_x \left(-h(x)\right)$
  • (c) is because $\sup_a u(a) + \sup_b v(b) = \sup_{a,b}\left(u(a) + v(b)\right)$
  • (d) is just substituting the constraint $y=\sum_i x_i$ to get $x^Ty = x^T\left(\sum_i x_i\right)$. After this substitution, the variable $y$ no longer plays a role in the maximization, and so can be deleted. Indeed, $\sup_{a,b}u(a) = \sup_a u(a)$; see the remark right after this list.
  • (e) is because $x^T\sum_i x_i - \sum_i f_i(x_i) = \sum_i x^Tx_i - \sum_i f_i(x_i)=\sum_i(x^Tx_i - f_i(x_i))$
  • (f) is because $\sup_{a,b}\left(u(a) + v(b)\right) = \sup_a u(a) + \sup_b v(b)$
  • (*) is because $f_i^*(x) := \sup_{x_i}\left(x^Tx_i - f_i(x_i)\right)$ by definition.
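
Regarding your two questions: both are resolved by step (d). Once the constraint $y=\sum_i x_i$ is substituted into the objective, $y$ is uniquely determined by $x_1,\ldots,x_k$ and no longer appears, so the supremum over it can simply be dropped. In symbols (with $h$ standing for any objective that does not depend on $y$),

$$ \sup_{y,\,x_1,\ldots,x_k}\left\{\, h(x_1,\ldots,x_k) \;\middle|\; y=\sum_i x_i \right\} = \sup_{x_1,\ldots,x_k} h(x_1,\ldots,x_k), $$

because every choice of $x_1,\ldots,x_k$ admits exactly one feasible $y$, and the objective does not depend on it.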

Therefore we have proven that

Convex conjugate of infimal convolution. $$ (\Box_{i=1}^kf_i)^*(x) = \sum_{i=1}^k f_i^*(x)\; \forall x $$
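
As a sanity check (using my quadratic example from above, not part of the original answer): for $f_i(x)=\frac{a_i}{2}x^2$ one has $f_i^*(x)=\frac{x^2}{2a_i}$, while the infimal convolution computed earlier is $g(x)=\frac{1}{2}\frac{a_1a_2}{a_1+a_2}x^2$, whose conjugate is

$$ g^*(x) = \frac{x^2}{2}\cdot\frac{a_1+a_2}{a_1a_2} = \frac{x^2}{2a_1}+\frac{x^2}{2a_2} = f_1^*(x)+f_2^*(x), $$

exactly as the identity above predicts.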
