Linear Algebra – Conjugate Function of Log-Sum-Exp Explained

Tags: convex-analysis, linear-algebra, proof-verification

In Boyd's Convex Optimization book, Example 3.25 finds the conjugate function $f^*(y):=\sup_{x\in\text{dom}(f)}(y^Tx-f(x))$ of the log-sum-exp function $f(x):=\log(\sum_{i=1}^ne^{x_i})$. First, the gradient of $y^Tx-f(x)$ is taken to yield the condition:

$$
y_i=\frac{e^{x_i}}{\sum_{j=1}^ne^{x_j}},\quad i=1,\dots,n
$$

where we see that a solution for $y$ exists if and only if $y\succ 0$ and $\textbf{1}^Ty=1$. Then the book simply says:

By substituting the expression for $y_i$ into $y^Tx-f(x)$ we obtain $f^*(y)=\sum_{i=1}^ny_i\log(y_i)$.

So far I've been unsuccessful in deriving this. How does one proceed? All I see is:

$$
y^Tx-f(x)=\sum_{i=1}^ny_ix_i-\log(\sum_{i=1}^ne^{x_i})=\frac{\sum_{i=1}^nx_ie^{x_i}}{\sum_{j=1}^ne^{x_j}}-\log(\sum_{i=1}^ne^{x_i})
$$

But from here on I do not know how to proceed.

Best Answer

It's perhaps easier to substitute the expression for $x_i$ in terms of $y_i$:

$$
y_i=\frac{e^{x_i}}{\sum_{j=1}^ne^{x_j}}=\frac{e^{x_i}}{e^{f(x)}} \Leftrightarrow x_i = \log y_i + f(x)
$$

Then, using the fact that $\textbf{1}^Ty=1$, the expression simplifies to:

$$
\begin{aligned}
y^Tx - f(x) &= \sum_{i=1}^n y_i x_i - f(x) \\
&= \sum_{i=1}^n y_i (\log y_i + f(x)) - f(x) \\
&= \sum_{i=1}^n y_i \log y_i + \sum_{i=1}^n y_i f(x) - f(x) \\
&= \sum_{i=1}^n y_i \log y_i
\end{aligned}
$$
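As a numerical sanity check of this derivation, here is a small Python sketch (the probability vector `y` and the helper names `logsumexp`/`objective` are mine, not from the book). It evaluates $y^Tx-f(x)$ at the stationary point $x_i=\log y_i$ and confirms it equals $\sum_i y_i\log y_i$, and that random perturbations of $x$ never exceed that value:

```python
import math
import random

def logsumexp(x):
    """Numerically stable log(sum(exp(x_i)))."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))

def objective(y, x):
    """y^T x - f(x), the quantity whose supremum over x defines f*(y)."""
    return sum(yi * xi for yi, xi in zip(y, x)) - logsumexp(x)

# hypothetical probability vector (y > 0, 1^T y = 1)
y = [0.2, 0.5, 0.3]

# stationary point from the derivation: x_i = log y_i (any constant shift works)
x_star = [math.log(yi) for yi in y]

neg_entropy = sum(yi * math.log(yi) for yi in y)
assert abs(objective(y, x_star) - neg_entropy) < 1e-12

# random perturbations of x_star never beat the stationary value,
# consistent with f*(y) = sum_i y_i log y_i being the supremum
random.seed(0)
for _ in range(1000):
    x = [xi + random.uniform(-1.0, 1.0) for xi in x_star]
    assert objective(y, x) <= objective(y, x_star) + 1e-12
```

Note that adding the same constant $c$ to every $x_i$ leaves the objective unchanged (since $\textbf{1}^Ty=1$, the $c$ in $y^Tx$ cancels the $c$ picked up by $f(x)$), which is why the maximizer is only determined up to a shift.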
