[Math] Why is this composition of concave and convex functions concave

convex optimizationconvex-analysisfunction-and-relation-composition

Please forgive my ignorance. I have a quick silly question about a statement given without proof in Convex Optimization by Boyd and Vandenberghe (page 87).

Suppose $\mathbb{R}_+^n$ is the set of vectors $v$ such that every coordinate of $v$ is real-valued and non-negative. Suppose $z \in \mathbb{R}_+^n$. Suppose that $z_i$ refers to the $i^{th}$ coordinate of $z$. Suppose $x$ is a scalar such that $x \geq 0$ and suppose that $p$ is a scalar such that $0 < p \leq 1$.

Consider the functions,

\begin{align}
i(z) & = \sum_{i=1}^{n} {z_i^p}
\\\\
j(x) & = x^{\frac{1}{p}}
\end{align}

Boyd and Vandenberghe claim without proof that the function,

$$h(z) = j(i(z))=\left( \sum_{i=1}^{n} {z_i^p} \right)^{\frac{1}{p}}$$

is a concave function of $z$. Why is this true?

I understand that $z_i^p$ is a non-decreasing concave function of $z_i$. Therefore $i(z)$ is a concave function of $z$. I also understand that $j(x)$ is a non-decreasing convex function of $x$.

However, this is where my understanding breaks down. Since $i(z)$ is concave and non-decreasing in each $z_i$, but $j(x)$ is convex, I don't know of any composition rule I can apply to arrive at the conclusion that $h(z)$ is concave. What am I missing?

Best Answer

The convex function $j$ of a concave function $i$ is not necessarily concave. For example, if $j$ is strictly convex and $i$ is a constant function, then $j\circ i$ is strictly convex.

In your case, the $p$-"norm" is concave when $p<1$ because the Hessian matrix is negative semidefinite. More specifically, let $S=\sum z_i^p$. Then $$ \frac{\partial^2 S^{1/p}}{\partial z_i \partial z_j} =(1-p)S^{1/p-2} \left( z_i^{p-1}z_j^{p-1} - S z_i^{p-2} \delta_{ij} \right). $$ So the Hessian matrix is given by $H=(1-p)S^{1/p-2} D(uu^T-SI)D$, where $u=(z_1^{p/2},\ldots,z_n^{p/2})^T$ and $D=\operatorname{diag}(z_1^{p/2-1},\ldots,z_n^{p/2-1})$. As the eigenvalues of the matrix $uu^T-SI$ are $0$ (simple eigenvalue) and $-S$ (with multiplicity $n-1$), $H$ is negative semidefinite.

Related Question