Real Analysis – Relationship Between Convexity and Superadditivity

convex-analysis, probability, real-analysis

This question is a little vague, so let me give some motivation. I was trying to prove the generalized Hölder's inequality for probability measures,
$$\mathbb{E}(X_1 \dots X_n) \leq \prod_{i=1}^n \|X_i\|_{p_i},$$
where $\sum_{i=1}^n (1/p_i) = 1$, the $X_i$ are random variables, and
$$\|X_i\|_{p_i} = \left(\int_\Omega |X_i|^{p_i}\,d\mathbb{P}\right)^{1/p_i}.$$
To prove this you can use the weighted arithmetic-geometric mean inequality,
$$\prod_{i=1}^n x_i^{p_i} \leq \sum_{i=1}^n p_i x_i,$$
where $x_i > 0$ and $\sum p_i = 1$.
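For concreteness, here is a quick numerical sanity check of that inequality (a Python sketch of my own; the sampling scheme is arbitrary):

```python
import random

def weighted_am_gm_holds(x, p, tol=1e-9):
    """Check prod(x_i**p_i) <= sum(p_i * x_i) for positive x_i and weights p_i summing to 1."""
    geometric = 1.0
    for xi, pi in zip(x, p):
        geometric *= xi ** pi
    arithmetic = sum(pi * xi for xi, pi in zip(x, p))
    return geometric <= arithmetic + tol

random.seed(0)
for _ in range(10_000):
    n = random.randint(2, 6)
    x = [random.uniform(0.01, 100.0) for _ in range(n)]
    raw = [random.random() for _ in range(n)]
    p = [r / sum(raw) for r in raw]          # weights summing to 1
    assert weighted_am_gm_holds(x, p)
print("weighted AM-GM held on all random samples")
```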
In working through all this I started investigating Jensen's inequality for convex functions. I've known convexity was important, but I've never grappled with it from an analytic perspective, so I don't really know the tricks / typical arguments. I tried to prove the AM-GM inequality by generalising the proof of Hölder's inequality for $(1/p) + (1/q) = 1$, which relies on Young's inequality,
$$ab \leq \frac{1}{p}a^p + \frac{1}{q}b^q,$$
whence Hölder's inequality follows from a clever choice of $a, b$. However, I don't see any way to extend the definition of convexity, which is that for any $\lambda \in [0,1]$ we require
$$f(\lambda x + (1-\lambda) y) \leq \lambda f(x) + (1-\lambda) f(y),$$
to something which would involve multiple $\lambda$s which sum to 1.
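As an aside, Young's inequality above is easy to spot-check numerically; here is a small Python sketch (the helper name is my own):

```python
import random

def young_holds(a, b, p):
    """Check ab <= a^p / p + b^q / q for the conjugate exponent q = p / (p - 1)."""
    q = p / (p - 1)
    rhs = a ** p / p + b ** q / q
    return a * b <= rhs * (1 + 1e-12) + 1e-12   # small tolerance for floating point

random.seed(1)
for _ in range(10_000):
    a, b = random.uniform(0.0, 50.0), random.uniform(0.0, 50.0)
    p = random.uniform(1.01, 10.0)
    assert young_holds(a, b, p)
print("Young's inequality held on all random samples")
```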
Another definition of convexity is the existence, at each $x$, of a vector $V \in \mathbb{R}^n$ such that
$$f(y) \geq f(x) + V\cdot(y-x)$$
for all $y \in \mathbb{R}^n$, or equivalently
$$f(x) = \sup_{z} L_z(x),$$
where the $L_z$ are the affine support functions of the convex function $f$. I understand all these definitions separately and more or less together, but it feels like I should be able to deduce something like the arithmetic-geometric mean inequality (which uses Jensen's inequality) straight from one of these inequalities. I was thinking maybe the exponential could satisfy something like this:
$$f(x+y) \geq f(x) + f(y),$$
so by induction and with $f(x) = e^x$,
$$f(a+b+\dots+z) \geq f(a)+f(b)+ \dots + f(z).$$
This is 'superadditivity' and kind of says $f$ keeps growing faster and faster. One soon discovers this property is false in general, even for the exponential (take $x = y = 0$). Does it hold for $x, y > 1$? (See the numeric check below.) What is the relationship between
$$f(x+y) \geq f(x) + f(y)$$
and
$$f\left(\frac{1}{2}x + \frac{1}{2}y\right) \leq \frac{1}{2}f(x) + \frac{1}{2}f(y)?$$
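One way to see where superadditivity of the exponential does kick in: dividing $e^{x+y} \geq e^x + e^y$ through by $e^{x+y}$ turns it into $e^{-x} + e^{-y} \leq 1$, which holds as soon as $x, y \geq \ln 2$, and in particular for $x, y > 1$. A small numeric illustration (Python; only a sketch, not a proof):

```python
import math

def exp_superadditive(x, y):
    """True when exp(x + y) >= exp(x) + exp(y)."""
    return math.exp(x + y) >= math.exp(x) + math.exp(y)

# exp(x+y) >= exp(x) + exp(y)  <=>  exp(-x) + exp(-y) <= 1,
# so the symmetric threshold is x = y = ln 2 ~ 0.693.
print(exp_superadditive(0.0, 0.0))   # False: 1 >= 2 fails
print(exp_superadditive(0.6, 0.6))   # False: just below the ln 2 threshold
print(exp_superadditive(0.7, 0.7))   # True: just above the threshold
print(exp_superadditive(1.0, 5.0))   # True
```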

I'd also like to understand Jensen's inequality a little better. I've heard it kind of generalises the triangle inequality, since $\int |f| \geq |\int f|$ is like $\sum |x_i| \geq |\sum x_i|$. This makes sense, and I have some examples of this working (insightful ones are appreciated), but I don't really see what it's saying.
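To make the analogy concrete, here is a tiny Monte Carlo illustration (a Python sketch) of $\varphi(\mathbb{E}X) \leq \mathbb{E}\varphi(X)$ for a few convex $\varphi$; taking $\varphi = |\cdot|$ is exactly the triangle-inequality case mentioned above:

```python
import math
import random

random.seed(2)
samples = [random.gauss(0.0, 1.0) for _ in range(200_000)]

def mean(values):
    return sum(values) / len(values)

# For each convex phi, compare phi(E[X]) with E[phi(X)] on the simulated sample.
for name, phi in [("abs", abs), ("square", lambda t: t * t), ("exp", math.exp)]:
    lhs = phi(mean(samples))                  # phi(E[X])
    rhs = mean([phi(s) for s in samples])     # E[phi(X)]
    print(f"{name:6s}: phi(E[X]) = {lhs:.4f} <= E[phi(X)] = {rhs:.4f}")
```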

Thanks for any help.


@Eric A convex function with $f(0) = 0$ is superadditive on $[0,\infty)$, i.e.
$$f(x+y) \geq f(x)+f(y), \quad x, y \geq 0.$$

First, note that if $f(0) = 0$ and $f$ is convex, i.e. $f(tx + (1-t)y) \leq t f(x) + (1-t)f(y)$, then:
\begin{align*}
f(tx) &= f(tx + (1-t)\cdot 0) \\
&\leq tf(x) + (1-t)f(0)\\
&= tf(x).
\end{align*}

Hence, for $a,b \geq 0$ with $a+b > 0$, we have $\frac{a}{a+b}\in [0,1]$ and likewise for $\frac{b}{a+b}$. Thus,
\begin{align*}
f(a) + f(b) &= f((a+b)\frac{a}{a+b}) + f((a+b)\frac{b}{a+b}) \\
&\leq \frac{a}{a+b}f(a+b) + \frac{b}{a+b}f(a+b) \\
&= f(a+b).
\end{align*}

Looks right to me?
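For what it's worth, here is a quick numeric spot-check of both steps, using $f(x) = e^x - 1$ as my choice of convex function with $f(0) = 0$ (a Python sketch, not part of the proof):

```python
import math
import random

def f(x):
    """A convex function with f(0) = 0, e.g. f(x) = exp(x) - 1."""
    return math.exp(x) - 1.0

random.seed(3)
tol = 1e-9
for _ in range(10_000):
    x = random.uniform(0.0, 10.0)
    y = random.uniform(0.0, 10.0)
    t = random.random()
    assert f(t * x) <= t * f(x) + tol          # step 1: f(tx) <= t f(x)
    assert f(x) + f(y) <= f(x + y) + tol       # conclusion: superadditivity
print("both steps held for f(x) = exp(x) - 1 on random nonnegative inputs")
```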

Best Answer

Convexity and superadditivity: If a function $f$ is convex, increasing, and $f(0)=0$, then $f$ is superadditive. To see this, draw the secant line through $(0,0)$ and $(x+y,f(x+y))$. This is the graph of a linear (additive!) function $\ell(x)$. By convexity, $f(x)\le \ell(x)$ and $f(y)\le \ell(y)$, so $f(x)+f(y)\le \ell(x)+\ell(y)=\ell(x+y)=f(x+y)$, which is the superadditivity.
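Here is a small numeric illustration of the secant-line argument, with $f(x) = x^2$ as my example (a Python sketch only):

```python
import random

def f(x):
    return x * x        # convex, increasing on [0, inf), with f(0) = 0 (my example)

random.seed(4)
for _ in range(10_000):
    x, y = random.uniform(0.01, 50.0), random.uniform(0.01, 50.0)
    slope = f(x + y) / (x + y)
    ell = lambda t: slope * t          # secant line through (0, 0) and (x + y, f(x + y))
    assert f(x) <= ell(x) + 1e-9 and f(y) <= ell(y) + 1e-9   # f lies below the secant on [0, x + y]
    assert f(x) + f(y) <= ell(x) + ell(y) + 1e-9             # ell(x) + ell(y) = ell(x + y) = f(x + y)
print("secant-line argument checked numerically for f(x) = x^2")
```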

Superadditivity does not imply convexity. To see this, take a convex function such as $f(x)=x^2$ and cut off a small piece of its graph: say, $\tilde f(x)= \max(x^2,2x-1+\epsilon)$ where $\epsilon >0$ is small. This is no longer strictly convex (the graph contains a line segment), but is still strictly superadditive (the line segment is too short to contain $x,y,x+y$). Now replace the line segment with a slightly concave curve, and you have a non-convex superadditive function.
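Here is one concrete numeric instantiation of this construction (my own choices of $\epsilon$ and of the size of the concave dent; a Python sketch, not a proof): the perturbed function fails midpoint convexity across the dent, yet no superadditivity violation shows up on a grid.

```python
import itertools

EPS, DELTA = 0.01, 0.5        # my (arbitrary) choices for the construction

def g(x):
    """max(x^2, 2x - 1 + EPS) with the flat segment replaced by a slightly concave arc."""
    bump = max(0.0, EPS - (x - 1.0) ** 2)          # positive only on the cut-off segment
    return max(x * x, 2.0 * x - 1.0 + EPS) + DELTA * bump

# Not convex: midpoint convexity fails across the concave arc.
a, b = 0.95, 1.05
print(g((a + b) / 2) > (g(a) + g(b)) / 2)          # True

# ...but still superadditive on a grid of nonnegative points.
grid = [k * 0.01 for k in range(0, 301)]           # 0.00, 0.01, ..., 3.00
violations = [(x, y) for x, y in itertools.product(grid, grid)
              if g(x + y) < g(x) + g(y) - 1e-12]
print("superadditivity violations on the grid:", len(violations))   # 0
```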

Generalized Hölder's inequality follows by induction from the ordinary one. Let $Y=X_1\cdots X_{n-1}$. Then
$$\mathbb E(YX_n)\le \| Y\|_q\,\|X_n\|_{p_n},\tag1$$
where $q=p_n/(p_n-1)$. Write $\tilde p_i=p_i/q$ and note that
$$\sum_{i=1}^{n-1} \frac{1}{\tilde p_i} = q\sum_{i=1}^{n-1}\frac{1}{p_i} = q\left(1-\frac{1}{p_n}\right)=1. \tag2$$
By the inductive hypothesis,
$$\| Y\|_q^q = \mathbb E(|Y|^q)\le \prod_{i=1}^{n-1}\bigl\||X_i|^q\bigr\|_{\tilde p_i} =\prod_{i=1}^{n-1}\|X_i \|_{p_i}^q, \tag3$$
which together with (1) completes the proof.
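As a sanity check on the statement being proved, here is a small Python sketch on a finite uniform probability space (the helper names and the way the exponents are sampled are my own):

```python
import math
import random

def lp_norm(values, p):
    """||X||_p on a uniform finite probability space."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)

def expectation_of_product(columns):
    """E[X_1 ... X_n] on the same uniform space; columns[i][w] = X_i(w)."""
    m = len(columns[0])
    return sum(math.prod(col[w] for col in columns) for w in range(m)) / m

random.seed(5)
for _ in range(1_000):
    n = random.randint(2, 4)
    m = 50                                        # uniform probability space with m points
    raw = [random.uniform(1.0, 5.0) for _ in range(n)]
    p = [sum(raw) / r for r in raw]               # exponents with sum(1/p_i) = 1, each p_i >= 1
    X = [[random.uniform(-2.0, 2.0) for _ in range(m)] for _ in range(n)]
    lhs = expectation_of_product(X)
    rhs = math.prod(lp_norm(X[i], p[i]) for i in range(n))
    assert lhs <= rhs + 1e-12
print("generalized Hölder held on all random discrete examples")
```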

Jensen's inequality: To me, it is a statement about the relation between convex functions and affine functions: each member of the former family is obtained by taking the supremum of a family of the latter. The proof of the inequality expresses this relation better than the inequality itself. Also, this quote from Gowers comes to mind:

One will not get anywhere in graph theory by sitting in an armchair and trying to understand graphs better.
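To illustrate the "supremum of affine functions" point numerically, here is a small Python sketch using $\varphi(x) = e^x$ (my choice): the supremum of the tangent lines over a fine grid of tangency points recovers $\varphi$ to within grid error.

```python
import math

def phi(x):
    return math.exp(x)               # a convex function to illustrate with

def tangent(z, x):
    """Affine support function of phi at z: phi(z) + phi'(z) * (x - z)."""
    return math.exp(z) + math.exp(z) * (x - z)

zs = [k * 0.01 for k in range(-500, 501)]        # tangency points on a fine grid
for x in [-2.0, -0.5, 0.0, 1.3, 3.0]:
    sup_of_tangents = max(tangent(z, x) for z in zs)
    print(f"x = {x:5.2f}:  sup of tangents = {sup_of_tangents:.6f},  phi(x) = {phi(x):.6f}")
```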
