[Math] Lipschitz Smoothness, Strong Convexity and the Hessian

convex-analysisfunctions

I'm working with the following two concepts:

  • Lipschitz Smoothness – a function $f$ is Lipschitz smooth with constant $L$ if its derivatives are Lipschitz continuous with constant $L$, in other words if for any $x$ and $y$, $$ \| \nabla f(x) – \nabla f(y) \| \leq L \| x – y \| $$
  • Strong Convexity – a function $f$ is $\alpha$-strongly convex if $$ \nabla^2 f(x) \succeq \alpha I $$ for all $x$, where $I$ is the identity matrix.

Here are my questions:

  1. I know that Lipschitz Smoothness implies that for any $x$ and $y$, it is true that $$ f(x + y) \leq f(x) + y^\top \nabla f(x) + \frac{L}{2} \| y \|^2 $$ Is the converse also true? ie: is it an "if and only if"?
  2. I read somewhere that $f$ is Lipschitz smooth if and only if $$ L I \succeq \nabla^2 f(x) $$ for all $x$, where $I$ is the identity matrix. How can I prove that?
  3. I read somewhere that $f$ is $\alpha$-strongly convex if and only if for any $x$ and $y$, it is true that $$ f(x+y) \leq f(x) + y^\top\nabla f(x) – \frac{\alpha}{2} \| x – y \| $$ How can I prove this, and is it and if-and-only-if?
  4. The previous two points seem to imply some relationship between these two concepts. Does it go any deeper?

I realize this is a bunch of questions – if anyone has a good reference on these topics, that'd be swell too…

Thanks

Best Answer

I assume that your question concerns convex functions only; without convexity much of it would be false.

Question 2: strictly speaking, being Lipschitz smooth ($C^{1,1}$) does not imply $\nabla^2 f$ exists. But the statement is true if we interpret $\nabla^2 f\preceq LI$ as holding almost everywhere. Indeed, $\nabla^2 f$ is a positive semidefinite matrix, so having $\nabla^2 f\preceq LI$ a.e. is equivalent to $\nabla^2 f\in L^\infty$. And it is well known that having $L^\infty$ derivative is equivalent to being Lipschitz; thus $$\nabla^2 f\in L^\infty \iff \nabla f\in C^{0,1} \iff f\in C^{1,1}$$

Question 3: You misremembered. The correct inequality characterizing $\alpha$-strong convexity is $$f(x+y) \ge f(x) + y^\top\nabla f(x) + \frac{\alpha}{2} \| x - y \|^2 \tag{1}$$ Indeed, (1) is equivalent to saying that the function $g(x)=f(x)-\frac{\alpha}{2} \| x \|^2$ is convex. The latter is equivalent to $\nabla^2 g\succeq 0$, which is $\nabla^2 f\succeq \alpha\, I$.

Question 4. Yes, there is a direct and important relation: a function is strongly convex if and only if its convex conjugate (a.k.a. Legendre-Fenchel transform) is Lipschitz smooth. Indeed, the gradients maps are inverses of each other, which implies that the Hessian of convex conjugate of $f$ is the inverse of the Hessian of $f$ (at an appropriate point). So, a uniform upper bound on $\nabla^2 f$ is equivalent to a uniform lower bound on $\nabla^2 (f^{*}) $, and vice versa. One can also argue without referring to the Hessian (which may fail to exist at some points): the Lipschitz smoothness of $f$, by your item 1, gives us at every $x_0$ a quadratic function $q$ so that $q(x_0)=f(x_0)$ and $f \le q$ everywhere. Taking convex conjugate reverses the order: $q^*\le f^*$; and this means that $f^*$ is strongly convex.

Question 1. The converse is true, but the only proof I see goes through the convex conjugate as described in Q4. Since strong convexity is characterized by the comparison property (1), taking the conjugate gives a matching characterization of Lipschitz smoothness.

Reference: Chapter 5 of Convex functions by Jonathan M. Borwein and Jon D. Vanderwerff.