[Math] Why does the Hessian matrix have to be positive semidefinite for convexity

convex optimizationconvex-analysishessian-matrixoptimizationpositive-semidefinite

I understand that for a function $f(x), f \in \mathbb{R}, x \in \mathbb{R}$ to be convex (has a minimum) the second derivative $\frac{\partial^2 f}{\partial x^2}$ has to be $\ge0$. But I don't see for the function $f(\mathbf{x}), f\in \mathbb{R}, \mathbf{x} \in \mathbb{R}^d$ how can the Hessian $\mathbf{H}$ matrix of $f$ being positive semidefinite guarantees a minimum.

Best Answer

A convex function doesn't have to be twice differentiable; in fact, it doesn't have to be differentiable even once. For instance, $f(x)=|x|$ is not differentiable at the origin, and that's its minimum! We can, however, say this: the Hessian of a convex function must have be positive semidefinite wherever it is defined.

Furthermore, a convex function doesn't have to have a minimum. Take, for instance, the function $f(x)=e^x$. It is bounded below, and it has an infimum; $\inf_x f(x) = 0$. But at no value of $x$ does it achieve this value, so it has no minimum. Note that it is differentiable everywhere, and its second derivative is strictly positive.

It's worse than this: a convex function, even one that has a positive definite Hessian, need not have an infimum. Take, for instance, the function $f:\mathbb{R}_{++}\rightarrow\mathbb{R}$, $f(x)=-\log x$. The second derivative $1/x^2$ is positive on the entire domain of the function, but it is not bounded.

What you can say is that any convex function that is bounded below has an infimum; but again, that alone doesn't guarantee a minimum can be attained. For that you need something more. One common sufficient, but not necessary, condition is strong convexity. The basic definition is $$f(tx+(1-t)y) \geq tf(x) + (1-t)f(y) - \tfrac{1}{2}m t(1-t)\|x-y\|^2$$ where $m>0$ is fixed; any norm can be used, but the Euclidean norm is common. Note that, again, this does not assume that $f$ is differentiable. If it is twice differentiable at a point, however, strong convexity implies $\nabla^2 f(x) \succeq m I$, or $f''(x)\geq m$ for the scalar case.