[Math] Optimality conditions – necessary vs sufficient


For a convex optimization problem, the first-order necessary condition says that at an optimum, the gradient is equal to zero.

$\nabla L(w^*) = 0$
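
As a quick illustration (my own toy example, not part of the original question): for $L(w) = (w - 3)^2$ the gradient is $\nabla L(w) = 2(w - 3)$, so the first-order condition $\nabla L(w^*) = 0$ gives $w^* = 3$.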

The second-order sufficient condition ensures that the optimum is a minimum (not a maximum or a saddle point) by requiring that the Hessian matrix, i.e. the matrix of second derivatives,

$H(w^*) := \dfrac{\partial^2 L(w^*)}{\partial w \, \partial w^T}$

is positive semi-definite. The Hessian is also related to the convexity of a function: a twice-differentiable function is convex if and only if its Hessian is positive semi-definite at all points.
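
A minimal numerical sketch of checking both conditions, under my own assumptions (a toy quadratic objective and NumPy; none of these names come from the question or answer):

```python
import numpy as np

# Toy convex objective: L(w) = w^T A w, with A symmetric positive semi-definite.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])

def gradient(w):
    # Analytic gradient of the quadratic form: 2 A w
    return 2 * A @ w

def hessian(w):
    # The Hessian of a quadratic form is constant: 2 A
    return 2 * A

w_star = np.zeros(2)                        # candidate optimum
print(gradient(w_star))                     # first-order condition: gradient is zero
eigenvalues = np.linalg.eigvalsh(hessian(w_star))
print(eigenvalues)                          # all eigenvalues >= 0, so the Hessian is PSD
```

Since the Hessian here is constant and positive semi-definite everywhere, the objective is convex and $w^* = 0$ is a global minimum.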

What is the difference between a necessary and a sufficient condition? The definition of necessary in this context is clear to me, since we need the gradient at the optimum to be equal to zero. What, then, is the sufficient condition about? Is the problem still convex if the sufficient condition is not satisfied?

Best Answer

A condition is necessary if it has to be true for some other statement to be true. In this case, the gradient has to be zero for the point $w^*$ to be a minimum. A condition is sufficient if that condition being true is enough to conclude that some other condition is true as well. In this case, if a point with gradient $0$ has a positive definite Hessian, it has to be a (local) minimum.

The gradient condition is only necessary, not sufficient, because a point with gradient $0$ could also be a maximum or a saddle point. We could say that a zero gradient is a sufficient condition for concluding that the point is a stationary point (an optimum or a saddle), though.
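
To make the saddle case concrete, here is a small sketch with my own toy function (not from the answer), again assuming NumPy:

```python
import numpy as np

# Classic saddle: f(w) = w1^2 - w2^2. The gradient vanishes at the origin,
# yet the origin is neither a minimum nor a maximum.
def grad(w):
    return np.array([2 * w[0], -2 * w[1]])

def hess(w):
    return np.array([[2.0, 0.0],
                     [0.0, -2.0]])

w0 = np.zeros(2)
print(grad(w0))                      # [0. 0.]  -> first-order condition holds
print(np.linalg.eigvalsh(hess(w0)))  # [-2.  2.] -> mixed signs: a saddle point
```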

For the Hessian condition, note that it is only sufficient if we also know that the point in question has gradient $0$. Knowing that the Hessian is positive semi-definite is a necessary condition for a point to be a minimum, but not a sufficient one: just think of any point other than the vertex on a convex parabola for a counterexample!
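
Spelling that counterexample out with my own numbers: for the parabola $f(w) = w^2$ at $w = 1$, the second derivative $f''(1) = 2 \ge 0$ is positive semi-definite, yet $f'(1) = 2 \neq 0$, so $w = 1$ is not a minimum.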
