[Math] If Lagrangian multipliers aren’t a sufficient condition, what is sufficient for optimality in constrained problems

lagrange-multiplier, optimization

According to Wikipedia, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a function subject to equality constraints. In particular, a typical problem asks us to maximize $f(x, y)$ subject to $g(x, y) = c$, and introduces the Lagrangian

$\Lambda (x,y,\lambda) = f(x,y) + \lambda (g(x,y)-c)$
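
For concreteness (this example is my own, not from the article): setting the partial derivatives of $\Lambda$ to zero recovers the constraint together with the gradient condition $\nabla f = -\lambda \nabla g$. For instance, maximizing $f(x,y) = xy$ subject to $g(x,y) = x + y = 2$ gives

$\frac{\partial \Lambda}{\partial x} = y + \lambda = 0, \qquad \frac{\partial \Lambda}{\partial y} = x + \lambda = 0, \qquad \frac{\partial \Lambda}{\partial \lambda} = x + y - 2 = 0,$

so the only stationary point is $x = y = 1$ with $\lambda = -1$.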

Here's the part I am interested in:

However, not all stationary points yield a solution of the original problem. Thus, the method of Lagrange multipliers yields a necessary condition for optimality in constrained problems.

My Question:

If Lagrangian multipliers aren't a sufficient condition, what is a sufficient condition for optimality in constrained problems?

Best Answer

Convenient conditions for checking that a given stationary point is optimal generally require that the objective function be convex (and, for constrained problems, that the feasible set be convex as well). Nonconvex optimization is a much more difficult subject, with many special cases for different problems. Two of the many reasons for this are that there can be stationary points that are not local minima and that there can be multiple local minima.
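
A minimal numerical sketch of the convex case (my own addition, not part of the original answer; the matrices are made up for illustration): for a strictly convex quadratic objective with one linear equality constraint, the Lagrange conditions reduce to a linear system, and because the problem is convex the unique stationary point is the global minimum.

```python
import numpy as np

# Sketch: minimize f(x) = 1/2 x^T Q x - c^T x  subject to  A x = b,
# with Q positive definite, so f is strictly convex.
Q = np.array([[2.0, 0.0],
              [0.0, 4.0]])        # positive definite -> strictly convex objective
c = np.array([1.0, 1.0])
A = np.array([[1.0, 1.0]])        # single equality constraint x1 + x2 = 1
b = np.array([1.0])

# Lagrange (KKT) conditions are linear here:
#   [Q  A^T] [x  ]   [c]
#   [A   0 ] [lam] = [b]
n, m = Q.shape[0], A.shape[0]
K = np.block([[Q, A.T],
              [A, np.zeros((m, m))]])
rhs = np.concatenate([c, b])
sol = np.linalg.solve(K, rhs)
x_star, lam_star = sol[:n], sol[n:]

print("stationary point:", x_star)   # global minimizer, by convexity
print("multiplier:", lam_star)
```

In this convex setting, no second-order check is needed: the stationary point found from the linear system is automatically the global minimum.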

On a simpler level, notice that the Lagrange multiplier condition is a first-order condition. You may recall the second-order condition from calculus, which checks the sign of the second derivative at a critical point. This condition is sufficient but not necessary: when the second derivative is positive you have a local minimum, when it is negative you have a local maximum, and when it is zero the test gives no information.
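
A quick way to see all three outcomes (a sketch of my own using sympy; the functions are just illustrations) is to evaluate the second derivative at each critical point:

```python
import sympy as sp

x = sp.symbols('x')

# Second-derivative test at critical points: sufficient, but not necessary.
for f in (x**2, -x**2, x**3, x**4):
    crit = sp.solve(sp.diff(f, x), x)        # critical points: f'(x) = 0
    for p in crit:
        d2 = sp.diff(f, x, 2).subs(x, p)     # sign of f'' at the critical point
        verdict = ("local minimum" if d2 > 0
                   else "local maximum" if d2 < 0
                   else "test is inconclusive")
        print(f"f = {f}, critical point x = {p}: f'' = {d2} -> {verdict}")

# Note: x**4 has a minimum at x = 0 even though the test is inconclusive there,
# which is exactly the "sufficient, but not necessary" point.
```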

This condition generalizes to multivariable problems through the eigenvalues of the Hessian; a small numerical sketch follows the list. The cases:

  • The Hessian is positive definite (all eigenvalues are strictly positive) $\Rightarrow$ local minimum.
  • The Hessian is negative definite (all eigenvalues are strictly negative) $\Rightarrow$ local maximum.
  • The Hessian is indefinite (some eigenvalues are strictly positive and some are strictly negative) $\Rightarrow$ saddle point. This is neither a maximum nor a minimum, because moving along an eigenvector with a negative eigenvalue decreases the function, while moving along an eigenvector with a positive eigenvalue increases it.
  • The Hessian is positive (resp. negative) semidefinite (all eigenvalues are nonnegative (resp. nonpositive), but some are zero) $\Rightarrow$ the test is inconclusive: you may have a local minimum (resp. maximum), but you may also have neither.
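
As a concrete companion to the list (my own sketch, not part of the original answer), a stationary point can be classified numerically from the eigenvalues of its Hessian:

```python
import numpy as np

def classify_stationary_point(H, tol=1e-10):
    """Classify a stationary point from the eigenvalues of its symmetric Hessian H."""
    eig = np.linalg.eigvalsh(H)
    if np.all(eig > tol):
        return "positive definite -> local minimum"
    if np.all(eig < -tol):
        return "negative definite -> local maximum"
    if np.any(eig > tol) and np.any(eig < -tol):
        return "indefinite -> saddle point"
    return "semidefinite (some zero eigenvalues) -> test is inconclusive"

# Illustrative Hessians for each case (assumed examples):
print(classify_stationary_point(np.diag([2.0, 3.0])))    # local minimum
print(classify_stationary_point(np.diag([-2.0, -3.0])))  # local maximum
print(classify_stationary_point(np.diag([2.0, -3.0])))   # saddle point
print(classify_stationary_point(np.diag([2.0, 0.0])))    # inconclusive
```

(In the constrained setting, the same eigenvalue logic is applied to the Hessian of $\Lambda$ restricted to directions tangent to the constraint, but the classification scheme is identical.)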

Of course, everything in the previous paragraphs assumes enough smoothness for all of this to make sense, and there are certainly nonsmooth optimization problems, which come with their own special issues. The first place to look for dealing with such problems is convex analysis. My knowledge of convex analysis is quite limited, so I will not try to discuss it here.