[Math] Geometric Meaning of KKT Conditions

convex optimizationconvex-analysiskarush kuhn tuckernumerical optimizationoptimization

I am trying to state KKT conditions for an optimization with both inequality and equality constraints in geometric form, so that I get the big picture better.

Is this what it says?

At the optimal solution, the gradient of the function must be a linear combination of the gradients of the constraints?

Best Answer

The KKT conditions encode the following two observations. At the optimal solution,

  1. The gradient of the objective function is perpendicular to the constraint set, and

  2. The constraint is satisfied.

Condition 2. comes from the definition of the optimization problem.

If condition 1. is not satisfied at a point on the constraint surface, you can increase the value of the objective function by walking along the constraint surface in the direction of the projection of the gradient onto the tangent hyperplane to the constraint surface.

Imagine a bead constrained to slide on the constraint surface, and the gradient vector as a force pushing on it. The bead can only be at equilibrium if the force pushing on it is perpendicular to the surface it is constrained on.

Or imagine walking on a path in the mountains that doesn't go to a peak. At the high point the most uphill direction is perpendicular to the path.

With an inequality constraint there are two distinct possibilities.

  • The optimal point could be on the boundary of the inequality constraint feasible set. Then this optimal point also solves the modified optimization problem where the inequality constraint is replaced with an equality constraint. So, in this case we can use the Lagrange multiplier result for equality constrained problems.

  • Or, the optimal point could be in the interior of the inequality constraint feasible set, in which case the constraint is (locally) inactive and might as well not be there.

Complimentary slackness encodes both these possibilities. You have a product of two things that must be zero. If the constraint portion of the multiplicand is zero, then you are on the boundary of the constraint set, whereas if the multiplier portion of the multiplicand is zero then you are not on the boundary and the constraint is inactive.

Dual feasibility (positivity of the inequality multipliers) ensures that if you are on the boundary of an inequality constraint, you could not improve things by moving into the interior. From the previous example, the constraint surface can only exert a "reaction force" on the bead in the inward direction to keep it inside, but cannot apply a reaction force in the outward direction.