[Math] Why do we transform constrained optimization problems to unconstrained ones


In methods like Lagrange multipliers or the augmented Lagrangian method, we transform a constrained optimization problem into an unconstrained one and then solve it.

For example, in the method of Lagrange multipliers we might have:

maximize $f(x, y)$

subject to $g(x, y) = c$

so we introduce:

$\Lambda(x, y, \lambda) = f(x, y) + \lambda \cdot (g(x, y) - c)$

and then solve

$\nabla_{x,y,\lambda}\Lambda(x,y,\lambda) = 0$

which is an unconstrained problem.
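
For concreteness (a hypothetical instance, not tied to any particular application), take $f(x, y) = xy$ and $g(x, y) = x + y$ with $c = 1$. Then

$\Lambda(x, y, \lambda) = xy + \lambda \cdot (x + y - 1)$

and $\nabla_{x,y,\lambda}\Lambda = 0$ gives the system

$y + \lambda = 0, \quad x + \lambda = 0, \quad x + y - 1 = 0,$

whose solution $x = y = \tfrac{1}{2}$, $\lambda = -\tfrac{1}{2}$ is exactly the constrained maximizer of $f$ (with value $\tfrac{1}{4}$), found without handling the constraint separately.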

My question is: Why do we prefer to have an unconstrained version of the problem? How does it make the problem easier to solve?

Best Answer

If you think back to Calculus I, when you first learned about optimization, what was the procedure?

1) Compute the derivative.

2) Find points where the derivative is 0 (critical points).

3) Evaluate the function at these points and the endpoints of the region.

In most cases (continuously differentiable functions on a closed interval), this procedure was guaranteed to work: one of those points was the minimum and another was the maximum. Checking the endpoints was how the procedure dealt with the fact that the optimization problem was constrained.
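
For instance (a hypothetical example for concreteness), to optimize $f(x) = x^2 - 2x$ on $[0, 3]$:

$f'(x) = 2x - 2 = 0 \implies x = 1,$

and comparing $f(0) = 0$, $f(1) = -1$, $f(3) = 3$ gives the minimum at the interior critical point $x = 1$ and the maximum at the endpoint $x = 3$. The endpoint check is exactly where the constraint $0 \le x \le 3$ enters.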

With higher-dimensional functions and more complex boundaries, this problem becomes harder. Generally speaking, we still need to identify points satisfying first-order conditions inside the region, and points satisfying modified first-order conditions (see the KKT conditions) on the boundary of the region.
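
To see why the unconstrained form is convenient computationally, here is a minimal sketch (reusing the hypothetical $f(x, y) = xy$, $g(x, y) = x + y = 1$ from above; the choice of SciPy's generic root finder and the starting guess are my own, not prescribed by the method): once the problem is reduced to $\nabla_{x,y,\lambda}\Lambda = 0$, any off-the-shelf root-finding or gradient-based routine applies, with no special machinery for the constraint.

```python
from scipy.optimize import fsolve

# Hypothetical example: maximize f(x, y) = x*y subject to g(x, y) = x + y = 1.
# The Lagrangian is Lambda(x, y, lam) = x*y + lam * (x + y - 1);
# its stationarity conditions form an *unconstrained* root-finding problem.
def grad_lagrangian(v):
    x, y, lam = v
    return [y + lam,        # dLambda/dx = 0
            x + lam,        # dLambda/dy = 0
            x + y - 1.0]    # dLambda/dlam = 0 (recovers the constraint)

x, y, lam = fsolve(grad_lagrangian, [1.0, 0.0, 0.0])
print(x, y, lam)  # approx. 0.5, 0.5, -0.5, i.e. f(x, y) = 0.25 at the optimum
```

The same idea is what makes augmented Lagrangian methods practical: they repeatedly hand an unconstrained subproblem to a standard solver, with multiplier updates in between taking care of feasibility.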