[Math] How to formulate constrained optimization problems

mathematical-modeling, optimization

A constrained optimization problem can usually be formulated in several equivalent ways.

For example, consider the problem with the following formulation:

$$\min_{x \in X \subseteq \mathbb{R}^n} \, f(x) \quad \text{subject to} \quad g(x) \leq 0,\; h(x) = 0.$$

  1. One can move some of the (in)equality constraints into the set $X$, or shrink $X$ by expressing part of it as (in)equality constraints (see the sketch after this list).
  2. One can also trade between equality and inequality constraints. For example, $h(x) = 0$ can be replaced by the pair $h(x) \leq 0$, $-h(x) \leq 0$.
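
As a concrete illustration of item 1, here is a minimal sketch (using SciPy; the toy objective and solver choices are my own, not from the question) of the same one-dimensional problem posed both ways:

```python
# The same problem, min (x - 2)^2 s.t. x <= 1, posed two ways:
# the constraint folded into the set X as a bound, or kept explicit.
import numpy as np
from scipy.optimize import minimize, Bounds

f = lambda x: (x[0] - 2.0) ** 2

# Formulation 1: X = (-inf, 1], no explicit inequality constraints.
res_bounds = minimize(f, x0=[0.0], bounds=Bounds(-np.inf, 1.0))

# Formulation 2: X = R, with g(x) = x - 1 <= 0.
# SciPy's inequality convention is fun(x) >= 0, hence 1 - x.
res_ineq = minimize(f, x0=[0.0],
                    constraints=[{"type": "ineq", "fun": lambda x: 1.0 - x[0]}])

print(res_bounds.x, res_ineq.x)  # both should report x close to 1
```

Both formulations describe the same feasible set, but they are dispatched to different algorithms (a bound-constrained method versus a general SQP method), which is exactly why the classification matters.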

I was wondering whether there are guidelines on how to recast a constrained optimization problem into different forms for various purposes, such as to better reveal its structure, to make it easier to solve, or to make it less ill-posed in some sense.

Are there discussions of this in books, papers, or links?

Thanks and regards!

Best Answer

Typically you'll use the set $X$ to represent black-box constraints, i.e., constraints for which you don't have an analytical representation. They could consist of the output of a computer code that returns True if the constraints are satisfied and False otherwise. In general, if you do have analytical descriptions of the constraints, it is to your advantage to use them. There is research on mixed black-box optimization, where the problem has a mixture of black-box and explicit constraints, but you would sacrifice efficiency by misclassifying explicit constraints as black-box constraints.
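
To make the distinction concrete, here is a toy sketch (the names and the unit-disk example are hypothetical, purely for illustration). Both functions encode the same set, but the first only answers yes/no, while the second exposes values and gradients that a smooth solver can exploit:

```python
import numpy as np

def in_X(x):
    """Black-box membership oracle: returns True iff x is feasible.
    Think of it as wrapping a simulation you cannot differentiate."""
    return bool(np.linalg.norm(x) <= 1.0)

def g(x):
    """Explicit analytical form of the same set, written as g(x) <= 0."""
    return float(x[0] ** 2 + x[1] ** 2 - 1.0)

def grad_g(x):
    """Gradient of g, available only because we have the analytical form."""
    return np.array([2.0 * x[0], 2.0 * x[1]])
```

A derivative-free method can only probe `in_X` point by point, whereas `g` and `grad_g` let a smooth solver build linearizations of the feasible set.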

As to the transformation of an equality into two inequalities, it will cause most algorithms for smooth optimization to break down (assuming $h$ is smooth). It's easy to see why: most methods aim to satisfy the KKT conditions (first-order optimality). However, the KKT conditions are guaranteed to be necessary for optimality only when a constraint qualification is satisfied. A constraint qualification is a certificate that the analytical expression of the feasible set is, in some sense, not redundant in describing its geometry.

Consider for instance the constraints $x^3 \leq 1$ and $x^3 \geq 1$. There is clearly some redundancy here: why not just say $x^3 = 1$, or better yet, $x = 1$?

The redundancy manifests itself in the constraint qualification. The most widely used one is the linear independence constraint qualification (LICQ). It requires that the gradients of all constraints satisfied as equalities at a solution be linearly independent. This of course covers all equality constraints, but also all inequalities that are "active" (i.e., $g_i(x) = 0$ in your notation). Since $h$ is an equality constraint, we necessarily have $h(x) = 0$ at a solution, but the gradients of $h$ and $-h$ can never be linearly independent. Try solving a problem with $x^3 \leq 1$ and $x^3 \geq 1$ on one of the NEOS solvers and you'll observe trouble: http://neos-server.org/neos
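
You can also reproduce the difficulty locally. Here is a minimal sketch with SciPy's SLSQP (my own toy instance; exact behavior varies across solvers and versions, and interior-point methods suffer in particular because the split feasible set has an empty strict interior):

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]  # minimize x; the feasible set is the single point {1}

# Clean formulation: h(x) = x^3 - 1 = 0.
res_eq = minimize(f, x0=[2.0], method="SLSQP",
                  constraints=[{"type": "eq", "fun": lambda x: x[0] ** 3 - 1.0}])

# Split formulation: x^3 <= 1 and x^3 >= 1. LICQ fails at every feasible
# point, since the two active gradients, 3x^2 and -3x^2, are dependent.
res_split = minimize(f, x0=[2.0], method="SLSQP",
                     constraints=[{"type": "ineq", "fun": lambda x: 1.0 - x[0] ** 3},
                                  {"type": "ineq", "fun": lambda x: x[0] ** 3 - 1.0}])

print(res_eq.status, res_eq.x)        # typically converges cleanly to x ~ 1
print(res_split.status, res_split.x)  # may stall, cycle, or exit with an error
```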

You'll run into the same kind of trouble if one of the active constraint gradients vanishes at a solution. Consider for instance the constraint $x^2 = 0$. When problems are automatically generated by some procedure, it's very hard to check for such redundancy, but when you model a problem by hand from scratch, it's important to look out for it.
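
To see the failure concretely: for $\min\, x$ subject to $x^2 = 0$, the only feasible (hence optimal) point is $x^* = 0$, but stationarity of the Lagrangian would require
$$\nabla f(x^*) + \lambda \nabla h(x^*) = 1 + \lambda \cdot 2x^* = 1 + 0 = 0,$$
which no multiplier $\lambda$ can satisfy.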

Coming back to the first topic, there is a (theoretical) advantage to describing the feasible set as just $X \subseteq \mathbb{R}^n$: the first-order optimality condition is simply that $-\nabla f(x)$ lie in the normal cone to $X$ at $x$. This doesn't depend on any constraint qualification, because it is a statement about the geometry of the feasible set alone, not about its analytical description.
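
For instance, when $X$ is convex this condition reads
$$-\nabla f(x^*) \in N_X(x^*) = \{\, v \in \mathbb{R}^n : \langle v,\, y - x^* \rangle \leq 0 \ \text{for all } y \in X \,\},$$
that is, moving from $x^*$ toward any feasible point cannot decrease $f$ to first order.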

For more information, take a look at "Nonlinear Programming: Theory and Algorithms" by Bazaraa, Sherali and Shetty (Wiley), or "Numerical Optimization" by Nocedal and Wright (Springer).
