[Math] Determining active constraints in KKT

karush-kuhn-tucker, nonlinear-optimization, optimization

Suppose we have a constrained optimization problem with inequality constraints, which we want to solve using the Karush-Kuhn-Tucker (KKT) conditions. My question is: how do we determine which constraints are active and which are inactive? I read in posts on KKT (post1, post2) that we need to try all possible combinations of active and inactive constraints.

So after reading about it, I came up with the following summary for finding a local optimum subject to inequality constraints. If we have $n$ inequality constraints, we assume some subset of $k$ of them to be active and solve using the KKT necessary conditions, doing this for every subset of every size $k = 0, 1, \ldots, n$, so the total number of possibilities is $2^n$. Suppose the necessary conditions are satisfied in $m$ of these cases. We then check the KKT sufficient conditions in those $m$ cases; if $m_1$ of them are satisfied, those $m_1$ cases give us the local optimal solutions. Is this right?
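To make the procedure concrete, here is a rough sketch of the enumeration on a small made-up problem (the problem data and the use of sympy are purely illustrative):

```python
from itertools import combinations
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = (x - 2)**2 + (y - 2)**2            # objective
h = [2 - x - y, x, y]                  # inequality constraints h_i(x, y) >= 0
z = sp.symbols('z0:3', real=True)      # one multiplier per constraint

grad_f = [sp.diff(f, v) for v in (x, y)]
grad_h = [[sp.diff(hi, v) for v in (x, y)] for hi in h]

n = len(h)
for k in range(n + 1):                          # every active-set size k ...
    for active in combinations(range(n), k):    # ... and every subset of that size
        # Stationarity with z_i = 0 for inactive constraints:
        #   grad f - sum_{i active} z_i grad h_i = 0
        eqs = [grad_f[j] - sum(z[i] * grad_h[i][j] for i in active)
               for j in range(2)]
        eqs += [h[i] for i in active]           # active constraints hold with equality
        unknowns = [x, y] + [z[i] for i in active]
        for s in sp.solve(eqs, unknowns, dict=True):
            inactive_ok = all(h[i].subs(s) > 0 for i in range(n) if i not in active)
            multipliers_ok = all(s[z[i]] >= 0 for i in active)
            if inactive_ok and multipliers_ok:
                print("candidate:", s, "active set:", active)
```

On this toy problem only one of the $2^3 = 8$ cases survives both checks (the point $(1,1)$ with the first constraint active), which matches the enumeration described above.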

Is this the right way to approach the problem theoretically, assuming we don't need numerical techniques to solve it?

One last (possibly stupid) query: could this problem be solved by assuming that all $n$ constraints are active, solving for the Lagrange multipliers $\lambda_i$, $i \in \{1, \ldots, n\}$, and then declaring the constraints whose $\lambda_i$ are negative to be the inactive ones?

Best Answer

Technically, knowing all of the active constraints isn't enough. For the nonlinear optimization problem $$ \min\limits_{x\in \mathbb{R}^n} \{f(x) : g(x) = 0,\ h(x)\geq 0\} $$ we have the first-order optimality conditions \begin{align*} \nabla f(x) + g^\prime(x)^Ty - h^\prime(x)^Tz &= 0\\ g(x) &= 0\\ h(x) &\geq 0\\ z &\geq 0\\ h(x) \circ z &= 0 \end{align*} where $\circ$ denotes the pointwise product and $g^\prime(x)$ and $h^\prime(x)$ are the Jacobians (total derivatives). Some kind of constraint qualification also needs to be satisfied. You can find results like this in Theorem 12.1 of Nocedal and Wright's Numerical Optimization.

The last requirement, $h(x)\circ z=0$, is the complementary slackness condition. Broken down componentwise, it states that $$ h_i(x)\, z_i = 0 $$ for $i=1,\dots,m$, where $m$ is the number of inequality constraints. If we know that a constraint is inactive at optimality, then $h_i(x) > 0$, which forces $z_i=0$. Likewise, if the constraint is active, then we have $h_i(x)=0$ and $z_i\geq 0$.

Now, say we knew all of the active constraints. We could decompose $h$ and $z$ into \begin{align*} h(x) &= \begin{bmatrix} h_A(x)\\h_I(x)\end{bmatrix}\\ z &= \begin{bmatrix} z_A \\ z_I\end{bmatrix} \end{align*} For all of the inactive constraints, we know that $z_I=0$, so the optimality conditions become \begin{align*} \nabla f(x) + g^\prime(x)^Ty - h_A^\prime(x)^Tz_A &= 0\\ g(x) &= 0\\ h_A(x) &= 0\\ h_I(x) &> 0\\ z_A &\geq 0 \end{align*} Therein lies the rub. We still have to satisfy the inactive constraints, so we can't simply ignore them, and we still have to enforce nonnegativity of the Lagrange multipliers corresponding to the active constraints.
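To make the rub concrete, here is a minimal numerical sketch on a small quadratic program (the problem data are my own illustration, not from any particular reference). Once we guess an active set, the remaining equalities form a linear system, but we still have to check the leftover conditions $h_I(x) > 0$ and $z_A \geq 0$ afterwards:

```python
import numpy as np

# Toy quadratic program: min 0.5 x^T H x + c^T x  subject to  A x >= b
# (equivalent to: min (x-2)^2 + (y-2)^2 s.t. 2 - x - y >= 0, x >= 0, y >= 0)
H = 2 * np.eye(2)
c = np.array([-4.0, -4.0])
A = np.array([[-1.0, -1.0],   # 2 - x - y >= 0  <=>  -x - y >= -2
              [ 1.0,  0.0],   #     x >= 0
              [ 0.0,  1.0]])  #     y >= 0
b = np.array([-2.0, 0.0, 0.0])

def try_active_set(active):
    """Solve the reduced KKT system for a guessed active set, then check
    the conditions the guess does NOT enforce: h_I(x) > 0 and z_A >= 0."""
    A_act = A[active]
    k = len(active)
    # Stationarity plus the active constraints form one linear system:
    #   [ H      -A_act^T ] [ x  ]   [ -c    ]
    #   [ A_act      0    ] [ z_A ] = [ b_act ]
    K = np.block([[H, -A_act.T],
                  [A_act, np.zeros((k, k))]])
    sol = np.linalg.solve(K, np.concatenate([-c, b[active]]))
    x, z_act = sol[:2], sol[2:]
    inactive = [i for i in range(len(b)) if i not in active]
    strictly_feasible = np.all(A[inactive] @ x > b[inactive])  # h_I(x) > 0
    dual_feasible = np.all(z_act >= 0)                         # z_A >= 0
    return x, z_act, strictly_feasible, dual_feasible

print(try_active_set([0]))   # correct guess: only 2 - x - y >= 0 is active
print(try_active_set([1]))   # wrong guess: its linear system solves fine...
```

The first guess passes both leftover checks; the second solves its linear system without complaint but fails them, which is exactly the point above: fixing an active set does not, by itself, settle the problem.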

This is why we typically need specialized algorithms, such as projection, active set, or interior point methods, to solve inequality constrained problems. They take care of the nonnegativity and complementary slackness conditions in their own ways.
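As a quick illustration (assuming SciPy is available), an off-the-shelf SQP-type solver finds the active set of the toy problem above on its own:

```python
import numpy as np
from scipy.optimize import minimize

# Same toy problem; constraints of type 'ineq' are interpreted by SciPy
# as fun(x) >= 0, matching the h(x) >= 0 convention used above.
def f(v):
    return (v[0] - 2)**2 + (v[1] - 2)**2

constraints = [{'type': 'ineq', 'fun': lambda v: 2 - v[0] - v[1]},
               {'type': 'ineq', 'fun': lambda v: v[0]},
               {'type': 'ineq', 'fun': lambda v: v[1]}]

result = minimize(f, x0=np.array([0.0, 0.0]), method='SLSQP',
                  constraints=constraints)
print(result.x)   # approximately [1, 1]; the solver determined the active set itself
```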