[Math] Difference between Fritz John and Karush-Kuhn-Tucker conditions

convex optimization · optimization

I am a student of Computer Science and currently learning about optimization. We've been introduced to the Fritz John and Karush-Kuhn-Tucker conditions for convex optimization.

I think I understand what they mean, in the sense that these conditions must be fulfilled for a point to be an optimum.

However, I don't really understand the difference between the two, and my lecture notes don't make this difference clear. Also, I am allergic to mathematical notation, so I would appreciate a short, plain-English answer, maybe with an example.

Thank you!

Best Answer

Both sets of conditions are necessary conditions for a point to be optimal, but they're not quite the same mathematically. The KKT conditions are more restrictive and thus shrink the class of points (from those satisfying the Fritz John conditions) that must be tested for optimality. The additional restriction with KKT is that the Lagrange multiplier on the gradient of the objective function cannot be zero. One of the most important resulting differences is that KKT points for linear programs must be optimal, whereas Fritz John points for linear programs don't have to be.
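To make "the multiplier on the objective's gradient cannot be zero" concrete, here is one standard way the two sets of conditions are written (notation varies by textbook; this is a common form for a problem $\min f(x)$ subject to $g_i(x) \le 0$):

```latex
\textbf{Fritz John:} \quad
u_0 \nabla f(\bar{x}) + \sum_i u_i \nabla g_i(\bar{x}) = 0, \qquad
u_i \, g_i(\bar{x}) = 0 \ \forall i, \qquad
(u_0, u) \ge 0, \ (u_0, u) \ne 0.

\textbf{KKT:} \quad
\nabla f(\bar{x}) + \sum_i u_i \nabla g_i(\bar{x}) = 0, \qquad
u_i \, g_i(\bar{x}) = 0 \ \forall i, \qquad
u \ge 0.
```

The KKT conditions are exactly the Fritz John conditions with $u_0 = 1$ (equivalently, $u_0 > 0$, since a positive $u_0$ can be scaled to $1$). A Fritz John point with $u_0 = 0$ "ignores" the objective entirely: the active constraint gradients cancel each other on their own, which is why such points can fail to be optimal.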

The section on KKT conditions in Bazaraa, Sherali, and Shetty's Nonlinear Programming: Theory and Algorithms (second edition) has a nice discussion of the issues. There are several good examples, especially one in which a Fritz John point for a linear program is shown not to be optimal. That can't happen with KKT points for linear programs.
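For a self-contained illustration (my own small example, not the one from the book), consider the linear program below, where the degenerate pair of constraints forces $x_2 = 0$:

```latex
\min_{x} \ x_1
\quad \text{s.t.} \quad
g_1(x) = x_2 \le 0, \quad
g_2(x) = -x_2 \le 0, \quad
g_3(x) = -x_1 \le 0.
```

The optimum is $(0,0)$. But take the non-optimal feasible point $\bar{x} = (1, 0)$: constraints $g_1, g_2$ are active there, and $\nabla g_1 = (0,1)$, $\nabla g_2 = (0,-1)$ cancel each other. So with $u_0 = 0$, $u_1 = u_2 = 1$, $u_3 = 0$, the Fritz John conditions hold at $\bar{x}$ even though it is not optimal. The KKT conditions fail at $\bar{x}$: with $u_0 = 1$ the stationarity equation reads $(1, 0) + u_1(0,1) + u_2(0,-1) = (1, u_1 - u_2) \ne 0$, so no valid multipliers exist, and KKT correctly rejects the point.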
