I have been using Matlab in my design optimization course, and I wanted to ask a bit about the output.firstorderopt values generated by fmincon.
Obviously, this needs to be a small value for a solution, and I see where this value corresponds to the infinity norm of the vector sum that represents the gradient of the Lagrangian function. This corresponds to the "first" Karush Kuhn Tucker condition here: https://www.mathworks.com/help/optim/ug/first-order-optimality-measure.html (eqn 2).
I had a student ask me a question about this, and after a little exploration, I found for a simple test problem I can calculate the gradient of the Lagrangian at the solution to a constrained minimization problem and it is different from the value in output.firstorderopt.
Here is the simple example problem:
minimize
subject to
I set this up to solve using SQP via fmincon, starting from the initial design x0 = [2; 4]. Here is the objective function script:
function [f, grad_f] = class24_fun(x)% in-class example objective function
f = x(1)^4 + x(2)^2 - x(1)^2 * x(2);if nargout > 1 grad_f = [4*x(1)^3 - 2*x(1)*x(2); 2*x(2) - x(1)^2];endend
Here is the nonlinear constraint script:
function [g, h, grad_g, grad_h] = ... class24_con(x)% in-class example nonlinear constraints
% inequality constraints
g(1) = 1 - 2 * x(1) * x(2) / 3;g(2) = (3 * x(1)^2 - 4 * x(2)) / 3 + 1;% equality constraints
h = []; % none in this problem
% compute gradients
if nargout > 2 % gradients of inequality; each column in
% array is one gradient vector
grad_g = [-2 * x(2)/3, 2 * x(1); -2 * x(1) / 3, -4/3]; % equality
grad_h = []; % no equality constraints
endend
Here is how I solved the problem and then computed my own version of the first-order optimality measure. I left off several semi-colons to show the results.
>> x0 = [2; 4];>> options = optimoptions('fmincon','Algorithm','sqp');>> [xstar,fstar,exitflag,output,lambda,grad,hessian] = fmincon(@class24_fun,x0,...[],[],[],[],[],[],@class24_con,options)Local minimum found that satisfies the constraints.Optimization completed because the objective function is non-decreasing in feasible directions, to within the value of the optimality tolerance,and constraints are satisfied to within the value of the constraint tolerance.<stopping criteria details>xstar = 1.0000 1.5000fstar = 1.7500exitflag = 1output = struct with fields: iterations: 6 funcCount: 26 algorithm: 'sqp' message: '↵Local minimum found that satisfies the constraints.↵↵Optimization completed because the objective function is non-decreasing in ↵feasible directions, to within the value of the optimality tolerance,↵and constraints are satisfied to within the value of the constraint tolerance.↵↵<stopping criteria details>↵↵Optimization completed: The relative first-order optimality measure, 2.244896e-08,↵is less than options.OptimalityTolerance = 1.000000e-06, and the relative maximum constraint↵violation, 1.443290e-15, is less than options.ConstraintTolerance = 1.000000e-06.↵↵' constrviolation: 1.4433e-15 stepsize: 7.3233e-08 lssteplength: 1 firstorderopt: 4.4898e-08lambda = struct with fields: eqlin: [0×1 double] eqnonlin: [0×1 double] ineqlin: [0×1 double] ineqnonlin: [2×1 double] lower: [2×1 double] upper: [2×1 double]grad = 1.0000 2.0000hessian = 10.0144 -3.2948 -3.2948 2.1570>> [fstar,grad_fstar] = class24_fun(xstar)fstar = 1.7500grad_fstar = 1.0000 2.0000>> [gstar,hstar,grad_gstar,grad_hstar] = class24_con(xstar)gstar = 1.0e-14 * 0.0777 0.1443hstar = []grad_gstar = -1.0000 2.0000 -0.6667 -1.3333grad_hstar = []>> grad_L = grad_fstar + lambda.ineqnonlin(1) * grad_gstar(:,1) + lambda.ineqnonlin(2) * grad_gstar(:,2)grad_L = 1.0e-07 * -0.2588 -0.1047>> myfirstorderopt = norm(grad_L,Inf)myfirstorderopt = 2.5883e-08 >> output.firstorderoptans = 4.4898e-08
Both numbers are very small, and I know that the xstar values are the minimum of the function, but is the difference between myfirstorderopt and the output.firstorderopt simply a result of truncation errors, or am I missing some other feature of how fmincon is computing this first order optimality measure?
Best Answer