MATLAB: Clarification about first order optimality

firstorderoptfminconOptimization Toolbox

I have been using Matlab in my design optimization course, and I wanted to ask a bit about the output.firstorderopt values generated by fmincon.

Obviously, this needs to be a small value for a solution, and I see where this value corresponds to the infinity norm of the vector sum that represents the gradient of the Lagrangian function. This corresponds to the "first" Karush Kuhn Tucker condition here: https://www.mathworks.com/help/optim/ug/first-order-optimality-measure.html (eqn 2).

I had a student ask me a question about this, and after a little exploration, I found for a simple test problem I can calculate the gradient of the Lagrangian at the solution to a constrained minimization problem and it is different from the value in output.firstorderopt.

Here is the simple example problem:

minimize

subject to g_1(x) = 1 - 2 * x(1) * x(2) / 3 <= 0

I set this up to solve using SQP via fmincon, starting from the initial design x0 = [2; 4]. Here is the objective function script:

function [f, grad_f] = class24_fun(x)
% in-class example objective function 
f = x(1)^4 + x(2)^2 - x(1)^2 * x(2);
if nargout > 1
    grad_f = [4*x(1)^3 - 2*x(1)*x(2);
        2*x(2) - x(1)^2];
end
end

Here is the nonlinear constraint script:

function [g, h, grad_g, grad_h] = ...
    class24_con(x)
% in-class example nonlinear constraints
% inequality constraints
g(1) = 1 - 2 * x(1) * x(2) / 3;
g(2) = (3 * x(1)^2 - 4 * x(2)) / 3 + 1;
% equality constraints
h = []; % none in this problem
% compute gradients
if nargout > 2
    % gradients of inequality; each column in 
    % array is one gradient vector
    grad_g = [-2 * x(2)/3, 2 * x(1);
        -2 * x(1) / 3, -4/3];
    % equality 
    grad_h = []; % no equality constraints
end
end

Here is how I solved the problem and then computed my own version of the first-order optimality measure. I left off several semi-colons to show the results.

>> x0 = [2; 4];
>> options = optimoptions('fmincon','Algorithm','sqp');
>> [xstar,fstar,exitflag,output,lambda,grad,hessian] = fmincon(@class24_fun,x0,...
[],[],[],[],[],[],@class24_con,options)
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in 
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
<stopping criteria details>
xstar =
    1.0000
    1.5000
fstar =
    1.7500
exitflag =
     1
output = 
  struct with fields:
         iterations: 6
          funcCount: 26
          algorithm: 'sqp'
            message: '↵Local minimum found that satisfies the constraints.↵↵Optimization completed because the objective function is non-decreasing in ↵feasible directions, to within the value of the optimality tolerance,↵and constraints are satisfied to within the value of the constraint tolerance.↵↵<stopping criteria details>↵↵Optimization completed: The relative first-order optimality measure, 2.244896e-08,↵is less than options.OptimalityTolerance = 1.000000e-06, and the relative maximum constraint↵violation, 1.443290e-15, is less than options.ConstraintTolerance = 1.000000e-06.↵↵'
    constrviolation: 1.4433e-15
           stepsize: 7.3233e-08
       lssteplength: 1
      firstorderopt: 4.4898e-08
lambda = 
  struct with fields:
         eqlin: [0×1 double]
      eqnonlin: [0×1 double]
       ineqlin: [0×1 double]
    ineqnonlin: [2×1 double]
         lower: [2×1 double]
         upper: [2×1 double]
grad =
    1.0000
    2.0000
hessian =
   10.0144   -3.2948
   -3.2948    2.1570
>> [fstar,grad_fstar] = class24_fun(xstar)
fstar =
    1.7500
grad_fstar =
    1.0000
    2.0000
>> [gstar,hstar,grad_gstar,grad_hstar] = class24_con(xstar)
gstar =
   1.0e-14 *
    0.0777    0.1443
hstar =
     []
grad_gstar =
   -1.0000    2.0000
   -0.6667   -1.3333
grad_hstar =
     []
>> grad_L = grad_fstar + lambda.ineqnonlin(1) * grad_gstar(:,1) + lambda.ineqnonlin(2) * grad_gstar(:,2)
grad_L =
   1.0e-07 *
   -0.2588
   -0.1047
>> myfirstorderopt = norm(grad_L,Inf)
myfirstorderopt =
   2.5883e-08
   
>> output.firstorderopt
ans =
   4.4898e-08

Both numbers are very small, and I know that the xstar values are the minimum of the function, but is the difference between myfirstorderopt and the output.firstorderopt simply a result of truncation errors, or am I missing some other feature of how fmincon is computing this first order optimality measure?

Best Answer

It is because you have not invoked the options SpecifyObjectiveGradient=true and SpecifyConstraintGradient=true in your call to fmincon. Your calculation of firstorderopt is based on analytical gradient calculations, while fmincon is using finite difference approximations. We see much better agreement when these options are activated:

options = optimoptions('fmincon','Algorithm','sqp','Display','none',...
            'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true);
[xstar,fstar,exitflag,output,lambda,grad,hessian] = ...
    fmincon(@class24_fun,[2; 4],[],[],[],[],[],[],@class24_con,options);
[fstar,grad_fstar] = class24_fun(xstar);
[gstar,hstar,grad_gstar,grad_hstar] = class24_con(xstar);
grad_L = grad_fstar + grad_gstar*lambda.ineqnonlin;
myFOOM = norm(grad_L,inf)
myFOOM = 1.0004e-08
tmwFOOM=output.firstorderopt
tmwFOOM = 1.0004e-08

There are some residual differences in the two calculations, which I think can be attributed to floating point discrepancies:

abs(myFOOM-tmwFOOM)
ans = 7.4015e-17
function [f, grad_f] = class24_fun(x)
% in-class example objective function 
    f = x(1)^4 + x(2)^2 - x(1)^2 * x(2);
    if nargout > 1
        grad_f = [4*x(1)^3 - 2*x(1)*x(2);
            2*x(2) - x(1)^2];
    end
end
function [g, h, grad_g, grad_h] = ...
    class24_con(x)
% in-class example nonlinear constraints
    % inequality constraints
    g(1) = 1 - 2 * x(1) * x(2) / 3;
    g(2) = (3 * x(1)^2 - 4 * x(2)) / 3 + 1;
    % equality constraints
    h = []; % none in this problem
    % compute gradients
    if nargout > 2
        % gradients of inequality; each column in 
        % array is one gradient vector
        grad_g = [-2 * x(2)/3, 2 * x(1);
            -2 * x(1) / 3, -4/3];
        % equality 
        grad_h = []; % no equality constraints
    end
end

Related Solutions

MATLAB: Problems with fmincon, how to solve

Your problem is convex, so you can use the CVX package if you want. Also, the solution should be the global minimum; not a local minimum. To use fmincon, do the following:

Equality/nonEquality constraint:

function [c,ceq]=mycon(x)
c(1) = x(1)^2 + x(2)^2 - 4*x(1) - 3*x(2) + 5;
c(2) = -x(1)^2 - x(2)^2 + 4*x(1) + 3*x(2) - 6; % notice the fixed typo
ceq = 2*x(1)^2 + 2*x(2)^2 - 3*x(1) - 3*x(2) - 2; 
end

Construct the problem and call fmincon:

clc
clear all
Aeq=[]; 
beq=[]; 
A=[]; 
b=[]; 
% if you are not sure that vector x is within lb and ub do not restrict it
lb = []; %lb=[-3,-3]; 
ub = []; %ub=[3,3]; % 
x0=[4,4];
myObj = @(x) x(1)^2 + x(2)^2 - 5*x(1) - 6*x(2) + 15;
nonlcon = @mycon;
options = optimoptions('fmincon','Algorithm','sqp','Display','iter-detailed','MaxFunctionEvaluations',100000, 'MaxIterations',2000, 'OptimalityTolerance' ,1e-10);
[x,cost] = fmincon(myObj,x0,A,b,Aeq,beq,lb,ub,nonlcon)

Output (global minimum):


Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in 
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
<stopping criteria details>
x =
    1.6450    1.9007
cost =
    1.6896

MATLAB: What does an exitflag of greater than 1 indicate when using the FMINCON function in the Optimization Toolbox

As of MATLAB 7.8 (R2009a), the exit flags for FMINCON and other functions in the Optimization Toolbox have been modified to explain why the function has terminated, and whether or not that termination was successful.

For all versions previous to MATLAB 7.8 (R2009a), please see below:

When FMINCON returns an exitflag of greater than 1, the results returned may not be the true optimal solution, in the sense that the gradient may not meet the specified or default tolerances. The returned results are in the feasible region, but the optimization algorithm has stalled at a certain point and cannot continue.

An exitflag of greater than 1 should not be immediately interpreted as a failure in the optimization. You should inspect the "firstorderopt" field of the output structure returned as the fourth output argument by FMINCON, to see how close the solution is to the true optimizer. Values of output.firstorderopt close to the tolerance "TolFun", indicate converged parameter vectors closer to the true optimizer. Larger values of output.firstorderopt call for more caution in using the converged parameter vector.

First-order optimality in FMINCON is defined as the violation of the Karush-Kuhn-Tucker (KKT) condition, stated as follows:

 grad_f = sum lambda(i)*a(i)
 lambda_ineq.*c_ineq

where "grad_f" is the gradient of the objective function at the solution, "lambda" are the Lagrange multipliers, "a" are the gradients of the constraints that are active at the solution, "c_ineq" are the inequality constraints (bounds, linear, and nonlinear), and "lambda_ineq" are the corresponding Lagrange multipliers. The "firstorderopt" output is a measure of the violation of the above condition.

If the "firstorderopt" output of your optimization is large, you should try using different initial values, or check whether your optimization has any scaling problems, to see if you can improve the optimization to obtain a better "firstorderopt" output.

Best Answer

Related Solutions

MATLAB: Problems with fmincon, how to solve

MATLAB: What does an exitflag of greater than 1 indicate when using the FMINCON function in the Optimization Toolbox

Related Question