There are several reasons why your objective function might suddenly increase.
What may be happening is that FMINCON is getting near the solution and is trying to model the function as a quadratic (i.e. solving the SQP sub-problem) but cannot do so successfully. This means the quadratic model does not fit the objective function well near the solution: perhaps the function is not convex there, or there is a loss of continuity, or the algorithm is trying to satisfy the constraints but is unable to.
There may also be spikes in the computed gradient, which make the magnitude of the directional derivative and the first-order optimality measure incorrect. That would fool FMINCON into thinking that it wasn't at a stationary (flat) region and that it should continue.
This can also commonly happen if you reduce the tolerances to unnecessarily small values. At such tight tolerances there is some numerical "fuzziness" near the optimal solution: FMINCON is close to the optimum, but because of the very low exit tolerances it is unable to declare convergence. At the next iteration it again tries to model the cost function as a quadratic in the neighborhood of the current iterate (the SQP sub-problem), approximating the Hessian (the matrix of second derivatives) as a positive definite matrix, but cannot do so successfully. This indicates that the quadratic model does not fit your function well near the solution, i.e. the quasi-Newton update fails. In certain cases this can lead to the SQP iteration maxing out and the Hessian being modified twice; the message "Hessian modified twice" (or an exit tied to 'MaxSQPIter') then appears, indicating the SQP failure.
From this iteration onward the search directions are no longer optimal, since the quasi-Newton method failed. This would explain the divergence from the optimal solution.
If the algorithm drifts away from the optimal solution at very tight tolerances, there are a few options. The most obvious workaround is to loosen the tolerances. If you wish to maintain low values for 'TolFun' and 'TolX', however, there may be other things you can do, depending on your specific case.
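As a minimal sketch (the objective, constraints, and starting point here are placeholders, not from your problem), loosening the tolerances back toward their defaults might look like:

```matlab
% Relax 'TolFun' and 'TolX' from an unnecessarily tight value (e.g. 1e-12)
% back toward the default order of magnitude. myObjective, myConstraints,
% x0, lb and ub are hypothetical placeholders for your own problem.
options = optimset('fmincon');
options = optimset(options, 'TolFun', 1e-6, 'TolX', 1e-6);
[x, fval, exitflag] = fmincon(@myObjective, x0, [], [], [], [], ...
                              lb, ub, @myConstraints, options);
```

Checking 'exitflag' afterwards tells you whether FMINCON now reports convergence instead of stalling.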
The round-off errors in the function evaluations at such strict tolerances are tantamount to introducing discontinuities in the objective function, causing the gradient to vary greatly. This explains the large value of the directional derivative at the "bad" iteration, which sends the next step far away from the optimum point. One possibility is therefore to adjust the finite-difference perturbation level 'DiffMinChange' used in the gradient evaluation.
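For example (again with hypothetical placeholders for the objective and bounds), enlarging the minimum finite-difference step makes the gradient estimate less sensitive to round-off noise in the function values:

```matlab
% Raise the lower bound on the finite-difference perturbation so that the
% step is large relative to the round-off "fuzziness" in the objective.
% The value 1e-6 is illustrative; pick it based on the noise level in
% your function evaluations.
options = optimset('DiffMinChange', 1e-6);
x = fmincon(@myObjective, x0, [], [], [], [], lb, ub, [], options);
```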
The last option, if it is applicable to your situation, is to work around the problems with finite-differences altogether and specify the gradient and Hessian analytically.
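Supplying the gradient analytically is done by returning it as a second output of the objective and setting 'GradObj' to 'on'. A sketch for a simple quadratic objective (the matrix H and vector c below are invented for illustration):

```matlab
% Objective f(x) = 0.5*x'*H*x + c'*x with its exact gradient H*x + c.
% Returning the gradient avoids finite differencing entirely.
function [f, g] = myObjective(x)
    H = [2 0; 0 4];          % example positive definite matrix
    c = [-1; -2];
    f = 0.5*(x'*H*x) + c'*x;
    if nargout > 1           % gradient requested by the solver
        g = H*x + c;
    end
end
```

It would then be called with something like `options = optimset('GradObj', 'on');` so that FMINCON uses the returned gradient instead of finite differences.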