f = 100*((Y-X^2)^2+(1-X))^2;
That is degree 8 in X.
g = subs(f, [X,Y], [x(i)+S(1)*h,y(i)+h*S(2)]);
That substitutes h linearly in X, so the degree of h will be the same as the maximum of the degree of X and Y, so g will end up degree 8 in h.
differentiate degree 8 to get degree 7. Which happens to be factorable into degree 4 and degree 3.
Degree 7 so you get 7 solutions, some of which are complex valued.
x(i+1) = I(1)+h(1)*S(1);
y(i+1) = I(2)+h(2)*S(2);
The solutions are ordered, but they are ordered according to internal logic that is not documented anywhere. They are certainly not sorted by anything having to do with x and y. The branches associated with the order is going to change as the values change, and at some point it is going to happen to select complex branches according to the internal logic.
You need to think explicitly about how you are going to deal with those multiple roots.
I would suggest that you should be using different symbolic h for x and y directions, calculating the gradient, solving for the two different h variables. If you do this, you will get 8 possible values each for the two different variables. You can then use an algorithm to choose particular ones, such as eliminating the ones with complex solutions, and choosing the solution pair that has the highest sum-of-squared values.
Best Answer