Since $f(x) = \frac 12\left(g_1(x)^2+g_2(x)^2+g_3(x)^2\right)$ is non-negative and vanishes exactly at a solution of the system $g_1(x)=g_2(x)=g_3(x)=0$, a root can be found by minimizing $f$. Expanding $f$ to second order around $x_0$,
$$
f(x) = f(x_0)+\frac{\partial f(x_0)}{\partial x}(x-x_0) + \frac 12(x-x_0)^{\dagger}\frac{\partial^2f(x_0)}{\partial x^2}(x-x_0) + O(|x-x_0|^3),
$$
then differentiating and imposing the first-order condition for a minimum gives
$$
\nabla f(x)\approx\nabla f(x_0) + \frac{\partial^2f(x_0)}{\partial x^2}(x-x_0)\approx 0.
$$
Solving this linear equation for $x$ and iterating, we get
$$
x_k = x_{k-1}-H_{k-1}^{-1}\nabla f(x_{k-1})
$$
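In practice one does not form $H^{-1}$ explicitly; both scripts below instead solve a linear system for the step:
$$
H_{k-1}\,p_{k-1} = -\nabla f(x_{k-1}),\qquad x_k = x_{k-1}+p_{k-1}.
$$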
with
$$
\nabla f(x) = J(x)^{\dagger}\cdot G(x)\\
H = \frac{\partial^2f(x)}{\partial x^2}\approx J(x)^{\dagger}\cdot J(x)
$$
where $J(x)$ is the Jacobian of the residual vector $G(x)$, as derived in the following note.
NOTE
Setting $G(x) = \{g_1(x),g_2(x),\cdots,g_m(x)\}^{\dagger}$ and $f(x) = \frac 12G^{\dagger}(x)\cdot G(x)$, we have
$$
\nabla f(x) = J(x)^{\dagger}\cdot G(x)
$$
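Componentwise, with $J_{ki}(x) = \partial g_k(x)/\partial x_i$, this is just the chain rule:
$$
\frac{\partial f(x)}{\partial x_i} = \sum_k g_k(x)\,\frac{\partial g_k(x)}{\partial x_i} = \left(J(x)^{\dagger}\cdot G(x)\right)_i
$$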
and
$$
\nabla^2 f(x) = H(x) = \nabla J(x)^{\dagger}\cdot G(x) + J(x)^{\dagger}\cdot \nabla G(x) = J(x)^{\dagger}\cdot J(x) +\sum_kg_k(x)\nabla^2 g_k(x)
$$
Near a solution the residuals $g_k(x)$ are small, so the second sum is negligible and we arrive at the Gauss-Newton approximation
$$
H = \frac{\partial^2f(x)}{\partial x^2}\approx J(x)^{\dagger}\cdot J(x)
$$
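For concreteness, the residual vector and Jacobian of the test system used in both scripts below are
$$
G(x,y,z) = \begin{pmatrix} 2x - yz - 1 \\ 1 - x + y - e^{x-z} \\ -x - 2y + 3z \end{pmatrix},
\qquad
J(x,y,z) = \begin{pmatrix} 2 & -z & -y \\ -e^{x-z}-1 & 1 & e^{x-z} \\ -1 & -2 & 3 \end{pmatrix}.
$$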
Attached is a Mathematica script implementing the Gauss-Newton iterative procedure:
X = {x, y, z};
(* residual vector G as a 3x1 matrix *)
f[x_, y_, z_] := {{2 x - y z - 1}, {1 - x + y - Exp[x - z]}, {-x - 2 y + 3 z}}
(* Jacobian of the residuals *)
g[x_, y_, z_] := D[f[x, y, z], {X}]
x0 = RandomReal[{-2, 2}];
y0 = RandomReal[{-2, 2}];
z0 = RandomReal[{-2, 2}];
X0 = {x0, y0, z0};
f0 = Flatten[f[x0, y0, z0], 1];
n = 20; (* maximum number of iterations *)
error = 0.0001; (* tolerance on the step size *)
path = {{0, x0, y0, z0, f0.f0}};
For[k = 1, k <= n, k++,
 Jf = Flatten[Transpose[g[x, y, z] /. Thread[X -> X0]], 1]; (* Jacobian at X0 *)
 JR = Transpose[Transpose[f[x, y, z]].Jf] /. Thread[X -> X0]; (* gradient J^t.G at X0 *)
 Hf = Transpose[Jf].Jf; (* Gauss-Newton approximation of the Hessian *)
 p = LinearSolve[Hf, -JR]; (* solve Hf.p == -JR for the step *)
 If[Max[Abs[p]] < error, Break[]];
 X0 = Flatten[X0 + p];
 f0 = Flatten[f[x, y, z], 1] /. Thread[X -> X0]; (* residual at the new iterate *)
 AppendTo[path, {k, x, y, z, f0.f0} /. Thread[X -> X0]]
]
with a typical outcome
$$
\left[
\begin{array}{ccccc}
k & x & y & z & f^{\dagger}\cdot f\\
0 & -0.18083 & -0.360527 & -1.54181 & 27.0262 \\
1 & 0.419264 & -0.249572 & -0.0266268 & 2.23994 \\
2 & 0.441999 & 0.376084 & 0.398056 & 0.101378 \\
3 & 0.693323 & 0.692737 & 0.692933 & 0.00877097 \\
4 & 0.846417 & 0.846417 & 0.846417 & 0.000556373 \\
5 & 0.923209 & 0.923209 & 0.923209 & 0.0000347735 \\
6 & 0.961604 & 0.961604 & 0.961604 & 2.1733\cdot 10^{-6} \\
7 & 0.980802 & 0.980802 & 0.980802 & 1.3583\cdot 10^{-7} \\
8 & 0.990401 & 0.990401 & 0.990401 & 8.4896\cdot 10^{-9} \\
9 & 0.995201 & 0.995201 & 0.995201 & 5.3060\cdot 10^{-10} \\
10 & 0.9976 & 0.9976 & 0.9976 & 3.3162\cdot 10^{-11} \\
11 & 0.9988 & 0.9988 & 0.9988 & 2.0726\cdot 10^{-12} \\
12 & 0.9994 & 0.9994 & 0.9994 & 1.2954\cdot 10^{-13} \\
13 & 0.9997 & 0.9997 & 0.9997 & 8.0963\cdot 10^{-15} \\
14 & 0.99985 & 0.99985 & 0.99985 & 5.0603\cdot 10^{-16} \\
\end{array}
\right]
$$
Note the linear, error-halving convergence towards $(1,1,1)$: at this root the first two rows of the Jacobian are proportional, so $J$ is singular there and the usual quadratic convergence of Newton-type iterations is lost.
Attached is an Octave script with the same procedure:
function [X0,k] = GaussNewton(x0,y0,z0,n,tol)
  % Gauss-Newton iteration for the 3x3 test system defined in f and g below
  %   n   - maximum number of iterations, e.g. 20
  %   tol - tolerance on the step size, e.g. 0.0001
  format long
  X0 = [x0;y0;z0];
  for k = 1:n
    x0 = X0(1);
    y0 = X0(2);
    z0 = X0(3);
    Jf = g(x0,y0,z0);       % Jacobian at the current iterate
    JR = Jf'*f(x0,y0,z0);   % gradient J'*G
    Hf = Jf'*Jf;            % Gauss-Newton approximation of the Hessian
    p = Hf\(-JR);           % solve Hf*p = -JR for the step
    if max(abs(p)) < tol
      break;
    end
    X0 = X0 + p;
  end
end
function fx = f(x,y,z)
  % residual vector G(x,y,z)
  fx = [2.*x-y*z-1.; 1.-x+y-exp(x-z); -x-2.*y+3.*z];
end
function dfx = g(x,y,z)
  % Jacobian of the residuals
  dfx = [2.,-z,-y; -exp(x-z)-1.,1.,exp(x-z); -1.,-2.,3.];
end
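A sample call (starting from the origin, at most 20 iterations, step tolerance $10^{-4}$), returning the final iterate and the iteration count:
[X, k] = GaussNewton(0, 0, 0, 20, 0.0001)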
Best Answer
Newton and Newton-Raphson are just different names for the same method. Sometimes Newton-Raphson is preferred for the scalar/univariate case.
Standard Newton for a vector valued function $F$ (no. equations = no. variables) determines the update step $s=x_+-x$ by solving $F'(x)s=-F(x)$.
Gauß-Newton applies to over-determined systems (no. equations > no. variables) and minimizes the error $\|F'(x)s+F(x)\|_2$ for the update step. For a square system this reduces to standard Newton.
One could also solve an overdetermined system by minimizing the quadratic error $\|F(x)\|_2^2$. Determining the zeros of the gradient by Newton's method would then require second derivatives of $F$, which are not needed for Gauß-Newton.
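As a minimal sketch of the difference (assuming hypothetical helper functions F and JF that return the residual vector and the Jacobian at x), Octave's backslash operator computes both kinds of update step:
% F(x): m x 1 residual, JF(x): m x n Jacobian (hypothetical helpers)
s = -(JF(x) \ F(x)); % m = n: solves F'(x)*s = -F(x) (standard Newton)
                     % m > n: minimizes ||F'(x)*s + F(x)||_2 (Gauß-Newton)
x = x + s;           % update the iterate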