I need to solve the optimization problem $$ \min_{x\in \mathbb{R}^{3}}f(x) $$ where the function $f$ is defined as follows:
$$ f(x_{1}, x_{2}, x_{3}) = \frac{1}{2}\left[\left(2x_{1}-x_{2}x_{3}-1 \right)^{2}+\left(1-x_{1}+x_{2}-e^{x_{1}-x_{3}}\right)^{2}+\left( -x_{1}-2x_{2}+3x_{3}\right)^{2} \right]$$
using MATLAB code for the Gauss-Newton method.
I already know that the optimal point is $\hat{x}=(1,1,1)$, and my professor's instructions were to alter my code for Newton's method by using the Jacobian approximation of the Hessian (i.e., using $\displaystyle H(j,k)\approx 2 \sum_{i=1}^{3}\left[ J(i,j)*J(i,k)\right]$).
After doing this and running my code from the starting points $(0,0,0)$, $(1.5,0.5,0.5)$, $(2,2,1)$, and $(-1.5,0.5,-0.5)$, the only starting points for which it even appears to converge within 100 iterations are $(0,0,0)$ (it appears to converge after 11 iterations, but the algorithm doesn't stop there the way Newton's method did when it converged early) and $(-1.5,0.5,-0.5)$ (it appears to converge after 19 iterations, but again the algorithm doesn't stop).
These points are not necessarily closer to the optimal point $(1,1,1)$ than the other starting points. So I went to look at the condition number of $J^{T}*J$, because I've read that a large condition number of this matrix can prevent Gauss-Newton from converging.
I calculated the Jacobian matrix $J$ at the optimal point $\hat{x}=(1,1,1)$ and obtained $J = \begin{pmatrix} 9 & -2 & -7 \\ -2 & 6 & -4 \\ -7 & -4 & 11 \end{pmatrix} = J^{T}$ (so $J$ is a symmetric matrix).
Therefore, $J^{T}*J = \begin{pmatrix} 134 & -2 & -132 \\ -2 & 56 & -54 \\ -132 & -54 & 186 \end{pmatrix}$.
Using MATLAB, I calculated the three eigenvalues of this matrix to be $\lambda_{1} = -0.0000$ (I'm assuming this means something negative really, really close to zero?), $\lambda_{2} = 74.6686$, and $\lambda_{3} = 301.3314$.
First, I calculated the condition number manually as the ratio $\displaystyle \text{max eigenvalue}/\text{min eigenvalue} = \frac{301.3314}{-0.0000} \approx \frac{301.3314}{0}$, which is infinite.
Then, according to the MATLAB function cond(), the condition number of $J^{T}*J$ is $6.4120e+16$, which is large but not infinite.
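To double-check the singularity in exact arithmetic, here is a small plain-Python sketch (`det3` is my own helper, not a MATLAB function): the determinant of the matrix above works out to exactly zero in integer arithmetic, so its smallest eigenvalue is exactly $0$, and both the $-0.0000$ eigenvalue and the huge-but-finite cond() value are floating-point roundoff artifacts of a truly singular matrix.

```python
def det3(m):
    # 3x3 determinant by cofactor expansion along the first row
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

# The matrix computed above as J^T * J, kept as exact integers
JTJ = [[ 134,  -2, -132],
       [  -2,  56,  -54],
       [-132, -54,  186]]

print(det3(JTJ))  # exactly 0: the matrix is singular
```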
Is this why Gauss-Newton is not converging for this problem? Is there another reason? Or is it supposed to converge, and am I just doing something wrong?
I thank you ahead of time for your time and patience, and very much look forward to learning how to figure out this problem.
Best Answer
As
$$ f(x) = f(x_0)+\nabla f(x_0)^{\dagger}(x-x_0) + \frac 12(x-x_0)^{\dagger}\nabla^2 f(x_0)\,(x-x_0) + O\!\left(|x-x_0|^3\right), $$
differentiating gives
$$ \nabla f(x)\approx\nabla f(x_0) + \nabla^2 f(x_0)\,(x-x_0)\approx 0, $$
where the last approximation imposes the first-order optimality condition.
Solving for $x$ gives the Newton iteration
$$ x_k = x_{k-1}-H_{k-1}^{-1}\nabla f(x_{k-1}) $$
and because $f(x) = \frac 12\left[g_1(x)^2+g_2(x)^2+g_3(x)^2\right]$ is a sum of squared residuals, the gradient and the (approximate) Hessian take the special forms
$$ \nabla f(x) = J(x)^{\dagger}\cdot G(x),\qquad H = \nabla^2 f(x)\approx J(x)^{\dagger}\cdot J(x), $$
where $G(x)$ is the residual vector and $J(x)$ its Jacobian, as derived in the note below.
NOTE
Calling $G(x) = \{g_1(x),g_2(x),\cdots,g_m(x)\}^{\dagger}$ and $f(x) = \frac 12G^{\dagger}(x)\cdot G(x)$ we have
$$ \nabla f(x) = J(x)^{\dagger}\cdot G(x) $$
and
$$ \nabla^2 f(x) = H(x) = \nabla J(x)^{\dagger}\cdot G(x) + J(x)^{\dagger}\cdot \nabla G(x) = J(x)^{\dagger}\cdot J(x) +\sum_kg_k(x)\nabla^2 g_k(x) $$
So, near a solution where the residuals $g_k(x)$ are small, the term $\sum_k g_k(x)\nabla^2 g_k(x)$ can be neglected, which gives the Gauss-Newton approximation
$$ H = \nabla^2 f(x)\approx J(x)^{\dagger}\cdot J(x). $$
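For this particular problem the dropped term is in fact exactly zero at the optimum: a quick check (a sketch with the residuals $g_1,g_2,g_3$ from the question hard-coded) shows all three residuals vanish at $(1,1,1)$, so $\sum_k g_k\nabla^2 g_k = 0$ there and the approximation is exact at the solution.

```python
import math

def residuals(x1, x2, x3):
    # Residuals g1, g2, g3 of the objective f = (1/2)*(g1^2 + g2^2 + g3^2)
    g1 = 2*x1 - x2*x3 - 1
    g2 = 1 - x1 + x2 - math.exp(x1 - x3)
    g3 = -x1 - 2*x2 + 3*x3
    return g1, g2, g3

print(residuals(1.0, 1.0, 1.0))  # (0.0, 0.0, 0.0): all residuals vanish at the optimum
```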
Attached is a MATHEMATICA script implementing the Gauss-Newton iterative procedure,
with a typical outcome
$$ \left[ \begin{array}{ccccc} k & x & y & z & f^{\dagger}\cdot f\\ 0 & -0.18083 & -0.360527 & -1.54181 & 27.0262 \\ 1 & 0.419264 & -0.249572 & -0.0266268 & 2.23994 \\ 2 & 0.441999 & 0.376084 & 0.398056 & 0.101378 \\ 3 & 0.693323 & 0.692737 & 0.692933 & 0.00877097 \\ 4 & 0.846417 & 0.846417 & 0.846417 & 0.000556373 \\ 5 & 0.923209 & 0.923209 & 0.923209 & 0.0000347735 \\ 6 & 0.961604 & 0.961604 & 0.961604 & \text{2.1733}\cdot 10^{-6} \\ 7 & 0.980802 & 0.980802 & 0.980802 & \text{1.3583}\cdot 10^{-7} \\ 8 & 0.990401 & 0.990401 & 0.990401 & \text{8.4896}\cdot 10^{-9} \\ 9 & 0.995201 & 0.995201 & 0.995201 & \text{5.3060}\cdot 10^{-10} \\ 10 & 0.9976 & 0.9976 & 0.9976 & \text{3.3162}\cdot 10^{-11} \\ 11 & 0.9988 & 0.9988 & 0.9988 & \text{2.0726}\cdot 10^{-12} \\ 12 & 0.9994 & 0.9994 & 0.9994 & \text{1.2954}\cdot 10^{-13} \\ 13 & 0.9997 & 0.9997 & 0.9997 & \text{8.0963}\cdot 10^{-15} \\ 14 & 0.99985 & 0.99985 & 0.99985 & \text{5.0603}\cdot 10^{-16} \\ \end{array} \right] $$
Attached is a script in Octave.
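Since neither attached script is reproduced here, the following is a minimal stand-in sketch of the same Gauss-Newton iteration, written in plain Python rather than MATLAB/Octave so it needs no toolboxes; the residuals and Jacobian of this particular $f$ are hard-coded, the normal equations $(J^{\dagger}J)\,\delta = -J^{\dagger}G$ are solved with a small hand-rolled Gaussian elimination, and all names are my own, not from the attachments.

```python
import math

def residuals(x):
    x1, x2, x3 = x
    return [2*x1 - x2*x3 - 1,
            1 - x1 + x2 - math.exp(x1 - x3),
            -x1 - 2*x2 + 3*x3]

def jacobian(x):
    x1, x2, x3 = x
    e = math.exp(x1 - x3)
    return [[2.0,      -x3, -x2],
            [-1.0 - e, 1.0,  e ],
            [-1.0,    -2.0,  3.0]]

def solve3(A, b):
    # Solve the 3x3 system A x = b by Gaussian elimination with partial pivoting
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        p = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= factor * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

def gauss_newton(x, iters=20):
    for _ in range(iters):
        G, J = residuals(x), jacobian(x)
        # Normal equations: (J^T J) delta = -J^T G
        JtJ = [[sum(J[i][a] * J[i][b] for i in range(3)) for b in range(3)]
               for a in range(3)]
        JtG = [sum(J[i][a] * G[i] for i in range(3)) for a in range(3)]
        delta = solve3(JtJ, [-v for v in JtG])
        x = [xi + di for xi, di in zip(x, delta)]
    return x

print(gauss_newton([0.0, 0.0, 0.0], iters=20))  # approaches the optimum (1, 1, 1)
```

Note that, matching the table above, the error only roughly halves per iteration (linear, not quadratic, convergence), so the iteration count should not be pushed far past the point where $J^{\dagger}J$ becomes numerically singular near the solution.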