Understanding Newton's method when optimizing a cost function depending on a triangle

numerical-methods, numerical-optimization

I have the following function, defined from $\mathbb{R}^9$ to $\mathbb{R}$:

\begin{equation}
f(x) = f(x_1,x_2,x_3) = \frac{1}{2}\left(\left\langle n,e_3 \right\rangle - \lVert n \rVert\right)^2
\end{equation}

Basically it's a function of the vertices $x_1, x_2, x_3 \in \mathbb{R}^3$ of a given triangle, where $e_3 = (0,0,1)^T$ and $n = (x_2 - x_1)\times (x_3 - x_1)$ is the (unnormalized) triangle normal. Computing the gradient w.r.t. $x$ leads me to
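
For concreteness, here is a minimal MATLAB sketch of this cost function (the convention of stacking the nine variables as $x = (x_1; x_2; x_3)$ is my assumption, not something fixed by the question):

```matlab
% Cost f : R^9 -> R, with the three vertices stacked into one 9-vector
% x = [x1; x2; x3] (stacking convention assumed).
e3   = [0; 0; 1];
nfun = @(x) cross(x(4:6) - x(1:3), x(7:9) - x(1:3));   % n = (x2-x1) x (x3-x1)
f    = @(x) 0.5 * (dot(nfun(x), e3) - norm(nfun(x)))^2;
```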

$$
\nabla f =
\left(\left\langle \frac{n}{\lVert n \rVert}, e_3 \right\rangle - 1\right)^2
\begin{pmatrix}
(x_2 - x_3) \times n \\
(x_3 - x_1) \times n \\
(x_1 - x_2) \times n
\end{pmatrix}
$$
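
Translated directly into MATLAB (again a sketch, using the same stacking convention as above; each $3$-block of the output corresponds to one vertex):

```matlab
% Gradient exactly as written above, e.g. saved as gradf.m.
% Note the scalar factor is undefined at n = 0 (degenerate triangles).
function g = gradf(x)
    x1 = x(1:3); x2 = x(4:6); x3 = x(7:9);
    n  = cross(x2 - x1, x3 - x1);
    s  = (dot(n / norm(n), [0; 0; 1]) - 1)^2;
    g  = s * [cross(x2 - x3, n);
              cross(x3 - x1, n);
              cross(x1 - x2, n)];
end
```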

I'm trying to give a rigorous interpretation of that gradient: if I set it to $0$, I can find the minimum (though there's an issue with the $\lVert n \rVert$ term when $n = 0$). I've implemented this gradient in MATLAB to minimize the function, and this is what I'm getting:

[Figure: the triangle after optimization, lying parallel to the $xy$-plane]

The algorithm MATLAB is using is Newton's method. It may not be visible in the figure, but the triangle appears to have been rotated so that it is parallel to the $xy$-plane.
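
For reference, here is a minimal sketch of how such a run might be set up; `fminunc`'s trust-region algorithm takes Newton-type steps and requires a user-supplied gradient. The starting triangle and the wiring of cost and gradient are my assumptions, not the actual code behind the figure:

```matlab
% Hypothetical driver: minimize f starting from a tilted triangle.
x0   = [0;0;0; 1;0;0.5; 0;1;0.3];            % three stacked vertices
opts = optimoptions('fminunc', ...
    'Algorithm', 'trust-region', ...          % Newton-type steps
    'SpecifyObjectiveGradient', true);
[xopt, fval] = fminunc(@costAndGrad, x0, opts);

function [val, g] = costAndGrad(x)
    x1 = x(1:3); x2 = x(4:6); x3 = x(7:9);
    n  = cross(x2 - x1, x3 - x1);
    e3 = [0; 0; 1];
    val = 0.5 * (dot(n, e3) - norm(n))^2;
    % gradient formula from the question
    s = (dot(n / norm(n), e3) - 1)^2;
    g = s * [cross(x2 - x3, n); cross(x3 - x1, n); cross(x1 - x2, n)];
end
```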

However, if I used gradient descent instead of Newton's method then, if I'm reading the gradient correctly, it seems to me the three vertices could be shrunk together until the triangle degenerated to a single point.
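
One way to probe that intuition numerically (a hypothetical check, reusing `gradf` from the sketch above, with an arbitrary starting triangle and step size):

```matlab
% Plain gradient descent on f; watch whether the triangle collapses.
x = [0;0;0; 1;0;0.5; 0;1;0.3];
for k = 1:1000
    x = x - 1e-2 * gradf(x);
end
area = 0.5 * norm(cross(x(4:6) - x(1:3), x(7:9) - x(1:3)));
disp(area)   % inspect whether the triangle area shrinks toward zero
```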

Why is Newton's method attempting to rotate the triangle instead of warping it?

Best Answer

Why is Newton's method attempting to rotate the triangle instead of warping it?

I'm not sure what you mean by "warping the triangle", but...

Clearly your cost function attains its minimum value of $0$ at exactly the points $(x_1, x_2, x_3)$ for which the $x$ and $y$ components of $n$ are zero and $n_z \geq 0$. So any solution will necessarily have $x_1, x_2, x_3$ all lying in some horizontal plane $z = C$. The Newton optimizer is simply converging to one of these solutions.
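
Spelling that characterization out:

$$
f(x) = 0
\iff \langle n, e_3 \rangle = \lVert n \rVert
\iff n_z = \sqrt{n_x^2 + n_y^2 + n_z^2}
\iff n_x = n_y = 0 \text{ and } n_z \geq 0,
$$

i.e. $n$ points along $+e_3$ (or vanishes), which is exactly the condition that the triangle is horizontal.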

You of course have the issues that

  1. As indicated above, your optimum is (highly) non-unique. You could remedy this by introducing some sort of regularization, e.g. a similarity constraint as suggested by @Christian Blatter in the comments.
  2. Your cost function is not differentiable on the subspace $n=0$. It appears as if the Newton optimizer is currently converging to a valid minimum away from this subspace, but you would probably be safer using a globally differentiable cost function (see the sketch after this list). Note that, alternatively, the above regularization suggestion would also remedy this issue by forcing $\| n \| > 0$.

but I believe that first paragraph answers your explicit question.
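
To make point 2 concrete, here is one globally differentiable alternative (my own suggestion, not something proposed in the comments): penalize the horizontal components of $n$ directly, which vanish exactly when $n_x = n_y = 0$ and give a polynomial, hence everywhere-differentiable, cost:

```matlab
% Smooth alternative cost: polynomial in the vertex coordinates, so it is
% differentiable everywhere, including on the subspace n = 0.
nfun    = @(x) cross(x(4:6) - x(1:3), x(7:9) - x(1:3));  % triangle normal
P       = [1 0 0; 0 1 0];                                % picks out (n_x, n_y)
fsmooth = @(x) 0.5 * norm(P * nfun(x))^2;                % = (n_x^2 + n_y^2)/2
```

Note this variant also admits downward-pointing normals and does nothing about non-uniqueness, so a regularization such as the similarity constraint mentioned in point 1 would still be worthwhile.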
