I have the following function defined from $\mathbb{R}^9 \to \mathbb{R}$
\begin{equation}
f(x) = f(x_1,x_2,x_3) = \frac{1}{2}\left(\left\langle n,e_3 \right\rangle - \lVert n \rVert\right)^2
\end{equation}
Basically it's a function of the vertices of a given triangle, and we also have $e_3 = (0,0,1)^T$ and $n = (x_2 - x_1)\times (x_3 - x_1)$. Computing the gradient w.r.t. $x$ leads me to
$$
\nabla f =
\begin{pmatrix}
(x_2 - x_3) \times v \\
(x_3 - x_1) \times v \\
(x_1 - x_2) \times v
\end{pmatrix},
\qquad
v = \bigl(\left\langle n,e_3 \right\rangle - \lVert n \rVert\bigr)\left(e_3 - \frac{n}{\lVert n \rVert}\right)
$$
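For what it's worth, here is a small NumPy sketch (my own translation; my implementation is in MATLAB, and the helper names `grad_f` and `grad_fd` are mine) that evaluates $f$ and compares a chain-rule gradient against central finite differences. The finite-difference check is a handy way to validate whatever closed form one derives:

```python
import numpy as np

E3 = np.array([0.0, 0.0, 1.0])

def normal(x1, x2, x3):
    # n = (x2 - x1) x (x3 - x1), the (unnormalized) triangle normal
    return np.cross(x2 - x1, x3 - x1)

def f(x1, x2, x3):
    n = normal(x1, x2, x3)
    return 0.5 * (np.dot(n, E3) - np.linalg.norm(n)) ** 2

def grad_f(x1, x2, x3):
    # Chain rule: with v = df/dn = (<n,e3> - |n|)(e3 - n/|n|),
    # the vertex gradients are (x2-x3) x v, (x3-x1) x v, (x1-x2) x v.
    # Note the division by |n|: this blows up for degenerate triangles.
    n = normal(x1, x2, x3)
    nn = np.linalg.norm(n)
    v = (np.dot(n, E3) - nn) * (E3 - n / nn)
    return np.concatenate([np.cross(x2 - x3, v),
                           np.cross(x3 - x1, v),
                           np.cross(x1 - x2, v)])

def grad_fd(x1, x2, x3, h=1e-6):
    # Central finite differences on the stacked 9-vector, for checking.
    x = np.concatenate([x1, x2, x3]).astype(float)
    g = np.zeros(9)
    for i in range(9):
        xp, xm = x.copy(), x.copy()
        xp[i] += h
        xm[i] -= h
        g[i] = (f(xp[:3], xp[3:6], xp[6:]) - f(xm[:3], xm[3:6], xm[6:])) / (2 * h)
    return g
```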
I'm trying to give a rigorous interpretation of that gradient: if I set it to $0$ I can find the minimum (though there's an issue with the $\lVert n \rVert$ term). I've implemented this gradient in MATLAB to minimize the function, and this is what I'm getting:
The algorithm MATLAB is using is Newton's method. Not sure if that's visible, but basically it seems the triangle has been rotated in order to be parallel to the $xy$ plane.
However, if I were to use gradient descent instead of Newton's method, then, if I'm reading the gradient correctly, it seems to me the three vertices would potentially be shrunk until the triangle degenerates to a single point.
Why is Newton's method attempting to rotate the triangle instead of warping it?
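Out of curiosity, the gradient-descent scenario can be tested numerically. Below is a sketch (step size and iteration count are arbitrary choices of mine, and it uses a finite-difference gradient so it doesn't depend on any particular closed form); one can inspect the final vertices and normal to see what plain descent actually does:

```python
import numpy as np

def f(x):
    # x is the stacked 9-vector (x1, x2, x3); f as in the question.
    x1, x2, x3 = x[:3], x[3:6], x[6:]
    n = np.cross(x2 - x1, x3 - x1)
    return 0.5 * (n[2] - np.linalg.norm(n)) ** 2

def num_grad(x, h=1e-6):
    # Central-difference gradient, agnostic about the closed form.
    g = np.zeros(9)
    for i in range(9):
        xp, xm = x.copy(), x.copy()
        xp[i] += h
        xm[i] -= h
        g[i] = (f(xp) - f(xm)) / (2 * h)
    return g

def descend(x0, step=0.02, iters=300):
    # Plain fixed-step gradient descent; step is an arbitrary choice.
    x = x0.astype(float).copy()
    for _ in range(iters):
        x -= step * num_grad(x)
    return x

x0 = np.concatenate([[0, 0, 0], [1, 0, 0], [0, 1, 1]]).astype(float)
x = descend(x0)
# Inspect x.reshape(3, 3) and the resulting normal to see
# whether the triangle tilted, shrank, or both.
```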
Best Answer
I'm not sure what you mean by "warping the triangle", but...
Clearly your cost function attains its minimum value of $0$ at exactly the points $(x_1, x_2, x_3)$ for which $n$ is a nonnegative multiple of $e_3$, i.e. the $x$ and $y$ components of $n$ are zero and $n_z \geq 0$. So any such solution necessarily has $x_1, x_2, x_3$ all lying in some horizontal plane $z = C$. The Newton optimizer is simply converging to one of these solutions.
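As a quick numeric sanity check of this claim (with the caveat that the vertex ordering must make $n_z \geq 0$; the coordinates below are arbitrary examples of mine):

```python
import numpy as np

def f(x1, x2, x3):
    n = np.cross(x2 - x1, x3 - x1)
    return 0.5 * (n[2] - np.linalg.norm(n)) ** 2

# Triangle in the plane z = 2, ordered so n points along +e3: cost is 0.
flat = f(np.array([0., 0., 2.]), np.array([1., 0., 2.]), np.array([0., 1., 2.]))

# Tilt one vertex out of the plane: cost becomes positive.
tilted = f(np.array([0., 0., 2.]), np.array([1., 0., 2.]), np.array([0., 1., 2.5]))

# Same flat triangle, opposite orientation (n points along -e3):
# n_x = n_y = 0 but n_z < 0, so the cost is NOT zero.
flipped = f(np.array([0., 0., 2.]), np.array([0., 1., 2.]), np.array([1., 0., 2.]))
```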
You of course have the issues that $f$ fails to be differentiable where $n = 0$, and that degenerate triangles with $n = 0$ also attain the minimum value of $0$, but I believe that first paragraph answers your explicit question.