$f(x) = (x_1-x_2^2)(x_1-\frac{1}{2}x_2^2)$, verify $\overline{x} = (0,0)^t$ is local min of $\phi(\lambda) = f(\overline{x}+\lambda d)$ but not of $f$

derivatives, maxima-minima, multivariable-calculus, optimization

Let $f(x) = (x_1-x_2^2)(x_1-\frac{1}{2}x_2^2)$. Verify that
$\overline{x} = (0,0)^t$ is a local minimizer of $\phi(\lambda) =
f(\overline{x}+\lambda d)$ for all $d\in\mathbb{R}^2$ but
that $\overline{x}$ is not a local minimizer of $f$.

$$\frac{\partial f}{\partial x_1} = \left(x_1-\frac{1}{2}x_2^2\right)+(x_1-x_2^2) = 2x_1-\frac{3}{2}x_2^2$$

$$\frac{\partial f}{\partial x_2} = -2x_2\left(x_1-\frac{1}{2}x_2^2\right)-x_2(x_1-x_2^2) = -3x_1x_2+2x_2^3$$

$$\frac{\partial^2 f}{\partial x_1^2} = 2$$

$$\frac{\partial^2 f}{\partial x_2^2} = 6x_2^2-3x_1$$

$$\frac{\partial^2 f}{\partial x_2\partial x_1} = -3x_2$$

$$\nabla^2 f = \begin{bmatrix}
2 & -3x_2 \\
-3x_2 & 6x_2^2-3x_1
\end{bmatrix}$$

$$\nabla^2 f(0,0) = \begin{bmatrix}
2 & 0 \\
0 & 0
\end{bmatrix}$$

$$\begin{bmatrix}
a & b
\end{bmatrix}\begin{bmatrix}
2 & 0 \\
0 & 0
\end{bmatrix}\begin{bmatrix}
a \\
b
\end{bmatrix} = 2a^2 \ge 0$$

For $(0,0)$ to be a local minimizer we need $\nabla f(0,0) = 0$ (which we have) and $\nabla^2 f(0,0)\ge 0$ (which we also have, since the quadratic form above is $2a^2\ge 0$). But the Hessian is only positive semidefinite, not positive definite, so these second-order conditions are inconclusive on their own: they don't settle whether $(0,0)$ is a local minimizer of $f$.
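As a quick sanity check of these derivatives (a sketch using sympy, which isn't part of the original post):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
f = (x1 - x2**2) * (x1 - sp.Rational(1, 2) * x2**2)

grad = [sp.diff(f, v) for v in (x1, x2)]
hess = sp.hessian(f, (x1, x2))

# At the origin the gradient vanishes and the Hessian is diag(2, 0),
# which is positive semidefinite but not positive definite.
print([g.subs({x1: 0, x2: 0}) for g in grad])   # [0, 0]
print(hess.subs({x1: 0, x2: 0}))                # Matrix([[2, 0], [0, 0]])
```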

Now let's analyze $f((0,0)+\lambda d)$.

$$\nabla f(\lambda d_1,\lambda d_2) = \begin{bmatrix}
\left(\lambda d_1-\frac{1}{2}\lambda^2d_2^2\right) + (\lambda d_1 -\lambda^2d_2^2) \\
-2\lambda d_2\left(\lambda d_1-\frac{1}{2}\lambda^2 d_2^2\right)-\lambda d_2(\lambda d_1-\lambda^2d_2^2)
\end{bmatrix}\neq 0$$

and the Hessian is even messier. So I think instead I should analyze

$$\phi(\lambda) = f(\lambda d_1,\lambda d_2) = (\lambda d_1-\lambda^2 d_2^2)(\lambda d_1-\frac{1}{2}\lambda^2 d_2^2)$$

but what does it mean for $\overline{x}$ to be a local minimizer of something that depends on $\lambda$?

UPDATE:

I think I have to minimize

$$\phi(\lambda) = f(\lambda d_1,\lambda d_2) = (\lambda d_1-\lambda^2 d_2^2)\left(\lambda d_1-\frac{1}{2}\lambda^2 d_2^2\right) = \lambda^2 d_1^2 -\frac{3}{2}\lambda^3 d_1d_2^2 + \frac{1}{2}\lambda^4 d_2^4$$

$$\implies \phi'(\lambda) = 2d_1^2\lambda -\frac{9}{2}d_1d_2^2\lambda^2 + 2d_2^4\lambda^3 = \lambda\left(2d_2^4\lambda^2-\frac{9}{2}d_1d_2^2\lambda + 2d_1^2\right) = 0$$

$$\implies \lambda = 0 \quad\text{or}\quad \lambda = \frac{\frac{9}{2}d_1d_2^2\pm\sqrt{\left(\frac{9}{2}d_1d_2^2\right)^2 - 4\cdot 2d_2^4\cdot 2d_1^2}}{4d_2^4}$$

It's impractical to plug these values into the function to see which of them is smaller, and even then I don't know how to proceed. Maybe I should test the second derivative too; I don't know what it means to prove $\overline{x} = (0,0)$ is a minimizer of $\phi(\lambda) = f(\overline{x}+\lambda d)$. The only interpretation I can imagine is that $\lambda = 0$ gives the minimum of $\phi(\lambda)$.

So since we know $\lambda=0$ makes the first derivative vanish, let's check the sign of the second derivative there (if it's positive, then $\lambda=0$ is a local minimizer):

$$\phi''(\lambda) = 6d_2^4\lambda^2 -9d_1d_2^2\lambda + 2d_1^2 \implies \phi''(0) = 2d_1^2$$

So for any direction with $d_1\neq 0$, we'll have $\lambda=0$ as a local minimizer. What about when the direction is $(0,d_2)$ for any $d_2$? In this case we have $$\phi(\lambda) = \left(0-(\lambda d_2)^2\right)\left(0-\frac{1}{2}(\lambda d_2)^2\right) = \frac{1}{2}\lambda^4d_2^4$$

for which $\lambda= 0$ is also a minimizer.

So in all cases $\lambda=0$ is a local minimizer of $\phi(\lambda) = f((0,0) + \lambda (d_1,d_2))$, which I guess is exactly what it means for $\overline{x}=(0,0)$ to be a local minimizer of $\phi$ along every direction $d$?
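The same directional computation can be checked symbolically (again a sketch using sympy, not part of the original derivation):

```python
import sympy as sp

lam, d1, d2 = sp.symbols('lambda d1 d2', real=True)
phi = (lam * d1 - lam**2 * d2**2) * (lam * d1 - sp.Rational(1, 2) * lam**2 * d2**2)

# General direction: phi'(0) = 0 and phi''(0) = 2*d1^2 >= 0
print(sp.diff(phi, lam).subs(lam, 0))       # 0
print(sp.diff(phi, lam, 2).subs(lam, 0))    # 2*d1**2

# Direction with d1 = 0: phi collapses to (1/2)*lambda^4*d2^4, minimized at lambda = 0
print(sp.expand(phi.subs(d1, 0)))           # d2**4*lambda**4/2
```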

Best Answer

This problem has a nice graphical solution.

Note that $f(0,0)=0$. If both of the factors $x_1-x_2^2$ and $x_1-\frac12x_2^2$ have the same sign then $f(x)>0$. If the signs differ then $f(x)<0$. Divide the $x_2x_1$-plane into three regions:

(i) if $x_1>x_2^2$ then $x_1>\frac12x_2^2$ and $f(x)>0$. This is the area in the plane above the parabola $x_1=x_2^2$ (taking $x_1$ as the vertical axis and $x_2$ as the horizontal).

(ii) if $x_1<\frac12 x_2^2$ then $x_1<x_2^2$ and $f(x)>0$. This is the area below $x_1=\tfrac12x_2^2$.

(iii) if $x_2^2>x_1>\frac12 x_2^2$ then $f(x)<0$. This is the area above the curve $x_1=\frac12x_2^2$ but below $x_1=x_2^2$.

The curve $x_1=\frac23x_2^2$ lies in (iii). On it, the objective value is strictly negative, $$(x_1-x_2^2)\left(x_1-\frac12x_2^2\right)=-\frac13x_2^2\cdot\frac16x_2^2=-\frac{1}{18}x_2^4.$$ Hence the origin, with objective value $0$, cannot be a local minimum. Note however that you can only approach the origin from inside (iii) along a curved trajectory. If you consider a series of perturbations of the origin that all lie on a line, as in the case of $\phi(\lambda)$, then within some small ball around the origin all points of that line lie in region (i) or (ii). Graph the three regions and you will see this clearly.
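A minimal numerical illustration of this picture (a sketch using numpy; the sample points and the direction $(1,5)$ are arbitrary choices, not part of the answer):

```python
import numpy as np

def f(x1, x2):
    return (x1 - x2**2) * (x1 - 0.5 * x2**2)

# Along the curve x1 = (2/3) x2^2 inside region (iii), f is strictly negative
# arbitrarily close to the origin (it equals -x2^4 / 18 there).
x2 = np.array([1e-1, 1e-2, 1e-3])
print(f((2 / 3) * x2**2, x2))

# Along a fixed line through the origin, f stays nonnegative for small lambda,
# so the origin looks like a minimizer in every direction.
d1, d2 = 1.0, 5.0
lam = np.linspace(-1e-2, 1e-2, 5)
print(f(lam * d1, lam * d2))
```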