$f(x) = (x_1-x_2^2)(x_1-\frac{1}{2}x_2^2)$, verify $\overline{x} = (0,0)^t$ is local min of $\phi(\lambda) = f(\overline{x}+\lambda d)$ but not of $f$

derivatives, maxima-minima, multivariable-calculus, optimization

Let $f(x) = (x_1-x_2^2)(x_1-\frac{1}{2}x_2^2)$. Verify that
$\overline{x} = (0,0)^t$ is a local minimizer of $\phi(\lambda) =
f(\overline{x}+\lambda d)$ for all $d\in\mathbb{R}^2$ but
that $\overline{x}$ is not a local minimizer of $f$.

$$\frac{\partial f}{\partial x_1} = \left(x_1-\frac{1}{2}x_2^2\right)+(x_1-x_2^2) = 2x_1-\frac{3}{2}x_2^2$$

$$\frac{\partial f}{\partial x_2} = -2x_2\left(x_1-\frac{1}{2}x_2^2\right)-x_2(x_1-x_2^2) = -3x_1x_2+2x_2^3$$

$$\frac{\partial^2 f}{\partial x_1^2} = 2$$

$$\frac{\partial^2 f}{\partial x_2^2} = 6x_2^2-3x_1$$

$$\frac{\partial^2 f}{\partial x_2\partial x_1} = -3x_2$$

$$\nabla^2 f = \begin{bmatrix}
2 & -3x_2 \\
-3x_2 & 6x_2^2-3x_1
\end{bmatrix}$$

$$\nabla^2 f(0,0) = \begin{bmatrix}
2 & 0 \\
0 & 0
\end{bmatrix}$$

$$\begin{bmatrix}
a & b
\end{bmatrix}\begin{bmatrix}
2 & 0 \\
0 & 0
\end{bmatrix}\begin{bmatrix}
a \\
b
\end{bmatrix} = 2a^2 \ge 0$$

For $(0,0)$ to be a local minimizer we need $\nabla f(0,0) = 0$ (which we have) and $\nabla^2 f(0,0)\ge 0$ (which we also have, since the quadratic form above is $2a^2\ge 0$). But the Hessian is only positive semidefinite, not positive definite, so these second-order conditions are inconclusive on their own: they don't settle whether $(0,0)$ is a local minimizer of $f$.
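As a quick sanity check of these derivatives (a sketch using sympy, which isn't part of the original post):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
f = (x1 - x2**2) * (x1 - sp.Rational(1, 2) * x2**2)

grad = [sp.diff(f, v) for v in (x1, x2)]
hess = sp.hessian(f, (x1, x2))

# At the origin the gradient vanishes and the Hessian is diag(2, 0),
# which is positive semidefinite but not positive definite.
print([g.subs({x1: 0, x2: 0}) for g in grad])   # [0, 0]
print(hess.subs({x1: 0, x2: 0}))                # Matrix([[2, 0], [0, 0]])
```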

Now let's analyze $f((0,0)+\lambda d)$.

$$\nabla f(\lambda d_1,\lambda d_2) = \begin{bmatrix}
\left(\lambda d_1-\frac{1}{2}\lambda^2d_2^2\right) + (\lambda d_1 -\lambda^2d_2^2) \\
-2\lambda d_2\left(\lambda d_1-\frac{1}{2}\lambda^2 d_2^2\right)-\lambda d_2(\lambda d_1-\lambda^2d_2^2)
\end{bmatrix}\neq 0$$

and the Hessian is even messier. So I think instead I should analyze

$$\phi(\lambda) = f(\lambda d_1,\lambda d_2) = (\lambda d_1-\lambda^2 d_2^2)(\lambda d_1-\frac{1}{2}\lambda^2 d_2^2)$$

but what does it mean for $\overline{x}$ to be a local minimizer of something that depends on $\lambda$?

UPDATE:

I think I have to minimize

$$\phi(\lambda) = f(\lambda d_1,\lambda d_2) = (\lambda d_1-\lambda^2 d_2^2)\left(\lambda d_1-\frac{1}{2}\lambda^2 d_2^2\right) = \lambda^2 d_1^2 -\frac{3}{2}\lambda^3 d_1d_2^2 + \frac{1}{2}\lambda^4 d_2^4$$

$$\implies \phi'(\lambda) = 2d_1^2\lambda -\frac{9}{2}d_1d_2^2\lambda^2 + 2d_2^4\lambda^3 = \lambda\left(2d_2^4\lambda^2-\frac{9}{2}d_1d_2^2\lambda + 2d_1^2\right) = 0$$

$$\implies \lambda = 0 \quad\text{or}\quad \lambda = \frac{\frac{9}{2}d_1d_2^2\pm\sqrt{\left(\frac{9}{2}d_1d_2^2\right)^2 - 4\cdot 2d_2^4\cdot 2d_1^2}}{4d_2^4}$$

It's impractical to plug these values into the function to see which of them is smaller, and even then I don't know how to proceed. Maybe I should test the second derivative too; I don't know what it means to prove $\overline{x} = (0,0)$ is a minimizer of $\phi(\lambda) = f(\overline{x}+\lambda d)$. The only interpretation I can imagine is that $\lambda = 0$ gives the minimum of $\phi(\lambda)$.

So since we know $\lambda=0$ makes the first derivative vanish, let's check the sign of the second derivative there (if it's positive, then $\lambda=0$ is a local minimizer):

$$\phi''(\lambda) = 6d_2^4\lambda^2 -9d_1d_2^2\lambda + 2d_1^2 \implies \phi''(0) = 2d_1^2$$

So for any direction with $d_1\neq 0$, we'll have $\lambda=0$ as a local minimizer. What about when the direction is $(0,d_2)$ for any $d_2$? In this case we have $$\phi(\lambda) = \left(0-(\lambda d_2)^2\right)\left(0-\frac{1}{2}(\lambda d_2)^2\right) = \frac{1}{2}\lambda^4d_2^4$$

for which $\lambda= 0$ is also a minimizer.

So in all cases $\lambda=0$ is a local minimizer of $\phi(\lambda) = f((0,0) + \lambda (d_1,d_2))$, which I guess is exactly what it means for $\overline{x}=(0,0)$ to be a local minimizer of $\phi$ along every direction $d$?
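The same directional computation can be checked symbolically (again a sketch using sympy, not part of the original derivation):

```python
import sympy as sp

lam, d1, d2 = sp.symbols('lambda d1 d2', real=True)
phi = (lam * d1 - lam**2 * d2**2) * (lam * d1 - sp.Rational(1, 2) * lam**2 * d2**2)

# General direction: phi'(0) = 0 and phi''(0) = 2*d1^2 >= 0
print(sp.diff(phi, lam).subs(lam, 0))       # 0
print(sp.diff(phi, lam, 2).subs(lam, 0))    # 2*d1**2

# Direction with d1 = 0: phi collapses to (1/2)*lambda^4*d2^4, minimized at lambda = 0
print(sp.expand(phi.subs(d1, 0)))           # d2**4*lambda**4/2
```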

Best Answer

This problem has a nice graphical solution.

Note that $f(0,0)=0$. If both of the factors $x_1-x_2^2$ and $x_1-\frac12x_2^2$ have the same sign then $f(x)>0$. If the signs differ then $f(x)<0$. Divide the $x_2x_1$-plane into three regions:

(i) if $x_1>x_2^2$ then $x_1>\frac12x_2^2$ and $f(x)>0$. This is the area in the plane above the parabola $x_1=x_2^2$ (taking $x_1$ as the vertical axis and $x_2$ as the horizontal).

(ii) if $x_1<\frac12 x_2^2$ then $x_1<x_2^2$ and $f(x)>0$. This is the area below $x_1=\tfrac12x_2^2$.

(iii) if $x_2^2>x_1>\frac12 x_2^2$ then $f(x)<0$. This is the area above the curve $x_1=\frac12x_2^2$ but below $x_1=x_2^2$.

The curve $x_1=\frac23x_2^2$ lies in (iii). On it, the objective value is strictly negative, $$(x_1-x_2^2)\left(x_1-\frac12x_2^2\right)=-\frac13x_2^2\cdot\frac16x_2^2=-\frac{1}{18}x_2^4.$$ Hence the origin, with objective value $0$, cannot be a local minimum. Note however that you can only approach the origin from inside (iii) along a curved trajectory. If you consider a series of perturbations of the origin that all lie on a line, as in the case of $\phi(\lambda)$, then within some small ball around the origin all points of that line lie in region (i) or (ii). Graph the three regions and you will see this clearly.
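A minimal numerical illustration of this picture (a sketch using numpy; the sample points and the direction $(1,5)$ are arbitrary choices, not part of the answer):

```python
import numpy as np

def f(x1, x2):
    return (x1 - x2**2) * (x1 - 0.5 * x2**2)

# Along the curve x1 = (2/3) x2^2 inside region (iii), f is strictly negative
# arbitrarily close to the origin (it equals -x2^4 / 18 there).
x2 = np.array([1e-1, 1e-2, 1e-3])
print(f((2 / 3) * x2**2, x2))

# Along a fixed line through the origin, f stays nonnegative for small lambda,
# so the origin looks like a minimizer in every direction.
d1, d2 = 1.0, 5.0
lam = np.linspace(-1e-2, 1e-2, 5)
print(f(lam * d1, lam * d2))
```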