Quartic function in four variables

nonlinear-optimization, optimization

$$\begin{array}{ll} \text{minimize} & x_1^2+ y_1^4+x_2^4+y_2^2+ 8x_1x_2+8y_1y_2\\ \text{subject to} & x_1+y_1=1\\ & x_2+y_2=1\end{array}$$

Is the objective function convex or strictly convex?

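For critical points (ignoring the constraints for the moment):

I computed $f_{x_1}= 2x_1+8x_2=0$; $f_{x_2}=4x_2^3+8x_1=0$; $f_{y_1}=4y_1^3+8y_2=0$; $f_{y_2}=2y_2+8y_1=0$

The first equation gives $x_1=-4x_2$, and substituting into the second gives $4x_2^3-32x_2=0$, so $x_2 \in \{0, \pm 2\sqrt{2}\}$ with $x_1 \in \{0, \mp 8\sqrt{2}\}$. The same values arise for $y_1$ (in the role of $x_2$) and $y_2$ (in the role of $x_1$). Could anyone tell me what to do next?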

The Lagrangian corresponding to the problem is:

$L(x_1,y_1,x_2,y_2, \lambda, \mu)= x_1^2+ y_1^4+x_2^4+y_2^2+ 8x_1x_2+8y_1y_2 + \lambda (x_1+y_1-1)+ \mu(x_2+y_2-1)$

$L_{x_1}= 2x_1+8x_2+\lambda=0$;

$L_{x_2}=4x_2^3+8x_1+\mu=0$;

$L_{y_1}=4y_1^3+8y_2+\lambda=0$;

$L_{y_2}=2y_2+8y_1+\mu=0$

$L_{\lambda}=x_1+y_1-1=0$

$L_{\mu}=x_2+y_2-1=0$
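One way to sanity-check any candidate solution of this system is to plug it back into the six equations. The following is only a minimal sketch in plain Python: the helper name `lagrange_check` is hypothetical, and the candidate point is an assumption taken from the rounded minimizer reported in the answer below, so the residuals are expected to be small but not exactly zero.

```python
def lagrange_check(x1, y1, x2, y2):
    """Recover lambda and mu from L_{x1} = 0 and L_{x2} = 0, then
    return them with the residuals of the four remaining equations."""
    lam = -(2 * x1 + 8 * x2)        # solve L_{x1} = 0 for lambda
    mu = -(4 * x2**3 + 8 * x1)      # solve L_{x2} = 0 for mu
    residuals = {
        "L_y1": 4 * y1**3 + 8 * y2 + lam,
        "L_y2": 2 * y2 + 8 * y1 + mu,
        "L_lambda": x1 + y1 - 1,
        "L_mu": x2 + y2 - 1,
    }
    return lam, mu, residuals


# Assumed candidate (x1, y1, x2, y2), rounded to three decimals.
candidate = (3.056, -2.056, -2.056, 3.056)
lam, mu, residuals = lagrange_check(*candidate)
print(f"lambda = {lam:.3f}, mu = {mu:.3f}")
for name, value in residuals.items():
    print(f"{name}: {value:+.3f}")
```

Residuals that are all close to zero indicate that the candidate (together with the recovered multipliers) satisfies the Lagrange conditions up to rounding; a large residual means the point is not a stationary point of $L$.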

Thanks!

Best Answer

Why not convert your equality-constrained problem into an unconstrained problem? The constraints are \begin{align} x_1 + y_1 = 1 \tag{C1} \label{C1} \\ x_2 + y_2 = 1 \tag{C2} \label{C2} \end{align} If we set $x_1 = z_1$, then to satisfy \ref{C1} we must set $y_1 = 1 - z_1$. Similarly, if we set $x_2 = z_2$ we must set $y_2 = 1-z_2$ to satisfy \ref{C2}.

Substituting for $x_1, x_2, y_1, y_2$ yields the unconstrained problem $$ \arg \min g(z_1, z_2) $$ where \begin{equation} g(z_1, z_2) = z_1^2 + (1-z_1)^4 + z_2^4 + (1-z_2)^2 + 8 z_1 z_2 + 8(1-z_1)(1-z_2). \end{equation}

We evaluate the gradient of $g$ and find the critical points of $g$ as the solutions of the nonlinear system \begin{align} \frac{\partial g}{\partial z_1} &= 2 z_1 - 4(1-z_1)^3 + 8 z_2 - 8(1-z_2) = 0 \\ \frac{\partial g}{\partial z_2} &= 4z_2^3 -2 (1-z_2) + 8z_1 -8(1-z_1) = 0 \end{align}

Because I'm lazy I solve this in Wolfram Alpha. It has the real solutions $$(z_1, z_2) \in \{(-0.601, 1.601), (0.544, 0.456), (3.056, -2.056)\}.$$

Evaluating $g$ as defined above at these points gives \begin{align} &g(-0.601, 1.601) \approx -1.53 \\ &g(0.544, 0.456) \approx 4.65 \\ &g(3.056, -2.056) \approx -46.11 \end{align}

To classify these critical points we can compute the Hessian and check whether it is positive definite. I instead use Wolfram again and visually inspect that $(3.056, -2.056)$ is a local minimum. Since $g$ is coercive, a global minimum exists and must be attained at a critical point; comparing the three values, it is attained at $(z_1, z_2) \approx (3.056, -2.056)$. In the original variables this is $x_1 \approx 3.056$, $y_1 \approx -2.056$, $x_2 \approx -2.056$, $y_2 \approx 3.056$.
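For readers who would rather not rely on visual inspection, the steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not what Wolfram Alpha does internally: it refines the three rounded solutions with a hand-written two-variable Newton iteration on the gradient, then classifies each point by the sign of the Hessian determinant and of its top-left entry. The second derivatives used are $g_{z_1z_1} = 2 + 12(1-z_1)^2$, $g_{z_1z_2} = 16$ and $g_{z_2z_2} = 2 + 12z_2^2$; the function names are hypothetical.

```python
def g(z1, z2):
    return (z1**2 + (1 - z1)**4 + z2**4 + (1 - z2)**2
            + 8 * z1 * z2 + 8 * (1 - z1) * (1 - z2))


def grad(z1, z2):
    g1 = 2 * z1 - 4 * (1 - z1)**3 + 8 * z2 - 8 * (1 - z2)
    g2 = 4 * z2**3 - 2 * (1 - z2) + 8 * z1 - 8 * (1 - z1)
    return g1, g2


def hessian(z1, z2):
    # Entries (h11, h12, h22) of the symmetric 2x2 Hessian of g.
    return 2 + 12 * (1 - z1)**2, 16.0, 2 + 12 * z2**2


def newton(z1, z2, tol=1e-12, max_iter=50):
    """Refine a starting guess for a zero of the gradient."""
    for _ in range(max_iter):
        g1, g2 = grad(z1, z2)
        if abs(g1) + abs(g2) < tol:
            break
        h11, h12, h22 = hessian(z1, z2)
        det = h11 * h22 - h12 * h12
        # Solve H d = -grad with Cramer's rule.
        d1 = (-g1 * h22 + g2 * h12) / det
        d2 = (-g2 * h11 + g1 * h12) / det
        z1, z2 = z1 + d1, z2 + d2
    return z1, z2


def classify(z1, z2):
    """Second-derivative test for a function of two variables."""
    h11, h12, h22 = hessian(z1, z2)
    det = h11 * h22 - h12 * h12
    if det > 0:
        return "local minimum" if h11 > 0 else "local maximum"
    if det < 0:
        return "saddle point"
    return "inconclusive"


# Starting guesses: the rounded real solutions quoted above.
starts = [(-0.601, 1.601), (0.544, 0.456), (3.056, -2.056)]
results = []
for start in starts:
    z1, z2 = newton(*start)
    results.append((z1, z2, g(z1, z2), classify(z1, z2)))

for z1, z2, value, kind in results:
    print(f"z = ({z1:.4f}, {z2:.4f})  g = {value:.4f}  {kind}")
```

Each printed line gives the refined coordinates, the value of $g$ there, and the classification, so the figures quoted in this answer can be cross-checked independently. Newton's method only converges locally, which is why the sketch starts from the already known approximate solutions rather than from arbitrary points.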