Matrix and the method of Lagrange multipliers

lagrange multiplierlinear algebramultivariable-calculusquadratic programming

In a book on linear algebra, I saw a brief description of the Lagrange multiplier.

The $x$ that minimizes $P(x)=\frac{1}{2}x^TAx-x^Tb$ is the solution of $Ax=b$. If $P(x)=x_1^2-x_1x_2+x_2^2-b_1x_1-b_2x_2$, then $\partial P/\partial x_1=0,\ \partial P/\partial x_2=0$ means the matrix equation $Ax=b$ where $$A=\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$$ Now with the new constraint $Cx=d$. Using the new unknown $y_i$, we can make an expression so that $\partial \mathcal{L}/\partial y=0$ becomes $Cx=d$:
$$\mathcal{L}(x,\ y)=\frac{1}{2}x^TAx-x^Tb+x^TC^Ty-y^Td$$
Then the minimum value of $P$ is
$$P_{\min}=-\frac{1}{2}b^TA^{-1}b+\frac{1}{2}y^T(CA^{-1}b-d)$$


I don't understand this part properly. My question is as follows.

  • When $P$ is different from the above, if $P$ satisfies $\partial P/\partial x_i=0$, can I always write $P(x)=\frac{1}{2} x^TAx-x^Tb$? Is this related to $A$ being a second difference matrix?
  • How do I derive the second term of $P_{\min}$? Can it be induced to $Ax+C^Ty=b$? I'm not familiar with matrix differentiation.

Best Answer

$P$ is bounded below iff $A$ is semi definite. In this case, since the problem is convex, the solution is any $x$ that satisfies ${\partial P(x) \over \partial x} = (Ax-b)^T = 0$. If $A>0$ then the solution is unique and given by $x = A^{-1} b$ and the $\min$ value of $P$ given by $-{1 \over 2} b^T A^{-1} b $.

One way of dealing with the $Cx=d$ constraint is to consider the Lagrangian $L(x,\lambda) = P(x) + \lambda^T (Cx-d) = {1 \over 2} x^T Ax -(b-C^T \lambda)^T x - \lambda^T d $.

Note that the map $x \mapsto L(x,\lambda)$ still has the same quadratic term as the original problem ($\lambda = 0$) and hence the minimising $x$ solves $Ax = b-C^T \lambda$. Substituting shows that the $\min$ value is $-{1 \over 2} (b-C^T\lambda)^T A^{-1} (b-C^T\lambda) - \lambda^T d $.

Unwinding this (assuming I made no mistakes, which is unlikely) gives: $-{1 \over 2} b^T A^{-1} b -{1 \over 2} \lambda^T C A^{-1} C^T \lambda +\lambda^T (C A^{-1}b -d)$