How does the method of Lagrange multipliers fail (in classical field theories with local constraints)

examples-counterexamples, functional-analysis, multivariable-calculus, optimization

The method of Lagrange multipliers is used to find the extrema of $f(x)$ subject to the constraints $\vec g(x)=0$, where $x=(x_1,\dots,x_n)$ and $\vec g=(g_1,\dots,g_m)$ for $m \leq n$.

Although many textbooks derive the final equations by arguing that at an extremum the gradient of $f(x)$ must be orthogonal to the constraint surface $\vec g(x)=0$, the "simpler" approach (and the one commonly seen in field theory / when optimizing functionals) is to construct the Lagrange function
$$ L(x,\lambda) = f(x) + \vec\lambda\cdot\vec g(x) $$
and to vary it with respect to $x$ and $\vec\lambda$, which yields the vector equations
$$
\begin{align}
&x:& 0 &= \nabla f(x) + \sum_i \lambda_i \nabla g_i(x) \,, \\
&\vec \lambda:& 0 &= \vec g(x) \ .
\end{align}
$$
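For concreteness, here is a minimal sympy sketch of these stationarity equations on a made-up regular example, minimizing $f = x^2 + y^2$ subject to $g = x + y - 1 = 0$ (the objective and constraint are illustrative choices, not taken from the question):

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = x**2 + y**2          # illustrative objective
g = x + y - 1            # single constraint, m = 1

# Lagrange function L = f + lam*g; require stationarity in all variables
L = f + lam * g
eqs = [sp.diff(L, v) for v in (x, y, lam)]
print(sp.solve(eqs, [x, y, lam], dict=True))
# [{lam: -1, x: 1/2, y: 1/2}] -- the constrained minimum at (1/2, 1/2)
```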

The method only works if the extremal point is a regular point of the constraint surface, i.e. if $\operatorname{rank}(\nabla\vec g) = m$ there.

What is the best way of understanding what goes wrong when the extremum is not a regular point of the constraint?

And, most importantly to me, how does this generalize to field theories (i.e. optimizing functionals) with local constraints? What is the equivalent regularity condition for constraints in field theory?

Instructive examples are more than welcome.

Best Answer

Generically, the $m$ equations $g_i(x)=0$ define a manifold $S$ of dimension $d:=n-m$. At each point $p\in S$ the $m$ gradients $\nabla g_i(p)$ are orthogonal to the tangent space $S_p$ of $S$ at $p$. The condition $\operatorname{rank}(\nabla\vec g(p))=m$ means that these $m$ gradients are linearly independent, so that they span the full orthogonal complement $S_p^\perp$, which has dimension $m=n-d$. At a conditionally stationary point $p$ of $f$ the gradient $\nabla f(p)$ lies in $S_p^\perp$, and if the rank condition is fulfilled, there will be constants $\lambda_i$ such that $\nabla f(p)=\sum_{i=1}^m \lambda_i\nabla g_i(p)$. In this case the given "recipe" will find the point $p$.
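To see this concretely in a regular case, here is a small numerical sketch (the example is mine, not the answerer's): $f=x$ restricted to the circle $\{x^2+y^2+z^2=1,\ z=0\}$ is conditionally stationary at $p=(1,0,0)$; the rank condition holds there, and the multipliers follow from a linear solve:

```python
import numpy as np

# Assumed example: f = x on the circle {x^2 + y^2 + z^2 = 1, z = 0},
# conditionally stationary at p = (1, 0, 0).
p = np.array([1.0, 0.0, 0.0])
grad_f = np.array([1.0, 0.0, 0.0])   # grad f = (1, 0, 0)
J = np.array([2 * p,                 # grad g1 = (2x, 2y, 2z) at p
              [0.0, 0.0, 1.0]])      # grad g2 = (0, 0, 1)

print(np.linalg.matrix_rank(J))      # 2 = m, so p is a regular point

# The rows of J span S_p^perp, so 0 = grad f(p) + J^T lam is solvable:
lam, *_ = np.linalg.lstsq(J.T, -grad_f, rcond=None)
print(lam)                           # [-0.5  0. ], with zero residual
```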

Consider now the following example where the rank condition is violated: the two constraints $$g_1(x,y,z):=x^6-z=0,\qquad g_2(x,y,z):=y^3-z=0$$ define a curve $S\subset{\mathbb R}^3$ with the parametric representation $$S: \quad x\mapsto (x,x^2,x^6)\qquad (-\infty < x <\infty).$$ The function $f(x,y,z):=y$ assumes its minimum on $S$ at the origin $o$. But if we compute the gradients $$\nabla f(o)=(0,1,0), \qquad \nabla g_1(o)=\nabla g_2(o)=(0,0,-1),$$ we see that $\operatorname{rank}(\nabla\vec g(o))=1<2=m$, and $\nabla f(o)$ is not a linear combination of the $\nabla g_i(o)$. As a consequence, Lagrange's method will not bring this conditionally stationary point to the fore.
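A quick symbolic check of this counterexample (a sympy sketch, not part of the original answer) confirms both the rank failure at $o$ and that the full Lagrange system is inconsistent, so the method never produces the minimum:

```python
import sympy as sp

x, y, z, l1, l2 = sp.symbols('x y z l1 l2', real=True)
f = y
g1 = x**6 - z
g2 = y**3 - z

# Constraint Jacobian at the origin has rank 1 < m = 2: o is not regular.
J = sp.Matrix([[sp.diff(g, v) for v in (x, y, z)] for g in (g1, g2)])
print(J.subs({x: 0, y: 0, z: 0}).rank())   # 1

# Stationarity of L = f + l1*g1 + l2*g2 in all five variables:
L = f + l1 * g1 + l2 * g2
eqs = [sp.diff(L, v) for v in (x, y, z, l1, l2)]
print(sp.solve(eqs, [x, y, z, l1, l2], dict=True))
# [] -- the system has no solution at all, so the minimum at o is missed
```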