Eigenvalues – Understanding Stationary Properties of Eigenvectors

eigenvalues-eigenvectorslagrange multiplierlinear algebramatricesstationary point

Hi I am having a bit of trouble understand this line of text regarding the stationary points of eigenvectors.

The text

Given the quadratic form:
$$\langle \ \mathbf{x} \ | \ A\mathbf{x} \ \rangle = Q(\mathbf{x}) $$

To find values of x such that Q(x) is a maximum or minimum:

(i) Q is stationarly for small variations $\Delta \mathbf{x}$

(ii) $\langle \ \mathbf{x} \ | \ \mathbf{x}\ \rangle = 1$

Using lagrange multipliers, we obtain:

(1) $\Delta[x^TAx-\lambda(x^Tx-1)]$

Then, with the fact that $(\Delta x)^T Ax=x^TA\Delta x$ (2) (A is symmetric for quadatric forms)

we obtain (3) $Ax = \lambda x$

My issues:

1.) I find these really confusing, isn't the point of the Lagrange multiplier to find stationary points for the function $f(x) = \lambda g(x)$ by using:

$$\nabla f(x) = \lambda \nabla g(x)$$

How is this equation linked to (1)?

2.) How do we combine (1) and (2) to obtain (3)? There is no $\Delta x$ in equation (1), so how is (2) relevant?

Thanks in advance!

Best Answer

You have $Q(x) = \langle x,Ax \rangle$ and the function $g(x) = \langle x,x \rangle -1$, and you want to find extrema of $Q$ when restricted to the level set $g^{-1}(\{0\})$. First, of all, you can express the standard real-inner product using transpose as well: \begin{equation} Q(x) = x^t A x \qquad \text{and} \qquad g(x) = x^tx - 1 \end{equation} Now, you want to set $\nabla (Q - \lambda g)(x) = 0$, and find the points $x$ which satisfy this equation. (your equation (1) left out the "$=0$", and uses $\Delta$ rather than $\nabla$)

As a personal preference, I'd much rather work with the notation of inner products rather than transpose. So, the gradient of $Q$ at $x$ is the following $1 \times n$ matrix: \begin{equation} \nabla Q(x) = 2 \left(\langle Ax,e_1\rangle, \dots, \langle Ax,e_n \rangle \right) \end{equation} You can derive this either by calculating each $\dfrac{\partial Q}{\partial x_i}(x)$ by writing out $Q(x)$ entirely in terms of components, or in any other way you like, but in any case, you will have to use the fact that $A^t = A$ somewhere. Also, we have that \begin{equation} \nabla g(x) = 2 \left(\langle x,e_1\rangle, \dots, \langle x,e_n \rangle \right) \end{equation} This can be simplified to $2(x_1, \dots, x_n)$, but leaving it like this makes the next step clearer (step (3)).

The lagrange multipliers method now says to set $\nabla Q(x) - \lambda \nabla g(x) = 0$. If you do this, you will notice that for every $1 \leq i \leq n$, we have \begin{equation} \langle Ax - \lambda x, e_i \rangle = 0 \end{equation} This says that for every $i$, the $i^{th}$ component of $Ax - \lambda x$ is $0$. Thus, it follows that \begin{equation} Ax - \lambda x = 0. \end{equation} Now, since $x$ lies in the level set $g^{-1}(\{ 0\}),$ it follows that $\lVert x \rVert = 1$, so that in particular, $x \neq 0$. This tells you $x$ is in fact an eigenvector of $A$.

So, as a recap, the maxima and minima of $Q$ on the level set of $g$ are attained when you evaluate $Q$ on an eigenvector of $A$. You can in fact show that if $Q(\xi) = \lambda \xi$ is the maximum value of $Q$ on the level set, then $\lambda$ is the largest eigenvalue of $A$, and likewise for the minimum.

Related Question