[Math] Relationship between matrix 2-norm and orthogonal basis of eigenvectors

Tags: eigenvalues-eigenvectors, inner-products, linear-algebra, normed-spaces

Given the following matrix: $$ A = \left(
\begin{array}{cc}
3 & 4 \\
0 & 5 \\
\end{array}
\right)$$

calculate $\|A\|_2$, with $\|A\|_2 = \max_{x \in \mathbb{R}^2 \setminus \{0\}} \frac{\langle Ax,Ax \rangle}{\langle x,x\rangle}$.

Hint: Calculate an orthonormal basis consisting of eigenvectors of $A^TA$ and express $x \in \mathbb{R}^2$ in terms of that basis.

Can anybody please tell me how an orthonormal basis of eigenvectors relates to the $x$ for which $\frac{\langle Ax,Ax \rangle}{\langle x,x\rangle}$ is largest? How does expressing $x$ in terms of that orthonormal basis help me find the desired $x$?

Best Answer

First off, I believe you are missing a square root in the definition, since it is the $2$-norm.

The reason it is asking you to find such a basis is to make the norm easier to compute, since $\langle Ax,Ax\rangle = \langle x, A^T A x \rangle$ and $A^TA$ is symmetric, so an orthonormal basis of its eigenvectors exists. Suppose $\{e_1,e_2\}$ is the ONB of eigenvectors of $A^T A$ with corresponding eigenvalues $\lambda_1$ and $\lambda_2$, and write $x = a e_1 + b e_2$ in this basis. Then

$$ \begin{align*} \dfrac{\langle Ax,Ax\rangle}{\langle x,x\rangle} &= \dfrac{\langle x,A^T A x\rangle}{\langle x,x\rangle} = \dfrac{\langle a e_1 + b e_2, A^TA(a e_1 + b e_2)\rangle}{\langle a e_1 + b e_2,a e_1 + b e_2 \rangle}\\ &= \dfrac{\langle a e_1 + b e_2, \lambda_1 a e_1 + \lambda_2 b e_2\rangle}{\langle a e_1 + b e_2,a e_1 + b e_2 \rangle}\\ &= \dfrac{\lambda_1 a^2 + \lambda_2 b^2}{a^2 + b^2} \end{align*} $$

and so you need to maximize this over $a$ and $b$, which is easier than just writing $$x = \left(\begin{array}{c} x_1 \\ x_2 \end{array}\right)$$ and computing directly, where the numerator turns into a huge mess.
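For the concrete matrix in the question, the result can be checked numerically. Here is a minimal Python sketch (the helper names are my own) that forms $A^TA$, extracts its eigenvalues from the characteristic polynomial $\lambda^2 - \operatorname{tr}(B)\lambda + \det(B) = 0$, and takes the square root of the largest:

```python
import math

# The matrix from the question.
A = [[3.0, 4.0],
     [0.0, 5.0]]

def gram(A):
    """Compute B = A^T A for a 2x2 matrix A."""
    return [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigvals_sym_2x2(B):
    """Eigenvalues of a symmetric 2x2 matrix via the quadratic formula
    applied to lambda^2 - tr(B) * lambda + det(B) = 0."""
    tr = B[0][0] + B[1][1]
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0

B = gram(A)                          # B = [[9, 12], [12, 41]]
lam1, lam2 = eigvals_sym_2x2(B)      # 45.0 and 5.0
norm2 = math.sqrt(max(lam1, lam2))   # ||A||_2 = sqrt(45) = 3*sqrt(5)
print(lam1, lam2, norm2)
```

So for this $A$ the eigenvalues of $A^TA$ are $45$ and $5$, giving $\|A\|_2 = \sqrt{45} = 3\sqrt{5}$.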

Now, note that the same argument works for larger square matrices: with $x = a_1 e_1 + a_2 e_2 + \cdots + a_n e_n$, the computation carries through similarly to give $$ \dfrac{\lambda_1 a_1^2 + \lambda_2 a_2^2 + \cdots + \lambda_n a_n^2}{a_1^2 + a_2^2 + \cdots + a_n^2}. \quad (**) $$ Also, notice that this did not rely on anything about $A$ except that it is square, so it holds for all square matrices $A$. Finally, the maximum of $(**)$ is attained by taking the coefficient of the largest eigenvalue to be one and the rest zero. For instance, if $\lambda_1 > \lambda_2$, then taking $a = 1$ and $b = 0$ gives the maximum value $\lambda_1$. So, as said in the comments, $\|A\|_2$ is the square root of the largest eigenvalue of $A^TA$, and (in the case $\lambda_1 > \lambda_2$) the eigenvector corresponding to that eigenvalue is the maximizing $x$.
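The maximization claim can also be sanity-checked directly. For this $A$, the eigenvector of $A^TA = \left(\begin{smallmatrix} 9 & 12 \\ 12 & 41 \end{smallmatrix}\right)$ for the eigenvalue $45$ is proportional to $(1, 3)$ (from $(9-45)x_1 + 12x_2 = 0$), so the quotient should equal $45$ there and stay below it in every other direction. A small sketch, with hypothetical helper names:

```python
import random

# For A = [[3, 4], [0, 5]], the largest eigenvalue of A^T A is 45,
# with eigenvector proportional to (1, 3).

def rayleigh(x):
    """The quotient <Ax, Ax> / <x, x> for the matrix A above."""
    ax0, ax1 = 3 * x[0] + 4 * x[1], 5 * x[1]   # Ax, row by row
    return (ax0 ** 2 + ax1 ** 2) / (x[0] ** 2 + x[1] ** 2)

peak = rayleigh((1.0, 3.0))   # the maximizing direction attains 45
print(peak)

# Random directions never exceed the largest eigenvalue.
random.seed(0)
worst = max(rayleigh((random.uniform(-1, 1), random.uniform(-1, 1)))
            for _ in range(1000))
print(worst <= 45.0)
```

Note that the quotient is scale-invariant ($x$ and $cx$ give the same value), so only the direction of $x$ matters, which is why checking unit-square samples suffices.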