[Math] $L^2$ norm of a matrix: Is this statement true

linear-algebra, matrices, matrix-norms, normed-spaces, spectral-norm

I am working through Nocedal and Wright's Numerical Optimization for self-study. The appendix of the book defines several matrix norms.

They define the $\ell_2$ norm of the matrix $A$ as the largest eigenvalue of $(A^TA)^{1/2}$.

But I have also seen the following definition:
$\lVert A \rVert_2 = \max_{1 \le i \le n} \sqrt{\lambda_i}$, where $\lambda_i$ is the $i$-th eigenvalue of the matrix $A^TA$.

(source: http://www.maths.lth.se/na/courses/FMN081/FMN081-06/lecture6.pdf)

I am not sure how these two definitions are equivalent. $A^TA$ is a symmetric positive semidefinite matrix, hence its eigenvalues are nonnegative. Let $\lambda_{\max}$ be its largest eigenvalue. $A^TA$ has a unique positive semidefinite square root, whose eigenvalues are $\sqrt{\lambda_i}$, so if only this square root is considered, the book's definition agrees with the second one. But $A^TA$ can have other square roots as well, whose largest eigenvalues are different. And if $A^TA$ has repeated eigenvalues, it has infinitely many square roots. Hence I think there is an ambiguity in the book's definition. Am I missing something here? How can the book's definition be correct?

Best Answer

To avoid any ambiguity in the definition of the square root of a matrix, it is best to start from the $\ell^2$ norm of a matrix defined as the induced norm (operator norm) coming from the $\ell^2$ norms on the underlying vector spaces. In your case, $A\in \mathbb{R}^{m\times n}$. By the definition of the operator norm,

$$ \lVert A \rVert_2 = \lVert A \rVert_{\ell^2(\mathbb{R}^n) \to \ell^2(\mathbb{R}^m)} = \sup_{x\in \mathbb{R}^n \setminus \{0\}} \frac{ \lVert A x \rVert_{\ell^2(\mathbb{R}^m)}}{\lVert x \rVert_{\ell^2(\mathbb{R}^n)}} $$

Squaring and writing the norms in terms of the $\ell^2$ inner product, one arrives at the Rayleigh quotient of $A^T A$:

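$$ \lVert A \rVert_2^2 = \sup_{x\in \mathbb{R}^n \setminus \{0\}} \frac{ \lVert A x \rVert_{\ell^2(\mathbb{R}^m)}^2}{\lVert x \rVert_{\ell^2(\mathbb{R}^n)}^2} = \sup_{x \in \mathbb{R}^n \setminus \{0\}} \frac{ \langle x, A^T A x\rangle_{\ell^2(\mathbb{R}^n)}}{\langle x , x\rangle_{\ell^2(\mathbb{R}^n)}} = \lambda_{\max}(A^T A) . $$

Taking square roots gives $\lVert A \rVert_2 = \sqrt{\lambda_{\max}(A^T A)}$, which is the second definition. It agrees with the book's definition when $(A^T A)^{1/2}$ denotes the unique symmetric positive semidefinite square root, whose eigenvalues are $\sqrt{\lambda_i}$.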
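As a numerical sanity check, here is a minimal NumPy sketch (not part of the original answer). The matrix `A` is an arbitrary example chosen so that $A^TA$ has eigenvalues 3 and 7, and all variable names are illustrative. It compares the operator norm with $\sqrt{\lambda_{\max}(A^TA)}$ and with the largest eigenvalue of the positive semidefinite square root. It then builds a different square root of $A^TA$, whose largest eigenvalue is not the norm.

```python
import numpy as np

# Hypothetical example matrix (m = 3, n = 2); A^T A = [[5, 2], [2, 5]].
A = np.array([[2.0, 0.0],
              [1.0, 2.0],
              [0.0, 1.0]])

# Operator 2-norm as computed by NumPy (largest singular value of A).
norm_builtin = np.linalg.norm(A, 2)

# Eigen-decomposition of the symmetric matrix G = A^T A.
G = A.T @ A
lam, V = np.linalg.eigh(G)      # eigenvalues in ascending order: 3, 7
lam = np.clip(lam, 0.0, None)   # guard against tiny negative round-off
norm_from_eigs = np.sqrt(lam.max())

# The Rayleigh quotient is maximized at the top eigenvector of G.
v = V[:, -1]
ratio_at_v = np.linalg.norm(A @ v) / np.linalg.norm(v)

# Principal (positive semidefinite) square root of G.
S_psd = V @ np.diag(np.sqrt(lam)) @ V.T
norm_from_root = np.linalg.eigvalsh(S_psd).max()

# Another square root of G: flip the sign of the largest root.
signs = np.ones_like(lam)
signs[-1] = -1.0
S_other = V @ np.diag(signs * np.sqrt(lam)) @ V.T
largest_eig_other = np.linalg.eigvalsh(S_other).max()

print(f"{norm_builtin:.4f}")       # 2.6458, i.e. sqrt(7)
print(f"{norm_from_eigs:.4f}")     # 2.6458
print(f"{ratio_at_v:.4f}")         # 2.6458
print(f"{norm_from_root:.4f}")     # 2.6458
print(np.allclose(S_other @ S_other, G))  # True: S_other is also a square root
print(f"{largest_eig_other:.4f}")  # 1.7321, i.e. sqrt(3), not the norm
```

The last two lines illustrate the ambiguity raised in the question: `S_other` squares to $A^TA$, yet its largest eigenvalue is $\sqrt{3}$ rather than $\sqrt{7}$. Only the positive semidefinite square root reproduces the operator norm.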
