[Math] Operator Norm of a Linear Transformation of a Matrix

linear-algebra, normed-spaces, ordinary-differential-equations

The book I am using for the ODE course is Differential Equations and Dynamical Systems by Lawrence Perko.

I am having a difficult time understanding what an operator norm of a linear transformation is.

The book defines the operator norm of the linear operator $T$ as
\begin{equation}
||T||=\text{max}_{|\vec{x}|\leq 1}|T(\vec{x})|
\end{equation}
where $|\vec{x}|$ denotes the Euclidean norm of $\vec{x}\in \mathbb{R}^n$.

Upon reading, it seems the operator norm has all of the usual properties of the Euclidean norm. I have also come to understand that the operator norm is a means to measure the "size" of a linear transformation. I assume this is in relation to boundedness?

However, I still don't quite understand what the operator norm of a linear transformation is or what its purpose is (other than being used to define the concept of convergence in a linear space). What stumps me even more is trying to compute the operator norm of any linear transformation, for example

\begin{equation}
\begin{bmatrix}
1 & 0 \\ 5 & 1
\end{bmatrix}
\end{equation}


I'm sorry for the long read, but thank you so much for your time. I appreciate any feedback you may give in helping further my understanding of mathematics.

Best Answer

To some extent, the operator norm is just a way to define a useful structure on the set of linear operators. And, as you've already mentioned, this structure resembles the usual Euclidean space: you can add and subtract two operators, multiply them by scalars, and measure "how big" an operator is. Such a structure is called a normed vector space. Why one might need this sort of structure will be revealed later, when the author speaks about the proofs of the existence and uniqueness theorems for solutions of ODEs.

But let's get back to your questions.

However, I still don't quite understand what the operator norm of a linear transformation is or what its purpose is (other than being used to define the concept of convergence in a linear space).

Hey, don't dismiss this purpose; it's a very useful and important one :)
The operator norm can also be used when studying numerical methods; e.g., the concept of the condition number can be described using operator norms.

As for what the operator norm actually is, there are (at least) 3 equivalent definitions:

  1. $\| T \| = \max\limits_{x \neq 0} \; \frac{\vert Tx \vert}{\vert x \vert}$
  2. $\| T \| = \max\limits_{\vert x \vert \leqslant 1} \; \vert Tx \vert$
  3. $\| T \| = \max\limits_{\vert x \vert = 1} \; \vert Tx \vert$

So the first definition compares the norm of the image $Tx$ with the norm of $x$ over all non-zero vectors of the vector space. Hence the operator norm measures the maximal relative stretch.

The last two definitions can be understood more geometrically. The set $\vert x \vert \leqslant 1$ is the closed unit ball in the vector space. Its image is not a ball: it might be stretched along some axes and contracted along others, or even transformed in a more complicated manner (eigenvalues to the rescue!). The operator norm tells you how big a closed ball must be to contain the whole image of the unit ball; its size is determined by the norm of the farthest point of the image of the unit ball. In the third definition you just look at the image of the unit sphere, and everything else is the same.
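A quick numerical sanity check of the equivalence, using the question's example matrix (the sampling scheme and variable names below are my own illustration, not from the book): definition 1 is approximated by random non-zero vectors, and definition 3 by a dense grid of points on the unit circle.

```python
import numpy as np

# Brute-force approximation of two of the equivalent definitions of the
# operator norm for the example matrix from the question.
A = np.array([[1.0, 0.0],
              [5.0, 1.0]])

rng = np.random.default_rng(0)

# Definition 1: max over non-zero x of |Ax| / |x| (random non-zero vectors).
xs = rng.standard_normal((100_000, 2))
def1 = np.max(np.linalg.norm(xs @ A.T, axis=1) / np.linalg.norm(xs, axis=1))

# Definition 3: max of |Ax| over the unit circle |x| = 1 (dense angular grid).
theta = np.linspace(0.0, 2.0 * np.pi, 100_000)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
def3 = np.max(np.linalg.norm(circle @ A.T, axis=1))

# Both approximations should agree with NumPy's spectral norm of A.
print(def1, def3, np.linalg.norm(A, 2))
```

Definition 2 gives the same value as definition 3 here, since for a linear map the maximum over the closed unit ball is attained on its boundary, the unit sphere.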

What stumps me even more is trying to compute the operator norm of any linear transformation, for example

\begin{equation} \begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix} \end{equation}

I can offer you a simple trick here. There are a few ways to compute the matrix norm directly in low-dimensional cases like 2 or 3 (using the third definition of the matrix norm, it becomes a not-too-hard optimization problem in dimension 1 or 2), but they are not of great utility.

So, we want to find the norm of some matrix $A$. Since we're using the standard Euclidean norm, $\vert Ax \vert = \sqrt{(Ax, Ax)}$, where $( \cdot\, , \cdot )$ is the standard dot product in $\mathbb{R}^n$. Then $(Ax, Ax) = (Ax)^{\rm T}(Ax) = x^{\rm T} (A^{\rm T}A) x$. The matrix $A^{\rm T}A$ is symmetric and positive semi-definite. We now have $\| A \| = \max\limits_{\vert x\vert =1} \; \sqrt{x^{\rm T} (A^{\rm T}A) x}$. The solution to the problem $\max\limits_{\vert x\vert =1} \; x^{\rm T} (A^{\rm T}A) x$ is well known: it is the largest eigenvalue $\lambda_{\max}$ of $A^{\rm T}A$ (just replace "Hermitian" with "symmetric" and everything carries over). So, finding the operator norm subordinate to the standard Euclidean norm of a matrix $A$ amounts to finding the largest eigenvalue of $A^{\rm T}A$ and taking its square root, $\| A \| = \sqrt{\lambda_{\max}}$; this eigenvalue is guaranteed to be non-negative since $A^{\rm T}A$ is symmetric and positive semi-definite.
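The trick above can be sketched in a few lines of NumPy for the question's example matrix (variable names are my own; `np.linalg.norm(A, 2)` computes the same spectral norm independently, via the SVD):

```python
import numpy as np

# Euclidean operator norm of A as the square root of the largest
# eigenvalue of A^T A, for the example matrix from the question.
A = np.array([[1.0, 0.0],
              [5.0, 1.0]])

gram = A.T @ A                      # A^T A: symmetric positive semi-definite
eigvals = np.linalg.eigvalsh(gram)  # real eigenvalues, in ascending order
op_norm = np.sqrt(eigvals[-1])      # ||A|| = sqrt(lambda_max(A^T A))

print(op_norm)  # should match np.linalg.norm(A, 2)
```

Here $A^{\rm T}A = \begin{bmatrix} 26 & 5 \\ 5 & 1 \end{bmatrix}$, whose largest eigenvalue is $\frac{27 + \sqrt{725}}{2}$, giving $\| A \| \approx 5.19$.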