Proving a norm limit based on Schur factorization

Tags: eigenvalues-eigenvectors, linear algebra, matrices, matrix-norms, spectral-radius

I am attempting the exercises from Trefethen's Numerical Linear Algebra, and I am stuck on this one.

Using Schur decomposition, we are asked to prove that for an arbitrary square matrix $A$ and any matrix norm $\lVert \bullet \rVert$ we have $\lim_{n \to \infty} \lVert A^n \rVert = 0 $ if and only if $\rho(A)<1$, where $\rho$ is the spectral radius.
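As a sanity check (certainly not a proof), the statement is easy to test numerically; the matrices below are just random examples rescaled to a prescribed spectral radius:

```python
import numpy as np

# Two test matrices built from the same random matrix: one rescaled to
# spectral radius 0.9, the other to 1.1.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
spectral_radius = lambda M: np.max(np.abs(np.linalg.eigvals(M)))

for scale in (0.9, 1.1):
    A = scale * B / spectral_radius(B)
    n = 100
    print(f"rho(A) = {spectral_radius(A):.2f}, "
          f"||A^{n}||_2 = {np.linalg.norm(np.linalg.matrix_power(A, n), 2):.3e}")
```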

Here is my progress so far. Since all norms on a finite-dimensional space are equivalent, it suffices to consider the 2-norm, i.e. to show
$\lim_{n \to \infty} \lVert A^n \rVert_2 = 0.$ If we decompose $A=QTQ^*$, where $Q$ is unitary and $T$ is upper triangular, then $A^n = QT^nQ^*$, and since multiplication by a unitary matrix does not change the 2-norm,
$\lVert A^n \rVert_2 = \lVert QT^nQ^* \rVert_2 = \lVert T^n \rVert_2.$ So, again by equivalence of norms, it is enough to prove $\lim_{n \to \infty} \lVert T^n \rVert = 0$ in any convenient norm. All I know is that the eigenvalues of $T^n$ appear on its diagonal and are the $n$-th powers of the eigenvalues of $T$ (which are those of $A$). I do not see how to show that if the spectral radius of $A$ is less than $1$, then the norm converges to zero as needed.
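For what it's worth, this reduction checks out numerically, e.g. with SciPy's `schur` (illustration only):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))   # force rho(A) = 0.9

T, Q = schur(A, output='complex')   # A = Q T Q*, Q unitary, T upper triangular

n = 30
An = np.linalg.matrix_power(A, n)
Tn = np.linalg.matrix_power(T, n)
print(np.isclose(np.linalg.norm(An, 2), np.linalg.norm(Tn, 2)))  # unitary invariance
print(np.allclose(np.diag(Tn), np.diag(T) ** n))                 # diag(T^n) = diag(T)^n
```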

If I could use Gelfand's formula, it would be immediate, but this formula is not covered in the book and the exercise only asks to use Schur decomposition. I am stuck on this and would appreciate any and all help.

Best Answer

One way to show $\rho(A)\lt 1$ implies $A^k\to \mathbf 0$ is to combine Schur triangularization with the triangle inequality, continuity of eigenvalues, and Perron theory. For convenience I set this up as an induction on the dimension, and note the $1\times 1$ base case is immediate: a scalar $\lambda$ with $\vert\lambda\vert = \rho(A) \lt 1$ satisfies $\vert\lambda^k\vert = \vert\lambda\vert^k \to 0$.

Inductive case
Assume the result holds for all $m\times m$ matrices; we show it holds for $(m+1) \times (m+1)$ matrices.

Let $T$ be the upper triangular Schur factor of $A$ (by the reduction above, it suffices to show $T^k \to \mathbf 0$), and let $\vert T \vert$ denote the matrix $T$ with absolute values applied component-wise. Since $T$ is triangular, this does not change the spectral radius: the diagonal entries merely lose their signs or phases. (That would not hold in general for $A$ itself.)
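A quick numerical illustration of that parenthetical, with a random triangular $T$ and an explicit $2\times 2$ counterexample for the general case (my own check, not part of the proof):

```python
import numpy as np

rho = lambda M: np.max(np.abs(np.linalg.eigvals(M)))

# Triangular case: |.| only strips signs off the diagonal (the eigenvalues),
# so the spectral radius is unchanged.
T = np.triu(np.random.default_rng(2).standard_normal((4, 4)))
print(np.isclose(rho(T), rho(np.abs(T))))    # True

# General case: rho(|A|) can strictly exceed rho(A).
A = np.array([[0.0, 1.0], [-1.0, -1.0]])     # rho(A) = 1
print(rho(A), rho(np.abs(A)))                # 1.0  vs  (1+sqrt(5))/2 ~ 1.618
```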

Using the triangle inequality on each entry (every entry of $T^2$ is a dot product of a row of $T$ with a column of $T$), show component-wise that $\vert T^2\vert \leq \vert T\vert^2$, and then by induction that in general
$ \vert T^k \vert \leq \vert T\vert^{k}. $
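A quick component-wise check of this inequality on a random triangular matrix (illustration, not proof):

```python
import numpy as np

# Check |T^k| <= |T|^k component-wise for a random upper triangular T.
T = np.triu(np.random.default_rng(3).standard_normal((5, 5)))
for k in range(1, 8):
    lhs = np.abs(np.linalg.matrix_power(T, k))        # |T^k|
    rhs = np.linalg.matrix_power(np.abs(T), k)        # |T|^k
    assert np.all(lhs <= rhs + 1e-9)
print("|T^k| <= |T|^k holds component-wise for k = 1..7")
```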

So it suffices to show that $\vert T\vert^k \to \mathbf 0$. Letting $J$ be the all-ones matrix, set $P:= \vert T\vert +\delta J$; for small enough $\delta \gt 0$ we have $\rho(P) \lt 1$ by continuity of eigenvalues, since $\rho(\vert T\vert) = \rho(T) \lt 1$. Now, component-wise,
$\vert T\vert^k \lt \vert T\vert^k + (\delta J)^k \leq P^k$ for all $k$, because expanding $(\vert T\vert + \delta J)^k$ produces only non-negative terms. So it suffices to show $P^k\to \mathbf 0$.
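Numerically, one can watch $\rho(P)$ approach $\rho(\vert T\vert)$ as $\delta$ shrinks, together with the component-wise domination (a sketch with a random $T$ rescaled so $\rho(T)=0.9$):

```python
import numpy as np

rho = lambda M: np.max(np.abs(np.linalg.eigvals(M)))

rng = np.random.default_rng(4)
T = np.triu(rng.standard_normal((5, 5)))
T *= 0.9 / rho(T)                 # arrange rho(|T|) = rho(T) = 0.9 < 1
J = np.ones_like(T)

for delta in (1.0, 0.1, 0.01):
    P = np.abs(T) + delta * J
    print(delta, rho(P))          # decreases toward rho(|T|) = 0.9 as delta -> 0

# and the component-wise domination |T|^k <= P^k (here with delta = 0.01):
k = 20
assert np.all(np.linalg.matrix_power(np.abs(T), k)
              <= np.linalg.matrix_power(P, k) + 1e-9)
```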

Now $P$ is a positive matrix, so it has a dominant positive eigenvalue $\lambda\lt 1$ and an associated positive eigenvector $\mathbf v$ (Perron vector). Using the two-step similarity transform at the end of
$\lambda_{\max}\geq n$ for a positive reciprocal matrix
(except of course step one is applied to $P$, not $A$),

we have $P=SCS^{-1}$, where $C$ has some positive vector $\mathbf z$ as both its left and right Perron vector. This implies $C$ has a Schur decomposition with triangular factor $ \begin{bmatrix}\lambda & \mathbf 0^T\\ \mathbf 0 &T_m \end{bmatrix}$ (see the note below). Here $\lambda^k \to 0$ by the base case, and $T_m^k \to \mathbf 0$ by the induction hypothesis, since the eigenvalues of $T_m$ are the remaining eigenvalues of $P$ and all have modulus less than $1$. Thus $C^k \to \mathbf 0$, and using the operator 2-norm,
$\big \Vert P^k\big \Vert_2 = \big \Vert SC^kS^{-1}\big \Vert_2\leq \big \Vert S\big \Vert_2 \cdot \big \Vert C^k\big \Vert_2 \cdot \big \Vert S^{-1}\big \Vert_2\to 0,$
which completes the proof.
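To see the Perron structure and the resulting decay numerically (a sketch with a random positive matrix rescaled so $\rho(P)=0.95$; it illustrates the conclusion, not the two-step similarity itself):

```python
import numpy as np

rng = np.random.default_rng(5)
P = rng.uniform(0.1, 1.0, (4, 4))
P *= 0.95 / np.max(np.abs(np.linalg.eigvals(P)))   # positive matrix, rho(P) = 0.95

# Perron pair: the dominant eigenvalue is real, positive, and simple,
# with an eigenvector that can be chosen strictly positive.
w, V = np.linalg.eig(P)
i = np.argmax(np.abs(w))
lam = w[i].real
v = V[:, i].real
v = v * np.sign(v[0])
print(lam, np.all(v > 0))          # 0.95  True

for k in (10, 50, 200):
    print(k, np.linalg.norm(np.linalg.matrix_power(P, k), 2))   # -> 0
```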

Note on the Schur decomposition of $C$
Let $\mathbf z$ be a positive unit vector that is both a left and a right eigenvector of $C$, and complete it to a unitary $U:=\bigg[\begin{array}{c|c|c|c}\mathbf z & \mathbf u_2 &\cdots & \mathbf u_{m+1} \end{array}\bigg]$. Then
$C = URU^{-1} = URU^{*} =U\begin{bmatrix} \lambda & \mathbf x_{m}^*\\ \mathbf 0 & T_m \end{bmatrix}U^* =U\begin{bmatrix} \lambda & \mathbf 0^T\\ \mathbf 0 & T_{m} \end{bmatrix}U^*,$ since
$\lambda\cdot\mathbf z^T = \lambda\cdot\mathbf z^* =\mathbf z^* C = \lambda\cdot \mathbf z^* + \sum_{j} x_j\cdot \mathbf u_j^*$ and the columns of $U$ are linearly independent, so every $x_j =0$.
Note: this actually implies that $\big\Vert C\big \Vert_2 = \lambda$, which allows for a proof that bypasses induction but takes more work to justify.
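Finally, a sketch verifying the block structure claimed in this note: the matrix $C$ below is constructed directly to have a common positive unit left/right eigenvector $\mathbf z$ (a stand-in for the $C$ obtained from the similarity above, which takes more code to produce), and QR is used to complete $\mathbf z$ to a unitary $U$:

```python
import numpy as np

rng = np.random.default_rng(6)
m = 4
z = rng.uniform(0.5, 1.0, m)
z /= np.linalg.norm(z)                       # positive unit vector

# Build a C that has z as BOTH left and right eigenvector for lam:
lam = 0.9
Pz = np.outer(z, z)                          # projector onto span(z)
B = rng.standard_normal((m, m))
C = lam * Pz + (np.eye(m) - Pz) @ B @ (np.eye(m) - Pz)
print(np.allclose(C @ z, lam * z), np.allclose(z @ C, lam * z))   # True True

# Complete z to a unitary U (QR puts +-z in the first column; the sign
# does not affect the block structure) and check U* C U is block diagonal:
U, _ = np.linalg.qr(np.column_stack([z, rng.standard_normal((m, m - 1))]))
R = U.T @ C @ U                              # everything here is real
print(np.allclose(R[0, 1:], 0), np.allclose(R[1:, 0], 0))         # True True
```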