I'm currently going through Sheldon Axler's Linear Algebra Done Right and am struggling to understand his proof of Theorem 8.23.
Suppose we are given a complex vector space $V$ and a linear transformation $T \in L(V)$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$, and let $U_1, \dots, U_m$ be the corresponding subspaces of the generalized eigenvectors. Theorem 8.23 claims it must be the case that $V = U_1 \oplus\dots\oplus U_m$.
The full proof can be found here, listed as Theorem 8.23 on page 174.
In his proof, Axler defines $U = U_1 + \dots + U_m$ and the operator $S = T|_U$. He then claims that $S$ has the same eigenvalues, with the same multiplicities, as $T$, because all the generalized eigenvectors of $T$ lie in $U$, the domain of $S$.
I don't understand why this is the case. Consider a generalized eigenvector $v$ of $T$, such that
$$
v \in \operatorname{null}(T - \lambda_i I)^{\dim V}.
$$
For $v$ to be a generalized eigenvector of $S$, we would need $v \in \operatorname{null}(T - \lambda_i I)^{\dim U}$. I don't understand how this follows from the previous statement.
Thanks in advance.
Best Answer
The identity $U_j = \operatorname{Null} (T - \lambda_j I)^{\dim V}$ is useful when you want to treat $U_j$ as a null space; for instance, you can then apply the proposition which says that $\operatorname{Null} p(T)$ is $T$-invariant.
But this isn't always how you want to think about generalized eigenvectors. Here are some other important facts:

- A generalized eigenvector of order $k$ for $\lambda_j$ is a vector $v$ with $(T - \lambda_j I)^k v = 0$ but $(T - \lambda_j I)^{k-1} v \neq 0$.
- If $v$ has order $k$, then $(T - \lambda_j I) v$ has order $k - 1$.
- Consequently the vectors $v, (T - \lambda_j I) v, \dots, (T - \lambda_j I)^{k-1} v$ are linearly independent generalized eigenvectors of orders $k, k-1, \dots, 1$.
This last fact implies that $\dim U_j \ge k$ for any order $k$ because a generalized eigenvector of order $k$ yields $k$ linearly independent generalized eigenvectors of orders $1$ through $k$. So actually, we have $U_j = \operatorname{Null} (T - \lambda_j I)^{\dim U_j}$ and this is independent of $V$.
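To see this fact numerically, here is a small check on a concrete matrix of my own choosing (a Jordan block of size $3$ at eigenvalue $2$, plus a $1 \times 1$ block at eigenvalue $5$; the example is illustrative, not from Axler): the vector $e_3$ is a generalized eigenvector of order $3$, and its chain under $T - 2I$ is linearly independent.

```python
import numpy as np

# Illustrative operator: Jordan block of size 3 at eigenvalue 2,
# plus a 1x1 block at eigenvalue 5 (my own example, not Axler's).
T = np.array([[2., 1., 0., 0.],
              [0., 2., 1., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 5.]])
N = T - 2.0 * np.eye(4)          # T - lambda*I for lambda = 2

v = np.array([0., 0., 1., 0.])   # generalized eigenvector of order 3
assert np.allclose(np.linalg.matrix_power(N, 3) @ v, 0)        # N^3 v = 0
assert not np.allclose(np.linalg.matrix_power(N, 2) @ v, 0)    # N^2 v != 0

# The chain v, Nv, N^2 v is linearly independent, so dim U_j >= 3.
chain = np.column_stack([v, N @ v, N @ N @ v])
print(np.linalg.matrix_rank(chain))  # 3
```

So a single order-$3$ generalized eigenvector forces $\dim U_j \ge 3$, exactly as the fact above predicts.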
I don't think Axler was thinking quite in these terms: just that if $(T - \lambda_j I)^k v = 0$ then $(S - \lambda_j I)^k v = 0$ because $S$ is just $T$ but with a restricted domain.
And if you take the view that being a generalized eigenvector means "there is some order $k$ with $(T - \lambda_j I)^k v = 0$", then the claim makes sense, because that condition doesn't depend on $V$ a priori. We can show that $k \le \dim U_j \le \dim U \le \dim V$, but the resulting identities are consequences, not definitions: $$ U_j = \operatorname{Null}(T - \lambda_j I)^{\dim U_j} = \operatorname{Null}(T - \lambda_j I)^{\dim U} = \operatorname{Null}(T - \lambda_j I)^{\dim V}. $$
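As a sanity check of that chain of identities on a concrete matrix (again my own illustrative example, not Axler's): with one Jordan block of size $3$ at $\lambda = 2$ and a $1 \times 1$ block at $5$, we have $\dim U_j = 3$ and $\dim V = 4$, and the null spaces of $(T - 2I)^k$ stop growing at $k = 3$.

```python
import numpy as np

# Illustrative operator: Jordan block of size 3 at eigenvalue 2,
# plus a 1x1 block at eigenvalue 5, so dim U_j = 3 while dim V = 4.
T = np.array([[2., 1., 0., 0.],
              [0., 2., 1., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 5.]])
N = T - 2.0 * np.eye(4)

# dim Null (T - 2I)^k for k = 1, 2, 3, 4 via rank-nullity
nullities = [4 - np.linalg.matrix_rank(np.linalg.matrix_power(N, k))
             for k in range(1, 5)]
print(nullities)  # [1, 2, 3, 3]
```

The nullity stabilizes at $\dim U_j = 3$, so $\operatorname{Null} N^{\dim U_j} = \operatorname{Null} N^{\dim U} = \operatorname{Null} N^{\dim V}$, matching the displayed identities.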