Linear Algebra Done Right – Understanding Sheldon Axler’s Proof Of Theorem 8.23

Tags: eigenvalues-eigenvectors, generalized-eigenvector, linear-algebra

I'm currently going through Sheldon Axler's Linear Algebra Done Right and am struggling to understand his proof of Theorem 8.23.

Suppose we are given a complex vector space $V$ and a linear transformation $T \in L(V)$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$, and let $U_1, \dots, U_m$ be the corresponding subspaces of generalized eigenvectors. Theorem 8.23 claims that $V = U_1 \oplus \dots \oplus U_m$.
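To see the statement in action, here is a minimal numerical sketch in Python using sympy. The $3 \times 3$ matrix is a made-up example (not one from the book): eigenvalue $2$ carries a $2 \times 2$ Jordan block and eigenvalue $3$ is simple, so the generalized eigenspaces should have dimensions $2$ and $1$ and sum directly to $V = \mathbb{C}^3$.

```python
import sympy as sp

# Made-up example: eigenvalue 2 with a 2x2 Jordan block, plus eigenvalue 3,
# so the distinct eigenvalues are lambda_1 = 2 and lambda_2 = 3.
T = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])
n = T.shape[0]          # dim V = 3
I = sp.eye(n)

# U_j = null((T - lambda_j I)^(dim V)), the generalized eigenspaces
U1 = ((T - 2 * I) ** n).nullspace()   # basis of U_1 (lambda = 2)
U2 = ((T - 3 * I) ** n).nullspace()   # basis of U_2 (lambda = 3)

# Direct sum claim V = U_1 (+) U_2: the combined bases form a basis of C^3
combined = sp.Matrix.hstack(*(U1 + U2))
print(len(U1), len(U2), combined.rank())   # 2 1 3
```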

The full proof can be found here, listed as Theorem 8.23 on page 174.

In his proof, Axler defines $U = U_1 + \dots + U_m$ and the operator $S = T|_U$. He then claims that $S$ has the same eigenvalues, with the same multiplicities, as $T$, because all the generalized eigenvectors of $T$ lie in $U$, the domain of $S$.

I don't understand why this is the case. Consider a generalized eigenvector $v$ of $T$, such that
$$
v \in \operatorname{null}(T - \lambda_i I)^{\dim V}.
$$

For $v$ to be a generalized eigenvector of $S$, we would need $v \in \operatorname{null}(S - \lambda_i I)^{\dim U}$. I don't understand how this follows from the previous statement.

Thanks in advance.

Best Answer

The identity $U_j = \operatorname{Null} (T - \lambda_j I)^{\dim V}$ is useful when you want to treat $U_j$ as a null space; for instance, it lets you apply the proposition that $\operatorname{Null} p(T)$ is invariant under $T$.

But this isn't always how you want to think about generalized eigenvectors. Here are some other important facts:

  1. generalized eigenvectors satisfy $(T - \lambda_j I)^k v = 0$ for some $k$
  2. each generalized eigenvector has an "order" (for lack of a better term), which is the minimum $k$ that works. Regular eigenvectors have order 1.
  3. the maximum order is at most $\dim V$, which is why we get the identity $U_j = \operatorname{Null} (T - \lambda_j I)^{\dim V}$
  4. if $v$ has order $k$, then $(T - \lambda_j I)v$ has order $k - 1$
  5. if $v$ has order $k$, then $v, (T - \lambda_j I)v, (T - \lambda_j I)^2 v, \dots, (T - \lambda_j I)^{k - 1} v$ are linearly independent. I believe this is shown in Lemma 8.40.

This last fact implies that $\dim U_j \ge k$ whenever some generalized eigenvector for $\lambda_j$ has order $k$, because a generalized eigenvector of order $k$ yields $k$ linearly independent generalized eigenvectors of orders $1$ through $k$. So actually, we have $U_j = \operatorname{Null} (T - \lambda_j I)^{\dim U_j}$, and this is independent of $V$.
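To make facts 4 and 5 and this dimension count concrete, here is a small sympy sketch. The matrix and the helper `order` are made-up illustrations, not Axler's notation: a single Jordan block, where the top basis vector has the maximum possible order.

```python
import sympy as sp

def order(T, lam, v):
    """Smallest k with (T - lam*I)^k v = 0 (fact 2); assumes v is a
    generalized eigenvector for lam, otherwise this loops forever."""
    A = T - lam * sp.eye(T.shape[0])
    k = 0
    while v != sp.zeros(*v.shape):
        v = A * v
        k += 1
    return k

# Made-up example: a single 3x3 Jordan block for lambda = 2, so here
# U_j is all of C^3 and the maximum order is 3.
T = sp.Matrix([[2, 1, 0],
               [0, 2, 1],
               [0, 0, 2]])
A = T - 2 * sp.eye(3)
v = sp.Matrix([0, 0, 1])                  # generalized eigenvector of order 3

print(order(T, 2, v), order(T, 2, A * v))  # 3 2  (fact 4: A drops the order)

# Fact 5: v, Av, A^2 v are linearly independent, forcing dim U_j >= 3
chain = sp.Matrix.hstack(v, A * v, A * A * v)
print(chain.rank())                        # 3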

I don't think Axler was thinking quite in these terms: just that if $(T - \lambda_j I)^k v = 0$, then $(S - \lambda_j I)^k v = 0$, because $S$ is just $T$ with a restricted domain.

And if you approach it this way, with a generalized eigenvector meaning "there is some order $k$", then it will make sense, because that condition doesn't depend on $V$ a priori. We can show that $k \le \dim U_j \le \dim U \le \dim V$, but the resulting identities are consequences, not definitions: $$ U_j = \operatorname{Null}(T - \lambda_j I)^{\dim U_j} = \operatorname{Null}(T - \lambda_j I)^{\dim U} = \operatorname{Null}(T - \lambda_j I)^{\dim V}. $$
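A concrete way to see why those exponents are interchangeable: the null spaces of $(T - \lambda_j I)^k$ grow with $k$ and stabilize once $k$ reaches $\dim U_j$, so any larger exponent gives the same space. A minimal sympy sketch, reusing the made-up matrix from the first example:

```python
import sympy as sp

# Same made-up matrix as in the first sketch: for lambda = 2 we have
# dim U_j = 2 while dim V = 3.
T = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])
A = T - 2 * sp.eye(3)

# dim null(A^k) grows with k and stabilizes once k reaches dim U_j = 2,
# so the exponents dim U_j, dim U, and dim V all give the same null space.
for k in range(1, 4):
    print(k, len((A ** k).nullspace()))
# prints: 1 1 / 2 2 / 3 2
```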