On the Cayley-Hamilton theorem

Tags: cayley-hamilton, eigenvalues-eigenvectors, linear-algebra, linear-transformations, matrices

One of the nicest theorems in linear algebra is the statement that every square matrix satisfies its own characteristic polynomial, the so-called Cayley-Hamilton theorem.

What is a good way to prove it? In particular, does the following elegant proof go through?

I am hopeful that it is quite trivial. Namely, since the characteristic polynomial is $\det(A-\lambda I)$, if we plug in $A$ for $\lambda$, we of course get $\det(A-AI)=\det 0=0$.

If so, this seems like one of the easiest times a couple of mathematicians got away with a major theorem.

To be precise, is there any problem with replacing $\lambda$, which usually denotes a scalar, with the matrix $A$ in question?

Best Answer

"The" proof of the Cayley-Hamilton Theorem involves invariant subspaces, that is, subspaces that a linear operator maps into themselves. If $T$ is a linear operator on a vector space $V$, then a subspace $W\subseteq V$ is called a $T$-invariant subspace of $V$ if $T(W)\subseteq W$, i.e. if $T(v)\in W$ for every $v\in W$. Some examples of $T$-invariant subspaces you might be familiar with are $\{0\}$, $N(T)$, $R(T)$, $V$, and $E_\lambda$ for any eigenvalue $\lambda$ of $T$.

For a linear operator $T$ and any nonzero $x\in V$, the subspace $$ W=\textrm{span}(\{x,T(x),T^2(x),\dots\})$$ is called the $T$-cyclic subspace of $V$ generated by $x$, and one can show that $W$ is the smallest $T$-invariant subspace containing $x$. Cyclic subspaces can be used to establish the Cayley-Hamilton Theorem.

The existence of a $T$-invariant subspace $W$ allows us to define a new linear operator whose domain is this subspace: the restriction $T_W$ of $T$ to $W$ is a linear operator from $W$ to $W$. When $V$ is finite-dimensional, the two operators are linked in the sense that the characteristic polynomial of $T_W$ divides the characteristic polynomial of $T$. To see this, choose an ordered basis for $W$ and extend it to an ordered basis for $V$. In the resulting matrix representation of $T$, the matrix of $T_W$ appears as the upper-left block and the block below it is zero, so the characteristic polynomial of $T_W$ appears as a factor when the characteristic polynomial of $T$ is computed.
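
To make the divisibility claim concrete, here is a sketch of the block computation. It assumes $\dim V=n$ and $\dim W=k$, and that $\beta$ is an ordered basis for $V$ whose first $k$ vectors form an ordered basis $\gamma$ for $W$; the names $\beta$, $\gamma$, $B_1$, $B_2$, $B_3$ are introduced here only for illustration. Since $T(W)\subseteq W$, the first $k$ columns of the matrix of $T$ have zeros below row $k$: $$[T]_\beta=\begin{pmatrix}B_1 & B_2\\ 0 & B_3\end{pmatrix},\qquad B_1=[T_W]_\gamma .$$ The determinant of a block upper triangular matrix is the product of the determinants of its diagonal blocks, so writing $f$ and $g$ for the characteristic polynomials of $T$ and $T_W$, $$f(t)=\det([T]_\beta-tI_n)=\det(B_1-tI_k)\,\det(B_3-tI_{n-k})=g(t)\,\det(B_3-tI_{n-k}).$$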

The last tool we will need is a way to compute the characteristic polynomial of $T_W$ explicitly, which in turn gives information about the characteristic polynomial of $T$. Cyclic subspaces are useful here because the characteristic polynomial of the restriction of $T$ to a cyclic subspace can be written down. In fact, if $T$ is a linear operator on a finite-dimensional vector space $V$, $W$ is the $T$-cyclic subspace of $V$ generated by a nonzero $v\in V$, and $k=\textrm{dim}(W)$, then:

  1. $\{v,T(v),T^2(v),\dots,T^{k-1}(v)\}$ is a basis for $W$
  2. If $a_0v+a_1T(v)+\cdots+a_{k-1}T^{k-1}(v)+T^k(v)=0$, then the characteristic polynomial of $T_W$ is $f(t)=(-1)^k(a_0+a_1t+\cdots+a_{k-1}t^{k-1}+t^k)$
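
As a small sanity check of the two statements above (an illustrative example chosen here, not part of the original argument), take $T$ to be multiplication by $A=\begin{pmatrix}1&2\\3&4\end{pmatrix}$ on $\mathbb{R}^2$ and $v=(1,0)^t$. Then $$T(v)=(1,3)^t,\qquad T^2(v)=(7,15)^t .$$ The vectors $v$ and $T(v)$ are linearly independent, so $k=2$ and $W=\mathbb{R}^2$. Solving $a_0v+a_1T(v)+T^2(v)=0$ gives $a_1=-5$ and $a_0=-2$, so the second statement predicts $$f(t)=(-1)^2(-2-5t+t^2)=t^2-5t-2,$$ which agrees with the direct computation $$\det(A-tI)=(1-t)(4-t)-6=t^2-5t-2 .$$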

I will omit the proof for the above theorem unless requested, since the main goal is the proof of the Cayley-Hamilton Theorem, which states that:

Let $T$ be a linear operator on a finite-dimensional vector space $V$, and let $f(t)$ be the characteristic polynomial of $T$. Then $f(T)=T_0$, the zero transformation. That is, $T$ "satisfies" its characteristic equation.

Proof: We must show that $f(T)(v)=0$ for all $v\in V$. If $v=0$, we are done since $f(T)$ is linear, so suppose $v\neq 0$, and let $W$ be the $T$-cyclic subspace generated by $v$, with dimension $k$. By the theorem above, there exist scalars $a_0,\dots,a_{k-1}$ such that $$a_0v+a_1T(v)+\cdots+a_{k-1}T^{k-1}(v)+T^k(v)=0 $$ and the characteristic polynomial of $T_W$ is: $$ g(t)=(-1)^k(a_0+a_1t+\cdots+a_{k-1}t^{k-1}+t^k)$$ Combining these two equations yields: $$g(T)(v)=(-1)^k(a_0I+a_1T+\cdots+a_{k-1}T^{k-1}+T^k)(v)=0 $$ We know that $g(t)$ divides the characteristic polynomial $f(t)$ of $T$, so there exists a polynomial $q(t)$ such that $f(t)=q(t)g(t)$, and therefore: $$ f(T)(v)=q(T)g(T)(v)=q(T)(g(T)(v))=q(T)(0)=0$$ The Cayley-Hamilton Theorem for Matrices is then a corollary of the theorem stated above, obtained by applying it to the operator $x\mapsto Ax$.
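
For readers who want to see the statement hold numerically, here is a minimal Python sketch that checks it on concrete matrices using exact rational arithmetic. It is not the proof above: the coefficients are computed with the Faddeev-LeVerrier recursion, a technique swapped in here for convenience, and it produces the monic polynomial $\det(tI-A)$, which differs from $\det(A-tI)$ only by the factor $(-1)^n$ and so vanishes at $A$ exactly when $f$ does. The helper names (`mat_mul`, `char_poly_coeffs`, `poly_at_matrix`, `is_zero`) and the test matrix are hypothetical choices made for this illustration.

```python
from fractions import Fraction


def identity(n):
    """Return the n x n identity matrix as a list of lists."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Multiply two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly_coeffs(a):
    """Coefficients [c_0, ..., c_n] of det(tI - A), via Faddeev-LeVerrier.

    Note: this is the monic convention, i.e. (-1)^n times det(A - tI).
    """
    n = len(a)
    coeffs = [Fraction(0)] * (n + 1)
    coeffs[n] = Fraction(1)
    m = [[0] * n for _ in range(n)]  # M_0 = 0
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        am = mat_mul(a, m)
        m = [[am[i][j] + (coeffs[n - k + 1] if i == j else 0)
              for j in range(n)] for i in range(n)]
        # c_{n-k} = -trace(A * M_k) / k
        trace = sum(mat_mul(a, m)[i][i] for i in range(n))
        coeffs[n - k] = -Fraction(trace) / k
    return coeffs


def poly_at_matrix(coeffs, a):
    """Evaluate c_0 I + c_1 A + ... + c_n A^n using Horner's scheme."""
    n = len(a)
    eye = identity(n)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        prod = mat_mul(result, a)
        result = [[prod[i][j] + c * eye[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def is_zero(m):
    """True if every entry of the matrix is zero."""
    return all(entry == 0 for row in m for entry in row)


A = [[1, 2], [3, 4]]  # illustrative test matrix
print(is_zero(poly_at_matrix(char_poly_coeffs(A), A)))  # prints True
```

Working with `Fraction` rather than floats keeps the check exact, so the result is a genuine zero matrix rather than one that is merely small. A check like this only confirms the theorem on the matrices tried; it is the cyclic-subspace argument that covers every operator.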