Is the proof of this lemma really necessary?

linear algebra

To prove the Cayley-Hamilton theorem in linear algebra, my professor said that the following lemma was necessary:

Lemma: Let $A \in M_n(\mathbb{K})$ be an $n\times n$ matrix over a field $\mathbb{K}$, let $b(t) \in M_n(\mathbb{K})[t]$, and set $P(t) = b(t)[A-tI]$. Then $P(A) = 0$.

The theorem (which says that if $f$ is an endomorphism of $V$, then $f$ is a root of its characteristic polynomial) was then proved as follows:

Let $B(t) = \text{adj}[A-tI]$ and $P(t) = B(t)[A-tI]$; then $P(A)=0$ by the lemma, but also $P(t) = \delta I$ (where $\delta = \det(A-tI)$). Since $\delta = \chi_f(t)$, it follows that $P(A) = 0 \Rightarrow \chi_f(A) = 0$.
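For concreteness, both identities in this argument are easy to verify computationally; here is a minimal sympy sketch, using an arbitrary $2\times 2$ matrix as an example for $A$:

```python
# A minimal sympy check of the two identities in the proof above,
# for one concrete (arbitrary) 2x2 matrix A.
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[1, 2], [3, 4]])
I = sp.eye(2)

B = (A - t * I).adjugate()        # B(t) = adj[A - tI]
P = (B * (A - t * I)).expand()    # P(t) = B(t)[A - tI]
delta = sp.det(A - t * I)         # delta = det(A - tI) = chi_f(t)

# P(t) = delta * I as a polynomial identity:
assert sp.simplify(P - delta * I) == sp.zeros(2, 2)

# Cayley-Hamilton: substituting A for t in delta gives the zero matrix.
chi = sp.Poly(delta, t)
chi_of_A = sum((c * A**k for (k,), c in chi.terms()), sp.zeros(2, 2))
assert sp.simplify(chi_of_A) == sp.zeros(2, 2)
```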

My question is: since we interpret the $P(t)$ of the theorem as a polynomial with matrix coefficients, isn't the whole thing kind of obvious from the properties of a polynomial ring? (Assuming we all know how to switch between matrices and endomorphisms.)

Best Answer

The reason we need the lemma is that from $P(t)=b(t)(A-tI)$ one cannot directly conclude that $P(A)=b(A)(A-AI)$.

If $R$ is a commutative ring, then there is a natural map $R[t]\to R^R$ which is a ring homomorphism (we endow $R^R$ with the pointwise ring structure: $(f+g)(r) = f(r)+g(r)$ and $(fg)(r) = f(r)g(r)$ for every $r\in R$). In particular, if $p(t)=q(t)s(t)$, then for every $r\in R$ you have $p(r)=q(r)s(r)$.
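As a quick sanity check of the commutative case, say with $R=\mathbb{Z}$ and an arbitrary example factorization:

```python
# The commutative case: in Z[t], if p = q s then p(r) = q(r) s(r) for
# every integer r, because evaluation is a ring homomorphism.
import sympy as sp

t = sp.symbols('t')
q = sp.Poly(t**2 + 3, t)   # arbitrary example factors
s = sp.Poly(2*t - 1, t)
p = q * s

assert all(p.eval(r) == q.eval(r) * s.eval(r) for r in range(-5, 6))
```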

But this doesn't work if $R$ is not commutative. For example, taking $p(t) = at$, $q(t) = t$, and $s(t)=a$, you have $p(t)=q(t)s(t)$ in $R[t]$ (since $t$ is central in $R[t]$ even when $R$ is not commutative), but $p(r) = ar$ while $q(r)s(r) = ra$. So you get $p(r)=q(r)s(r)$ if and only if $a$ and $r$ commute. Thus, while you can certainly define a map $\psi\colon R[t]\to R^R$ by $$\psi(a_0+a_1t+\cdots+a_nt^n)(r) = a_0 + a_1r + \cdots + a_nr^n,$$ this map is not a ring homomorphism when the ring is not commutative.

This is the situation we have here, where $R$ is the ring of $n\times n$ matrices over $\mathbb{K}$, which is not commutative when $n\gt 1$. In particular, from $P(t) = B(t)(A-tI)$ one cannot simply conclude that $P(A)=B(A)(A-AI)$: doing so implicitly assumes that the map $M_n(\mathbb{K})[t]\to M_n(\mathbb{K})^{M_n(\mathbb{K})}$ is multiplicative, which it is not.
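Here is a minimal sketch of this failure, using an arbitrary pair of non-commuting $2\times 2$ matrices for $a$ and $r$:

```python
# The failure of multiplicativity in R = M_2(K): with p(t) = a t,
# q(t) = t, s(t) = a we have p = q s in R[t], yet p(r) = a r differs
# from q(r) s(r) = r a whenever a and r do not commute.
import sympy as sp

a = sp.Matrix([[0, 1], [0, 0]])   # arbitrary non-commuting pair
r = sp.Matrix([[1, 0], [0, 2]])

p_at_r = a * r          # psi(p)(r) = a r
qs_at_r = r * a         # psi(q)(r) psi(s)(r) = r a

assert p_at_r != qs_at_r                  # psi is not multiplicative
assert a * r - r * a != sp.zeros(2, 2)    # precisely because [a, r] != 0
```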

If your $A$ happens to be central in $M_n(\mathbb{K})$, then it is true that the induced map $M_n(\mathbb{K})[t]\to M_n(\mathbb{K})$ is a homomorphism. But then you would be assuming that your $A$ is a scalar multiple of the identity. It would also be true if the coefficients of the polynomial $b(t)$ centralize $A$, but you are not assuming that. So you do need to prove that, in this case, you have $P(A)=b(A)(A-AI)=0$, since it does not follow from the general set-up (the way it would in a commutative setting).
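For illustration, here is a small sketch of that special case: the coefficients below are powers of $A$ (so they certainly centralize $A$), and coefficient-wise evaluation at $A$ is then multiplicative:

```python
# Polynomials with matrix coefficients are represented as lists
# [c0, c1, ...] meaning c0 + c1 t + c2 t^2 + ...  When the coefficients
# of the factors commute with A, psi(q s)(A) = psi(q)(A) psi(s)(A).
import sympy as sp

A = sp.Matrix([[1, 2], [3, 4]])   # arbitrary example
I = sp.eye(2)

def poly_mul(q, s):
    # Multiply coefficient lists, keeping the order of the factors.
    out = [sp.zeros(2, 2) for _ in range(len(q) + len(s) - 1)]
    for m, qm in enumerate(q):
        for n, sn in enumerate(s):
            out[m + n] += qm * sn
    return out

def ev(p, X):
    # Coefficient-wise evaluation p(X), with coefficients on the left.
    return sum((c * X**k for k, c in enumerate(p)), sp.zeros(2, 2))

q = [I, A]   # q(t) = I + A t  -- coefficients centralize A
s = [A, I]   # s(t) = A + I t  -- coefficients centralize A

assert ev(poly_mul(q, s), A) == ev(q, A) * ev(s, A)
```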

P.S. In fact, this is the subtle point where the proof that a polynomial of degree $n$ over a field has at most $n$ roots breaks down for skew fields/division rings. If $K$ is a division ring, then the division algorithm holds for polynomials with coefficients in $K$, so one can show that for every $p(t)\in K[t]$ and $a(t)\in K[t]$, $a(t)\neq 0$, there exist unique $q(t)$ and $r(t)$ such that $p(t)=q(t)a(t) + r(t)$ and $r(t)=0$ or $\deg(r)\lt \deg(a)$. From this, we can deduce that for every polynomial $p(t)$ and every $a\in K$, we can write $p(t) = q(t)(t-a) + r$, where $r\in K$.

But the proofs of the Remainder and Factor Theorems no longer go through, because we cannot pass from $p(t)=q(t)(t-a)+r$ to $p(a)=q(a)(a-a)+r$; and you cannot get the recursion argument to work, because from $p(t)=q(t)(t-a)$ and $p(b)=0$ with $b\neq a$, you cannot deduce that $q(b)=0$. For instance, over the real quaternions we have $p(t)=t^2+1=(t+i)(t-i)$, but $p(j)=j^2+1=0$, whereas $(j+i)(j-i)=ij-ji=2k\neq 0$.

I remember that when I first learned the corresponding theorems for polynomial rings, the professor challenged us to identify all the field axioms used in the proofs of the Remainder and Factor Theorems; none of us spotted the use of commutativity in the evaluation map.
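That quaternion example is also easy to check with sympy's built-in quaternions; a minimal sketch:

```python
# p(t) = t^2 + 1 factors as (t + i)(t - i), yet evaluating the factored
# form at j gives 2k, while p(j) = j^2 + 1 = 0.
from sympy.algebras.quaternion import Quaternion

one = Quaternion(1, 0, 0, 0)
i = Quaternion(0, 1, 0, 0)
j = Quaternion(0, 0, 1, 0)

p_at_j = j * j + one                 # p(j) = j^2 + 1
factored_at_j = (j + i) * (j - i)    # (j + i)(j - i)

assert p_at_j == Quaternion(0, 0, 0, 0)          # j is a root of p
assert factored_at_j == Quaternion(0, 0, 0, 2)   # but the factored form gives 2k
```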
