The reason we need the lemma is that from $P(t)=b(t)(A-tI)$ one cannot directly conclude that $P(A)=b(A)(A-AI)$.
If $R$ is a commutative ring, then there is a natural map $R[t]\to R^R$ which is a ring homomorphism (we endow $R^R$ with the pointwise ring structure: $(f+g)(r) = f(r)+g(r)$ and $(fg)(r) = f(r)g(r)$ for every $r\in R$). If $p(t)=q(t)s(t)$, then for every $r\in R$ you have $p(r)=q(r)s(r)$.
But this doesn't work if $R$ is not commutative. For example, taking $p(t) = at$, $q(t) = t$ and $s(t)=a$, you have $p(t)=q(t)s(t)$ in $R[t]$ (since $t$ is central in $R[t]$ even when $R$ is not commutative), but $p(r) = ar$ while $q(r)s(r) = ra$. So you get $p(r)=q(r)s(r)$ if and only if $a$ and $r$ commute. Thus, while you can certainly define a map $\psi\colon R[t]\to R^R$ by
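To see the failure concretely, here is a quick check (a minimal sketch using SymPy; the particular matrices $a$ and $r$ are arbitrary choices of mine) contrasting the commutative case with the matrix case:

```python
import sympy as sp

# Commutative case: evaluating p = q*s at a number respects the product.
x = sp.symbols('x')
q, s = x**2 + 1, 3*x - 2
assert (q * s).subs(x, 5) == q.subs(x, 5) * s.subs(x, 5)

# Noncommutative case: with p(t) = a*t = t*a in R[t], evaluation at r
# gives p(r) = a*r, while q(r)*s(r) = r*a, and the two can differ.
a = sp.Matrix([[0, 1], [0, 0]])
r = sp.Matrix([[1, 0], [0, 2]])
print(a * r)   # Matrix([[0, 2], [0, 0]])
print(r * a)   # Matrix([[0, 1], [0, 0]])
```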
$$\psi(a_0+a_1t+\cdots+a_nt^n)(r) = a_0 + a_1r + \cdots + a_nr^n,$$
this map is not a ring homomorphism when the ring is not commutative. This is the situation we have here, where the ring $R$ is the ring of $n\times n$ matrices over $\mathbb{K}$, which is not commutative when $n\gt 1$. In particular, from $P(t) = b(t)(A-tI)$ one cannot simply conclude that $P(A)=b(A)(A-AI)$. Doing so implicitly assumes that your map $M_n(\mathbb{K})[t]\to M_n(\mathbb{K})^{M_n(\mathbb{K})}$ is multiplicative, which it is not in this case.
If your $A$ happens to be central in $M_n(\mathbb{K})$, then it is true that the induced map $M_n(\mathbb{K})[t]\to M_n(\mathbb{K})$ is a homomorphism. But then you would be assuming that your $A$ is a scalar multiple of the identity. It would also be true if the coefficients of the polynomial $b(t)$ centralize $A$, but you are not assuming that. So you do need to prove that in this case you have $P(A)=b(A)(A-AI)$, since it does not follow from the general set-up (the way it would in a commutative setting).
P.S. In fact, this is the subtle point where the proof that a polynomial of degree $n$ over a field has at most $n$ roots breaks down for skew fields/division rings. If $K$ is a division ring, then the division algorithm holds for polynomials with coefficients in $K$, so one can show that for every $p(t)\in K[t]$ and $a(t)\in K[t]$, $a(t)\neq 0$, there exist unique $q(t)$ and $r(t)$ such that $p(t)=q(t)a(t) + r(t)$ and $r(t)=0$ or $\deg(r)\lt \deg(a)$. From this, we can deduce that for every polynomial $p(t)$ and every $a\in K$, we can write $p(t) = q(t)(t-a) + r$, where $r\in K$. But the proof of the Remainder and Factor Theorems no longer goes through, because we cannot go from $p(t)=q(t)(t-a)+r$ to $p(a)=q(a)(a-a)+r$; and you cannot get the recursion argument to work, because from $p(t)=q(t)(t-a)$ and $p(b)=0$ with $b\neq a$, you cannot deduce that $q(b)=0$. For instance, over the real quaternions we have $p(t)=t^2+1=(t+i)(t-i)$, and $p(j)=j^2+1=0$, yet $(j+i)(j-i) = ij-ji = 2k \neq 0$. I remember that when I first learned the corresponding theorems for polynomial rings, the professor challenged us to identify all the field axioms used in the proofs of the Remainder and Factor Theorems; none of us spotted the use of commutativity in the evaluation map.
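For what it's worth, the quaternion computation above can be checked mechanically (a small sketch using SymPy's Quaternion class):

```python
from sympy.algebras.quaternion import Quaternion

# Quaternion(a, b, c, d) represents a + b*i + c*j + d*k.
i = Quaternion(0, 1, 0, 0)
j = Quaternion(0, 0, 1, 0)
one = Quaternion(1, 0, 0, 0)

# p(t) = t^2 + 1 evaluated at j is zero...
print(j * j + one)        # 0 + 0*i + 0*j + 0*k

# ...but multiplying the evaluated factors gives (j+i)(j-i) = 2k != 0.
print((j + i) * (j - i))  # 0 + 0*i + 0*j + 2*k
```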
"The" proof of the Cayley-Hamilton Theorem involves invariant subspaces, or subspaces that are mapped onto themselves by a linear operator. If $T$ is a linear operator on a vector space $V$, then a subspace $W\subseteq V$ is called a $T$-invariant subspace of $V$ if $T(W)\subseteq W$, i.e. if $T(v)\in W$ for every $v\in W$. Some examples of $T$-invariant subspaces you might be familiar with are $\{0\}, N(T), R(T), V$, and $E_\lambda$ for any eigenvalue $\lambda$ of $T$. For a linear operator $T$ and any nonzero $x\in V$, then the subspace
$$ W=\textrm{span}(\{x,T(x),T^2(x),\dots\})$$
is called the $T$-cyclic subspace of $V$ generated by $x$, and one can show that $W$ is the smallest $T$-invariant subspace containing $x$. Cyclic subspaces can be used to establish the Cayley-Hamilton Theorem. Indeed, a $T$-invariant subspace allows us to define a new linear operator on that subspace: the restriction $T_W$ of $T$ to $W$ is a linear operator from $W$ to $W$. These two operators are linked in the sense that the characteristic polynomial of $T_W$ divides the characteristic polynomial of $T$. You can show this by choosing an ordered basis for $W$, extending it to an ordered basis for $V$, and taking the matrix representations of $T$ and $T_W$; computing the characteristic polynomial of $T$ then exhibits the characteristic polynomial of $T_W$ as a factor, as the display below shows.
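In outline: if $\gamma$ is an ordered basis for $W$, extended to an ordered basis $\beta$ for $V$, then because $W$ is $T$-invariant the matrix of $T$ is block upper triangular,
$$[T]_\beta = \begin{pmatrix} [T_W]_\gamma & B_2 \\ O & B_3 \end{pmatrix},$$
so that
$$\det([T]_\beta - tI) = \det([T_W]_\gamma - tI)\,\det(B_3 - tI),$$
and the characteristic polynomial of $T_W$ appears as a factor of that of $T$.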
The last tool we will need is a way to compute the characteristic polynomial of the restriction $T_W$ itself. Cyclic subspaces are useful here precisely because the characteristic polynomial of the restriction of a linear operator $T$ to a cyclic subspace can be computed explicitly. In fact, if $T$ is a linear operator on a finite-dimensional vector space $V$ and $W$ is the $T$-cyclic subspace of $V$ generated by a nonzero $v\in V$, with $k=\textrm{dim}(W)$, then (a computational check follows the list):
- $\{v,T(v),T^2(v),\dots,T^{k-1}(v)\}$ is a basis for $W$
- If $a_0v+a_1T(v)+\cdots+a_{k-1}T^{k-1}(v)+T^k(v)=0$, then the characteristic polynomial of $T_W$ is $f(t)=(-1)^k(a_0+a_1t+\cdots+a_{k-1}t^{k-1}+t^k)$
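Here is the promised computational check of this theorem (a sketch using SymPy; the matrix $A$, chosen as a companion matrix, and the generator $v=e_1$ are arbitrary choices of mine):

```python
import sympy as sp

t = sp.symbols('t')

# T(x) = A x on Q^3, with generating vector v = e1.
A = sp.Matrix([[0, 0, -6],
               [1, 0, -11],
               [0, 1, -6]])
v = sp.Matrix([1, 0, 0])

# The Krylov vectors v, T(v), T^2(v) are independent, so W = V and k = 3.
K = sp.Matrix.hstack(v, A * v, A**2 * v)
assert K.rank() == 3

# Solve a0*v + a1*T(v) + a2*T^2(v) + T^3(v) = 0 for the scalars a_i.
a0, a1, a2 = K.solve(-(A**3) * v)

# The theorem predicts the characteristic polynomial of T_W (= T here):
f = (-1)**3 * (a0 + a1 * t + a2 * t**2 + t**3)
print(sp.expand(f))             # -t**3 - 6*t**2 - 11*t - 6
print(A.charpoly(t).as_expr())  # t**3 + 6*t**2 + 11*t + 6
```

The two printed polynomials differ only by the factor $(-1)^k$, since SymPy's `charpoly` uses the convention $\det(tI-A)$ rather than $\det(A-tI)$.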
I will omit the proof of the above theorem unless requested, since the main goal is the proof of the Cayley-Hamilton Theorem, which states:
Let $T$ be a linear operator on a finite-dimensional vector space $V$,
and let $f(t)$ be the characteristic polynomial of $T$. Then
$f(T)=T_0$, the zero transformation. That is, $T$ "satisfies" its
characteristic equation.
Proof: We want to show that $f(T)(v)=0$ for all $v\in V$. If $v=0$, we are done since $f(T)$ is linear, so suppose $v\neq 0$, and let $W$ be the $T$-cyclic subspace generated by $v$, with dimension $k$. By the theorem above, there exist scalars $a_0,\dots,a_{k-1}$ such that
$$a_0v+a_1T(v)+\cdots+a_{k-1}T^{k-1}(v)+T^k(v)=0 $$
and the characteristic polynomial for $T_W$ is:
$$ g(t)=(-1)^k(a_0+a_1t+\cdots+a_{k-1}t^{k-1}+t^k)$$
Combining these two equations yields:
$$g(T)(v)=(-1)^k(a_0I+a_1T+\cdots+a_{k-1}T^{k-1}+T^k)(v)=0 $$
We know that $g(t)$ divides the characteristic polynomial $f(t)$ of $T$, so there exists a polynomial $q(t)$ such that $f(t)=q(t)g(t)$. Note that $f$, $g$, and $q$ have scalar coefficients, and scalars commute with $T$, so evaluation at $T$ is multiplicative here (in contrast with the matrix-coefficient situation discussed above). Thus:
$$ f(T)(v)=q(T)g(T)(v)=q(T)(g(T)(v))=q(T)(0)=0$$
The Cayley-Hamilton Theorem for Matrices is then a corollary to the Cayley-Hamilton Theorem stated above.
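As a sanity check of the matrix form, one can evaluate the characteristic polynomial at a sample matrix (a sketch using SymPy; the matrix is an arbitrary choice of mine):

```python
import sympy as sp

t = sp.symbols('t')

A = sp.Matrix([[2, 1, 0],
               [0, 3, 1],
               [1, 0, 1]])

# Evaluate f at A by Horner's method; by Cayley-Hamilton the result
# should be the zero matrix (the overall sign convention for f is
# immaterial, since -0 = 0).
f = A.charpoly(t)
F = sp.zeros(3, 3)
for c in f.all_coeffs():   # leading coefficient first
    F = F * A + c * sp.eye(3)
print(F)                   # Matrix([[0, 0, 0], [0, 0, 0], [0, 0, 0]])
```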
Here is, I think, a possible answer.
From Jacobi's formula, it follows that
$$ - \frac{d}{d \lambda} C_A(\lambda) = \frac{d}{d \lambda} \det ( \lambda I - A) = \text{tr} \left( \text{adj} ( \lambda I - A ) \frac{d}{d \lambda} ( \lambda I - A) \right) = \text{tr} ( \text{adj} ( \lambda I - A ) ) $$
Therefore,
$$ \frac{d}{d \lambda} C_A(\lambda) \Bigg|_{\lambda=0} = \text{tr} ( \text{adj} ( A ) )$$
Observe that
$$ \frac{d}{d \lambda} C_A(\lambda) \Bigg|_{\lambda=0} = (-1)^{n+1} \lim_{\lambda \rightarrow 0} \frac{C_A(\lambda) - C_A(0)}{\lambda} = (-1)^{n+1} \lim_{\lambda \rightarrow 0} \Gamma_A(\lambda) = (-1)^{n+1} \Gamma_A(0)$$
Therefore,
$$ (-1)^{n+1} \Gamma_A(0) = \text{tr} ( \text{adj} ( A ) ) $$
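Since the signs attached to $C_A$ and $\Gamma_A$ depend on conventions fixed in the original question and not restated here, the step that can be checked independently is the Jacobi-formula step itself (a sketch using SymPy; the matrix is an arbitrary choice of mine):

```python
import sympy as sp

lam = sp.symbols('lambda')

A = sp.Matrix([[1, 2, 0],
               [0, 1, 3],
               [4, 0, 1]])
M = lam * sp.eye(3) - A

# Jacobi's formula: d/d(lambda) det(lambda*I - A) = tr(adj(lambda*I - A)),
# using the fact that d/d(lambda) (lambda*I - A) = I.
lhs = sp.diff(M.det(), lam)
rhs = M.adjugate().trace()
print(sp.simplify(lhs - rhs))  # 0
```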