[Math] Showing companion matrix is similar to Jordan block using Jordan-Chevalley decomposition

companion-matrices, jordan-normal-form, linear algebra, matrices

The Jordan-Chevalley decomposition says that given a linear operator $L$, you can decompose it as $L = S + N$, where $S$ is diagonalizable and $N$ is nilpotent.

My textbook (Linear Algebra by Petersen) has a corollary of the Jordan-Chevalley decomposition that says that given $p(t) = (t-\lambda)^n$, the associated companion matrix $C_p$ is similar to a Jordan block (the matrix with $\lambda$ on the diagonal and $1$'s on the superdiagonal).

So if I apply the JC decomposition to $C_p$, I get $C_p = \lambda I_n + (C_p - \lambda I_n)$. So I need to show that $C_p - \lambda I_n$ is similar to the matrix with $1$'s on the superdiagonal and zeros elsewhere. I don't see how to do this without going into the Frobenius canonical form and showing that the characteristic polynomial and minimal polynomial of $C_p$ are the same (i.e., using the theorem that says that two $n \times n$ operators with the same minimal polynomial of degree $n$ are similar).

Is there an easy way to do this?

Best Answer

I cannot see why Petersen would want to invoke such a difficult result for such a simple conclusion. And you don't need the Frobenius canonical form for showing that the characteristic polynomial and minimal polynomial of $C_p$ are the same (see this question and this one), nor indeed do you need the characteristic polynomial at all.

The companion matrix $C_p$ is defined so that $p(C_p)$, but no nonzero polynomial in $C_p$ of lower degree, annihilates the first basis vector $e_1$ (and as a consequence every vector). So the minimal polynomial of $C_p$ is $p=(X-\lambda)^n$, the Jordan normal form $J$ of $C_p$ has only entries $\lambda$ on the diagonal, and $(J-\lambda I)^{n-1}\neq0$, and the latter can only happen if $J$ consists of a single Jordan block (of size $n$).
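As a quick sanity check of this argument, here is the case $n = 2$, written under the assumption that the companion matrix uses the column convention $C_p e_1 = e_2$ (other texts use the transpose). With $p(t) = (t-\lambda)^2 = t^2 - 2\lambda t + \lambda^2$ one gets $$ C_p = \begin{pmatrix} 0 & -\lambda^2 \\ 1 & 2\lambda \end{pmatrix}, \qquad C_p - \lambda I_2 = \begin{pmatrix} -\lambda & -\lambda^2 \\ 1 & \lambda \end{pmatrix} \neq 0, \qquad (C_p - \lambda I_2)^2 = \begin{pmatrix} \lambda^2 - \lambda^2 & \lambda^3 - \lambda^3 \\ -\lambda + \lambda & -\lambda^2 + \lambda^2 \end{pmatrix} = 0. $$ So the minimal polynomial is $(X-\lambda)^2$ and not $X - \lambda$, which rules out the Jordan form $\lambda I_2$ and leaves only the single block of size $2$.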

In fact it is not hard to conjugate $C_p$ to its Jordan normal form explicitly. Define vectors $b_1,\ldots,b_n$ by $b_{n-i}=(C_p-\lambda I_n)^i\cdot e_1$ for $i=0,\ldots,n-1$. Then $C_p\cdot b_1=\lambda b_1$ (because $(C_p-\lambda I_n)^n=p(C_p)$ annihilates $e_1$) and $C_p\cdot b_i=b_{i-1}+\lambda b_i$ for $i\geq2$. In other words, change of basis to the basis $(b_1,\ldots,b_n)$ transforms $C_p$ into a Jordan block of size $n$ for $\lambda$.
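The explicit change of basis can be checked numerically. The following is a minimal sketch, not part of the original answer: the helper names (`companion`, `jordan_block`, `jordan_basis`) are made up for illustration, and it assumes the column convention $C_p e_i = e_{i+1}$ for the companion matrix. It builds $B = (b_1 \mid \dotsb \mid b_n)$ and tests $C_p B = B J$ in exact integer arithmetic.

```python
from math import comb


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def companion(lam, n):
    """Companion matrix of (t - lam)^n, assuming C e_i = e_{i+1} for i < n."""
    # a[k] is the coefficient of t^k in (t - lam)^n, for k = 0..n-1
    a = [comb(n, k) * (-lam) ** (n - k) for k in range(n)]
    C = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        C[i + 1][i] = 1          # ones on the subdiagonal
    for k in range(n):
        C[k][n - 1] = -a[k]      # last column holds the negated coefficients
    return C


def jordan_block(lam, n):
    """lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]


def jordan_basis(lam, n):
    """Matrix whose columns are b_1..b_n with b_{n-i} = (C - lam I)^i e_1."""
    C = companion(lam, n)
    shifted = [[C[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    v = [[1 if i == 0 else 0] for i in range(n)]   # e_1 as a column
    powers = [v]
    for _ in range(n - 1):
        v = matmul(shifted, v)
        powers.append(v)                           # powers[i] = (C - lam I)^i e_1
    # column c (0-based) is b_{c+1} = powers[n - 1 - c]
    return [[powers[n - 1 - c][r][0] for c in range(n)] for r in range(n)]


lam, n = 3, 4
C, B, J = companion(lam, n), jordan_basis(lam, n), jordan_block(lam, n)
print(matmul(C, B) == matmul(B, J))
```

The matrix $B$ is invertible because $b_{n-i}$ has leading term $e_{i+1}$ with coefficient $1$, so $B$ has ones on the anti-diagonal and zeros below it.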

The connection between the Jordan-Chevalley decomposition and the Jordan normal form:

As has already been explained in the comments, the Jordan-Chevalley decomposition of $T$ can be derived from its Jordan canonical form:

Suppose that $\mathcal{B}$ is a basis of $V$ with respect to which the operator $T$ is given by a matrix $[T] \in \operatorname{M}_n(\mathbb{C})$ which is in Jordan normal form, say $$ [T] = \begin{pmatrix} J_{n_1}(\lambda_1) & & \\ & \ddots & \\ & & J_{n_t}(\lambda_t) \end{pmatrix}. $$ (Here the $\lambda_i$ are not necessarily pairwise distinct.) Then with respect to $\mathcal{B}$ the operators $S$ and $N$ are given by the matrices $$ \begin{pmatrix} \lambda_1 I_{n_1} & & \\ & \ddots & \\ & & \lambda_t I_{n_t} \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} J_{n_1}(0) & & \\ & \ddots & \\ & & J_{n_t}(0) \end{pmatrix}. $$

One could also go the other way around, and derive the Jordan normal form of $T$ from its Jordan-Chevalley decomposition:

Every eigenspace $V_\lambda(S)$ is $N$-invariant, since $S$ and $N$ commute. Since $N$ is nilpotent, the same goes for the restrictions $N|_{V_\lambda(S)}$. Thus we can find for every $\lambda \in \mathbb{C}$ a basis $\mathcal{B}_\lambda$ for $V_\lambda(S)$ with respect to which the operator $N|_{V_\lambda(S)}$ is given by a matrix which is in Jordan normal form, say $[N|_{V_\lambda(S)}] = \bigoplus_{j=1}^{n(\lambda)} J_{n(\lambda,j)}(0)$ (here we use that finite-dimensional nilpotent operators always have a Jordan normal form, and that $0$ is the only eigenvalue of a nilpotent operator).

Since $S$ is diagonalizable we have that $V = V_{\lambda_1}(S) \oplus \dotsb \oplus V_{\lambda_r}(S)$ (with the $\lambda_i$ being pairwise distinct), so it follows that the union $\mathcal{B} := \bigcup_{i=1}^r \mathcal{B}_{\lambda_i}$ is a basis of $V$. With respect to $\mathcal{B}$ the operator $N$ is given by the block diagonal matrix $[N] = \bigoplus_{i=1}^r \bigoplus_{j=1}^{n(\lambda_i)} J_{n(\lambda_i, j)}(0)$, which is again in Jordan normal form, and the operator $S$ is given by the diagonal matrix $[S] = \bigoplus_{i=1}^r \lambda_i I_{\dim V_{\lambda_i}(S)}$.

So with respect to $\mathcal{B}$ the operator $T = S + N$ is given by the matrix \begin{align*} [T] = [S] + [N] &= \left( \bigoplus_{i=1}^r \lambda_i I_{\dim V_{\lambda_i}(S)} \right) + \left( \bigoplus_{i=1}^r \bigoplus_{j=1}^{n(\lambda_i)} J_{n(\lambda_i, j)}(0) \right) \\ &= \bigoplus_{i=1}^r \bigoplus_{j=1}^{n(\lambda_i)} J_{n(\lambda_i, j)}(\lambda_i), \end{align*} which is in Jordan normal form.

Altogether this shows that the Jordan-Chevalley decomposition and the Jordan normal form are equivalent, and how one can be derived from the other. This observation actually holds for arbitrary fields:

An operator $T \colon V \to V$ on a finite-dimensional $k$-vector space $V$ has a Jordan-Chevalley decomposition (into commuting diagonalizable and nilpotent parts) if and only if it has a Jordan normal form.

Also note that the decomposition $$ V = \ker(T - \lambda_1 I)^{m_1} \oplus \dotsb \oplus \ker(T - \lambda_k I)^{m_k}, $$ which is used to construct the Jordan-Chevalley decomposition, is precisely the generalized eigenspace decomposition that is used to show the existence of the Jordan normal form.

Regarding the number of Jordan blocks:

I am not very familiar with the Frobenius canonical form which Petersen uses here, but I think I (kind of) understand where the problem comes from, and how to solve it.

You are right that we may need more than one Jordan block if we look at the restriction of $T$ to $\ker (T - \lambda_i)^{m_i}$; the matrix $[T|_{\ker (T - \lambda_i)^{m_i}}]$ consists of all Jordan blocks for the eigenvalue $\lambda_i$. This is why Petersen further decomposes $\ker (T - \lambda_i)^{m_i}$ into cyclic subspaces:

This means that we have reduced the problem to a situation where $T$ has only one eigenvalue. Given the Frobenius canonical form the problem is then further reduced to [proving] the statement for companion matrices, where the minimal polynomial has only one root. Let $C_p$ be a companion matrix with $p(t) = (t - \lambda)^n$.

(From Linear Algebra by Peter Petersen, page 150, proof of Theorem 25.)

So we further decompose $$ \ker (T - \lambda_i)^{m_i} = C_1 \oplus \dotsb \oplus C_{k(i)} $$ where the $C_j$ are cyclic subspaces. We fix some $j$ and set $C := C_j$ and $n := \dim C$. Since $C$ is cyclic, we find that the characteristic polynomial and minimal polynomial of $T|_C$ coincide (I assume that this has already been shown before); we will refer to this polynomial as $p$. We know that this minimal polynomial $p(t)$ of $T|_C$ divides the minimal polynomial of $T|_{\ker (T - \lambda_i)^{m_i}}$, which is given by $(t - \lambda_i)^{m_i}$. So $p(t)$ is of the form $p(t) = (t - \lambda_i)^{m'_i}$ with $m'_i \leq m_i$. Together with $\deg p = \dim C = n$ we find that $p(t) = (t - \lambda_i)^n$.

We now consider the matrix $$ J := \begin{pmatrix} \lambda & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix} \in \operatorname{M}_n(\mathbb{C}). $$ From an earlier part of the chapter (namely part 4, The Minimal Polynomial, page 120, Proposition 17) we know that the minimal polynomial of $J$ is given by $(t - \lambda)^n = p(t)$.

Since the minimal polynomial of $J$ is of maximal degree, it equals its characteristic polynomial; from this it follows that $J$ is similar to the companion matrix of its characteristic polynomial $p(t)$, which we will refer to as $C_p$. (Petersen seems to have shown this before, but gives no explicit reference in the proof.) Since the minimal and characteristic polynomials of $T|_C$ coincide, we find that when we represent $T|_C$ with respect to some basis of $C$ by a matrix $A \in \operatorname{M}_n(\mathbb{C})$, then $A$ is also similar to the companion matrix $C_p$. Hence $A$ and $J$ are similar, so there exists a basis of $C$ with respect to which $T|_C$ is represented by $J$.

(There might be some redundancy in the above argumentation.)

Note that we have shown that the decomposition $\ker (T - \lambda_i)^{m_i} = C_1 \oplus \dotsb \oplus C_{k(i)}$ into cyclic subspaces corresponds precisely to the decomposition of $[T|_{\ker (T - \lambda_i)^{m_i}}]$ into Jordan blocks. Since we restrict our attention to a single cyclic subspace, we also get only one Jordan block.

I have to admit that I find Petersen’s proof somewhat strange:

What he actually does is construct the Jordan normal form by constructing a decomposition $V = \bigoplus_{i=1}^k \bigoplus_{j=1}^{k'(i)} C_{\lambda_i, j}$ into cyclic subspaces $C_{\lambda_i, j}$, and then showing that for each $C_{\lambda_i, j}$ there exists a basis with respect to which $T|_{C_{\lambda_i, j}}$ is given by a matrix $[T|_{C_{\lambda_i, j}}]$ which is a Jordan block. Then he constructs the Jordan-Chevalley decomposition from the Jordan normal form, without ever mentioning the Jordan normal form.

I suppose that this doesn’t help understanding the difference between the two constructions.

Advantages of the Jordan-Chevalley decomposition:

One way to think about the Jordan-Chevalley decomposition is to regard it as a coordinate-free version of the Jordan normal form: To talk about the Jordan normal form of $T$ we need to associate to $T$ a matrix $[T]$, which requires the use of a basis. The Jordan-Chevalley decomposition on the other hand has no such requirements.

What has not been mentioned so far, but is very useful, is that $S$ and $N$ can be expressed as polynomials in $T$, i.e. there exist polynomials $p(t), q(t) \in \mathbb{C}[t]$ with $S = p(T)$ and $N = q(T)$. As far as I know, this has no analogue in terms of the Jordan normal form.
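To make this concrete, here is a small sketch for one hand-picked matrix. The matrix $T = J_2(1) \oplus J_1(2)$ and the polynomial $p(t) = t^2 - 2t + 2 = 1 + (t-1)^2$ are choices made for this illustration, not taken from the text; $p$ was picked so that $p \equiv 1 \pmod{(t-1)^2}$ and $p(2) = 2$, which is what forces $p(T)$ to act as the scalar $\lambda$ on each generalized eigenspace. The corresponding $q$ is $q(t) = t - p(t)$.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# T = J_2(1) direct sum J_1(2), already in Jordan normal form
T = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]
n = len(T)
I = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# S = p(T) with p(t) = t^2 - 2t + 2 (illustrative choice, see lead-in)
T2 = matmul(T, T)
S = [[T2[i][j] - 2 * T[i][j] + 2 * I[i][j] for j in range(n)] for i in range(n)]

# N = q(T) with q(t) = t - p(t)
N = [[T[i][j] - S[i][j] for j in range(n)] for i in range(n)]

zero = [[0] * n for _ in range(n)]
print(S)                                # diagonal part
print(N)                                # nilpotent part
print(matmul(S, N) == matmul(N, S))     # the two parts commute
print(matmul(N, N) == zero)             # N squares to zero here
```

For an operator with a single eigenvalue $\lambda$, such as the companion matrix of $(t-\lambda)^n$, the polynomials are simply $p(t) = \lambda$ and $q(t) = t - \lambda$.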

The Jordan-Chevalley decomposition also has the advantage that it generalizes more easily to other settings:

One can generalize the notion of a diagonalizable operator to that of a semisimple operator (if we work over an algebraically closed field then both notions coincide). Then one can also generalize the Jordan-Chevalley decomposition accordingly.
One can generalize the Jordan-Chevalley decomposition to finite-dimensional, semisimple complex Lie algebras: If $\mathfrak{g}$ is such a Lie algebra, then every element $x \in \mathfrak{g}$ can be uniquely written as $x = s + n$, where $s, n \in \mathfrak{g}$ are a semisimple and a nilpotent element, respectively, which commute.
One can generalize the additive Jordan-Chevalley decomposition, which we have encountered so far, to the multiplicative Jordan-Chevalley decomposition: Every $T \in \operatorname{GL}_n(\mathbb{C})$ can be uniquely decomposed as $T = S'U'$ with $S' \in \operatorname{GL}_n(\mathbb{C})$ being diagonalizable, $U' \in \operatorname{GL}_n(\mathbb{C})$ being unipotent, and $S'$ and $U'$ commuting. (The additive and multiplicative Jordan-Chevalley decompositions $T = S + N$ and $T = S' U'$ are related by $S = S'$ and $U' = 1 + S^{-1} N$.)
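The relation $U' = 1 + S^{-1}N$ between the additive and multiplicative decompositions can be checked on a small example. This is only a sketch for one invertible matrix chosen for illustration, $T = J_2(2) \oplus J_1(3)$; because this $T$ is already in Jordan normal form, its diagonal part can be read off directly as $S$, which would not work for a general matrix. Exact rational arithmetic via `fractions.Fraction` avoids rounding issues.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# T = J_2(2) direct sum J_1(3): invertible and already in Jordan normal form
T = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]
n = len(T)

# Additive parts: S is the diagonal of T (valid only because T is in Jordan form)
S = [[T[i][j] if i == j else 0 for j in range(n)] for i in range(n)]
N = [[T[i][j] - S[i][j] for j in range(n)] for i in range(n)]

# Inverse of the diagonal matrix S, computed entrywise
S_inv = [[Fraction(1, S[i][i]) if i == j else Fraction(0) for j in range(n)]
         for i in range(n)]

# Multiplicative parts: S' = S and U' = I + S^{-1} N
S_inv_N = matmul(S_inv, N)
U = [[(1 if i == j else 0) + S_inv_N[i][j] for j in range(n)] for i in range(n)]

zero = [[0] * n for _ in range(n)]
print(matmul(S, U) == T)                   # T = S' U'
print(matmul(U, S) == T)                   # S' and U' commute
print(matmul(S_inv_N, S_inv_N) == zero)    # U' - I is nilpotent, so U' is unipotent
```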

Proof that SN (or Jordan-Chevalley) Decomposition is unique

You don’t need to know that $S$ and $N$ are polynomials in $M$ to prove the uniqueness of the Jordan–Chevalley decomposition.

Let $\lambda_1, \dotsc, \lambda_r$ be the pairwise different eigenvalues of the matrix $S$ and let $V_i$ be the eigenspace of $S$ with respect to the eigenvalue $\lambda_i$. Then $$ \mathbb{C}^n = V_1 \oplus \dotsb \oplus V_r \tag{1} $$ because $S$ is diagonalizable. Each eigenspace $V_i$ is $N$-invariant because the matrices $S$ and $N$ commute. It follows that each eigenspace $V_i$ is $M$-invariant because $M = S + N$.

Let $W_i$ be the generalized eigenspace of $M$ with respect to $\lambda_i$, i.e. $$ W_i = \ker (M - \lambda_i I)^m $$ for $m$ sufficiently large. We claim that $W_i = V_i$. Indeed, the matrices $M - \lambda_i I$ and $M - S$ act the same on $V_i$ because $S$ acts on $V_i$ by multiplication with $\lambda_i$. But $M - S = N$ is nilpotent. This means that $M - \lambda_i I$ acts nilpotently on $V_i$, which shows that $V_i \subseteq W_i$. It follows from $(1)$ that already $V_i = W_i$ because the linear subspaces $W_1, \dotsc, W_r$ of $\mathbb{C}^n$ are linearly independent (i.e. their sum is direct).

The generalized eigenspaces $W_i$ are uniquely determined by $M$. We have thus shown that the eigenspaces of $S$ are uniquely determined by $M$. But the matrix $S$ is uniquely determined by its eigenspaces because it is diagonalizable. We have thus shown the uniqueness of $S$. The uniqueness of $N$ now follows from $N = M - S$.
