[Math] Why are there multiple Jordan Blocks corresponding to the same eigenvalue

eigenvalues-eigenvectors, jordan-normal-form, linear-algebra, matrices

Though the title seems clear enough, I'd like to start with a discussion of how I personally came to derive the Jordan Normal Form, because my question is very specific to the details of my derivation.

Notation

To start, let $X$ be a finite dimensional vector space, $L(X)$ be the space of linear operators on $X$, and $A\in L(X)$. Let $\sigma(A) = \{\lambda_1,\ \cdots,\ \lambda_k\}$ be the spectrum of $A$. Now, we define

  • $d(\lambda)$ to be the geometric multiplicity of $\lambda$
  • $m(\lambda)$ to be the algebraic multiplicity of $\lambda$

Next, we denote the $k$th generalized eigenspace of $\lambda$ by
$$
\text{N}_k(\lambda) = \text{Ker}(A-\lambda I)^k
$$
and finally, we let
$$
\text{N}(\lambda) = \text{N}_{n(\lambda)}(\lambda)\qquad n(\lambda)=\min\{k\in\mathbb{N}\ |\ \text{N}_k(\lambda)=\text{N}_{k+1}(\lambda)\}
$$
We note that it can be shown that $n(\lambda) = m(\lambda)$, and so the notation $n(\lambda)$ won't really be used.

We will also let $\sum_\lambda$, $\prod_\lambda$, etc. represent the sum/product/etc. over distinct eigenvalues of $A$.

Fundamentals

First off, it is known that we can decompose $X$ as
$$
X = \text{N}(\lambda_1)\oplus\cdots\oplus\text{N}(\lambda_k)
$$
Hence $\sum_{\lambda} \dim\ \text{N}(\lambda) = \dim X$. Also, from the characteristic polynomial of $A$, the sum of the algebraic multiplicities of the eigenvalues must equal the degree of the polynomial, which is $\dim X$. Thus
$$
\sum_\lambda\dim\ \text{N}(\lambda) = \sum_\lambda m(\lambda) = \dim X
$$
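
For concreteness, here is a quick sympy check of this dimension count (the matrix below is just an illustrative choice, not part of the argument):

```python
import sympy as sp

# Illustrative 4x4 example with eigenvalues 2 and 3, each of algebraic multiplicity 2.
A = sp.Matrix([
    [2, 1, 0, 0],
    [0, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 0, 3],
])
n = A.shape[0]

total = 0
for lam, m_lam in A.eigenvals().items():
    # dim N(lambda): the kernels of (A - lambda I)^k have stabilized by k = n at the latest.
    dim_N = len(((A - lam * sp.eye(n)) ** n).nullspace())
    print(f"lambda = {lam}: m(lambda) = {m_lam}, dim N(lambda) = {dim_N}")
    total += dim_N

print("sum of dim N(lambda) =", total, " dim X =", n)   # both are 4
```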
Going in a different direction, we present the following theorem:

Theorem: If $B\in L(X)$ is nilpotent of order $n$, and $S\subset X\backslash\text{Ker} B^{n-1}$ is linearly independent, then
$$
\bigcup_{x\in S}\{x,\ Bx,\ B^2x,\ \cdots,\ B^{n-1}x\}
$$
is linearly independent.

Proof: We will show the case $|S|=2$; the general case follows the same format. Suppose $S = \{x_1,\ x_2\}$, and
$$
\sum_{k=0}^{n-1} a_k B^kx_1 + \sum_{k=0}^{n-1}b_k B^kx_2 = 0
$$
applying $B^{n-1}$ to both sides gives
$$
B^{n-1}\left(\sum_{k=0}^{n-1}\bigl(a_kB^kx_1+b_kB^kx_2\bigr)\right) = a_0B^{n-1}x_1+b_0B^{n-1}x_2 = B^{n-1}(a_0x_1+b_0x_2) = 0
$$
so $a_0x_1 + b_0x_2\in\text{Ker}B^{n-1}$. However, since $\text{Ker}B^{n-1}$ is a subspace of $X$, we can decompose $X$ as $X = \text{Ker}B^{n-1}\oplus Z$ for some subspace $Z$ with $\{x_1,\ x_2\}\subset Z\backslash\{0\}$. Since $Z$ is a subspace, $a_0x_1+b_0x_2\in Z$, so $a_0x_1+b_0x_2\in \text{Ker}B^{n-1}\cap Z = \{0\}$, i.e. $a_0x_1+b_0x_2 = 0$. By linear independence of $S$, $a_0=b_0=0$. This process can be repeated to get $a_j=b_j=0$ for all $j$. $\blacksquare$
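
As a small illustration of the theorem (not a substitute for the proof), take $B$ to be a $3\times 3$ nilpotent shift of order $3$ and $x\notin\text{Ker}B^2$; the chain $\{x,\ Bx,\ B^2x\}$ then has full rank:

```python
import sympy as sp

# B e1 = e2, B e2 = e3, B e3 = 0, so B^3 = 0 while B^2 != 0 (nilpotent of order 3).
B = sp.Matrix([
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
])

x = sp.Matrix([1, 0, 0])                      # B**2 * x = e3 != 0, so x is outside Ker B^2
chain = sp.Matrix.hstack(x, B * x, B**2 * x)  # columns x, Bx, B^2 x
print(chain.rank())                           # 3, so the chain is linearly independent
```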

Now, take $x\in \text{N}(\lambda)\backslash \text{N}_{m(\lambda)-1}(\lambda)$. Note that $B_\lambda = (A - \lambda I)|_{\text{N}(\lambda)}$ (that is, $A - \lambda I$ restricted to $\text{N}(\lambda)$) is nilpotent of order $m(\lambda)$. Hence $\{x,\ B_\lambda x,\ \cdots,\ B_\lambda^{m(\lambda)-1}x\}$ is linearly independent, and its span is a subspace of $\text{N}(\lambda)$. Hence $\dim \text{N}(\lambda) \ge m(\lambda)$.

If we suppose that $\dim\text{N}(\lambda) > m(\lambda)$ for at least one $\lambda\in\sigma(A)$, then we contradict the fact that $\sum_\lambda\dim\text{N}(\lambda) = \dim X$, and so we conclude that $m(\lambda) = \dim\text{N}(\lambda)$.

Alright, so far so good I hope…

Jordan Normal Form

By the above arguments, we conclude that $\text{Span}\{x,\ \cdots,\ B_\lambda^{m(\lambda)-1}x\} = \text{N}(\lambda)$. Hence, if we let $e_0(\lambda)\in \text{N}(\lambda)\backslash \text{N}_{m(\lambda)-1}(\lambda)$, and $e_k(\lambda)=(A-\lambda I)^k e_0(\lambda)$, then
$$
\text{Span}\left(\bigcup_{\lambda}\bigcup_{k=0}^{m(\lambda)-1}\{e_k(\lambda)\}\right) = X
$$

Since $X = \text{N}(\lambda_1)\oplus\cdots\oplus\text{N}(\lambda_k)$, and each $\text{N}(\lambda_i)$ is $A$-invariant (that is, $A(\text{N}(\lambda_i))\subseteq \text{N}(\lambda_i)$), it follows that if we have a basis for each $\text{N}(\lambda_i)$, then we get the following matrix representation of $A$ wrt the union of these bases:
$$
A = \left[\begin{matrix}
A|_{\text{N}(\lambda_1)} & O & \cdots & O \\
O & A|_{\text{N}(\lambda_2)} & \cdots & O \\
\vdots & \vdots & \ddots & \vdots \\
O & O & \cdots & A|_{\text{N}(\lambda_k)}
\end{matrix}\right]
$$
where $A|_{\text{N}(\lambda_i)}$ is the matrix representation of $A$ restricted to $\text{N}(\lambda_i)$ wrt the basis of $\text{N}(\lambda_i)$.
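
To see this block structure concretely, here is a small sympy computation (the matrix $A$ is an illustrative choice with $\sigma(A)=\{1,\ 2\}$): stacking bases of the generalized eigenspaces into a change-of-basis matrix $P$ makes $P^{-1}AP$ block diagonal.

```python
import sympy as sp

# Illustrative matrix with eigenvalues 1 (algebraic multiplicity 2) and 2 (multiplicity 1).
A = sp.Matrix([
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 2],
])
n = A.shape[0]

cols = []
for lam in sorted(A.eigenvals()):
    cols += ((A - lam * sp.eye(n)) ** n).nullspace()   # a basis of N(lambda)

P = sp.Matrix.hstack(*cols)
print(P.inv() * A * P)   # block diagonal: a 2x2 block for lambda = 1 and a 1x1 block [2]
```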

Above, we demonstrated that $\{e_{m(\lambda)-1}(\lambda),\ \cdots,\ e_1(\lambda),\ e_0(\lambda)\}$ is a basis for $\text{N}(\lambda)$. We can find a matrix representation for $A|_{\text{N}(\lambda)}$ by noting that
$$
Ae_k(\lambda) = A(A-\lambda I)^ke_0(\lambda) = (A-\lambda I)^{k+1}e_0(\lambda) + \lambda(A-\lambda I)^ke_0(\lambda) \\
Ae_k(\lambda) = e_{k+1}(\lambda)+\lambda e_k(\lambda) \\
Ae_{m(\lambda)-1}(\lambda) = \lambda e_{m(\lambda)-1}(\lambda)
$$
and so
$$
A|_{N(\lambda)} = \left[\begin{matrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \cdots & 0 \\
0 & 0 & \lambda & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & 1 \\
0 & 0 & 0 & \cdots & \lambda
\end{matrix}\right]
$$

These $A|_{N(\lambda)}$ are the Jordan Blocks, and the matrix representation of $A$ above is the Jordan Normal Form.
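
As a sanity check on this chain construction, here is a sympy computation on an illustrative $3\times 3$ matrix with a single eigenvalue $5$ and one chain of length $3$: writing $A$ in the basis $(e_2(\lambda),\ e_1(\lambda),\ e_0(\lambda))$ does give the upper-triangular Jordan block above.

```python
import sympy as sp

lam, m = 5, 3
# Illustrative matrix: single eigenvalue 5 with m(5) = 3 and a single chain of length 3.
A = sp.Matrix([
    [5, 1, 1],
    [0, 5, 1],
    [0, 0, 5],
])

# e_0 must lie in N(5) but outside Ker (A - 5I)^2; e_0 = (0, 0, 1) works here.
e0 = sp.Matrix([0, 0, 1])
chain = [(A - lam * sp.eye(m)) ** k * e0 for k in range(m)]   # e_0, e_1, e_2

P = sp.Matrix.hstack(*reversed(chain))   # columns e_2, e_1, e_0
print(P.inv() * A * P)                   # Matrix([[5, 1, 0], [0, 5, 1], [0, 0, 5]])
```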

Main Question

I'm pretty content with this derivation; nothing seems confusing, out of place, contradictory, or nonrigorous, at least at a surface level. I would not be asking this question if I hadn't gone to the Wikipedia page on the Jordan Normal Form and seen this line:

The number of Jordan blocks corresponding to $\lambda$ of size at least $j$ is $\dim \text{Ker}(A - \lambda I)^j - \dim \text{Ker}(A - \lambda I)^{j-1}$.

My "derivation" doesn't account for the fact that there can be multiple Jordan Blocks corresponding to the same eigenvalue. So, in the broadest sense possible, why? What don't I account for?

My idea was that I "assumed" that $\text{Span}\{x,\ \cdots,\ B_\lambda^{m(\lambda)-1}x\} = \text{N}(\lambda)$. If there are more elements in the basis for $\text{N}(\lambda)$ than this, then there are more Jordan blocks. But if $\dim\text{N}(\lambda)>m(\lambda)$, then the decomposition of $X$ into the direct sum of generalized eigenspaces fails, since the dimensions don't add up. My only other guess is that $\{x,\ \cdots,\ B_\lambda^{m(\lambda)-1}x\}$ can be "broken down" in some sense into the union of smaller bases which then produce more Jordan blocks, but I can't quite see where to go with that.

Any help would be appreciated. Thank you for your time!

Best Answer

There are several mistakes in what you wrote but the critical mistake is that you claim that $m(\lambda) = n(\lambda)$. It is not true that the index at which $\ker (A - \lambda I)^k$ stabilizes is the algebraic multiplicity of $\lambda$. For example, consider the nilpotent matrix

$$ A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. $$

The characteristic polynomial of $A$ is $x^3$ so $m(0) = 3$, while $A^2 = 0$ so $n(0) = 2$. This is the phenomenon which causes the appearance of several Jordan blocks: if you pick $x \in \mathbb{R}^3 \setminus \ker(A)$ (for example $x = e_2$), then $\{ x, Ax \}$ will be linearly independent, but $A^2x = 0$, so you don't have enough vectors to form a basis and you'll need to adjoin another block.
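
For what it's worth, this is easy to confirm with sympy (the snippet is just an illustration of the statements above): the characteristic polynomial is $x^3$, $A^2 = 0$, and the dimension differences from the Wikipedia formula show two blocks for $\lambda = 0$, one of size $2$ and one of size $1$.

```python
import sympy as sp

A = sp.Matrix([
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
])
x = sp.Symbol('x')

print(A.charpoly(x).as_expr())       # x**3, so m(0) = 3
print(A**2 == sp.zeros(3, 3))        # True, so n(0) = 2 < m(0)

# Wikipedia's count: number of blocks for lambda = 0 of size at least j
dims = [len((A**j).nullspace()) for j in range(4)]    # dim Ker A^j for j = 0, 1, 2, 3
print([dims[j] - dims[j - 1] for j in range(1, 4)])   # [2, 1, 0]: two blocks, one of size >= 2
```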