Doubt on finding generalized eigenvectors of a matrix

calculus, eigenvalues-eigenvectors, linear-algebra, matrices

I have a system with $A$ matrix given as $$A=\begin{pmatrix}
4 & 1 & -2 \\
1 & 0 & 2 \\
1 & -1 & 3
\end{pmatrix} $$
and I'm asked to find the Jordan canonical form of the given matrix. The first step is to find the eigenvectors; in this case the eigenvalues are $\lambda_1=1,\lambda_2=3,\lambda_3=3$, so we need a generalized eigenvector for the repeated eigenvalue $3$. I'm able to find it by following the general definition, i.e.
$$(A-\lambda I)^m V=0$$ and $$(A-\lambda I)^{m-1} V \neq 0$$
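For instance, here is my own quick check of this definition in Python (not from the book): the vector $(3,1,1)^T$ found later in the question satisfies it with $m=2$ for $\lambda=3$.

```python
# My own check (not from the book): V = (3, 1, 1)^T satisfies
# (A - 3I)^2 V = 0 while (A - 3I) V != 0, so it is a generalized
# eigenvector of rank m = 2 for the eigenvalue 3.

A = [[4, 1, -2],
     [1, 0,  2],
     [1, -1, 3]]

# N = A - 3I
N = [[A[i][j] - (3 if i == j else 0) for j in range(3)] for i in range(3)]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

V = [3, 1, 1]
print(mat_vec(N, V))              # (A - 3I) V   -> [2, 2, 2], nonzero
print(mat_vec(N, mat_vec(N, V)))  # (A - 3I)^2 V -> [0, 0, 0]
```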
But in the book they follow a different method. First they compute $\lambda_1 I-A$, i.e.
$$\lambda_1 I-A=\begin{pmatrix}
-3 & -1 & 2 \\
-1 & 1 & -2 \\
-1 & 1 & -2
\end{pmatrix}$$
Then, to find the eigenvectors, they calculated the row cofactors of $\lambda_1 I-A$. The cofactors along the first row all vanish, so they took the cofactors along the second row, i.e. $$v_1=\begin{pmatrix}
0 & 8 & 4
\end{pmatrix}^{T} $$

My first doubt is: why do the row cofactors give eigenvectors? What is the logic behind this?

Now to find the generalized eigen-vector first they found $$\lambda_2 I-A=\begin{pmatrix}
\lambda_2-4 & -1 & 2 \\
-1 & \lambda_2 & -2 \\
-1 & 1 & \lambda_2-3
\end{pmatrix}$$

Again they take the row cofactors; here the first row works out, so
\begin{align}v_2&=\begin{pmatrix}
\lambda_2(\lambda_2-3)+2& \lambda_2-3+2 & -1+\lambda_2
\end{pmatrix}^{T}\bigg|_{\lambda_2=3} \\&=\begin{pmatrix}
2 & 2 & 2
\end{pmatrix}^{T}\end{align}

And to complete the basis, the generalized eigenvector is found by taking the derivative of $v_2$ with respect to $\lambda_2$, i.e.
\begin{align}v_3 &=\begin{pmatrix}
\frac{d}{d\lambda_2} \left\{ \lambda_2(\lambda_2-3)+2\right\} \\
\frac{d}{d\lambda_2}\left\{ \lambda_2-3+2 \right\} \\
\frac{d}{d\lambda_2}\left\{-1+ \lambda_2 \right\}
\end{pmatrix}\Bigg|_{\lambda_2=3} \\&=\begin{pmatrix}
3 \\
1 \\
1
\end{pmatrix}\end{align}
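As a numerical sanity check (mine, not the book's), all three vectors can be verified in a few lines of Python: $v_1$ and $v_2$ are ordinary eigenvectors for $\lambda=1$ and $\lambda=3$, and $v_3$ satisfies the chain condition $(A-3I)v_3=v_2$.

```python
# Verify v1, v2 are eigenvectors and v3 is chained to v2 via (A - 3I)v3 = v2.

A = [[4, 1, -2],
     [1, 0,  2],
     [1, -1, 3]]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def shift(M, lam):
    # A - lam*I
    return [[M[i][j] - (lam if i == j else 0) for j in range(3)]
            for i in range(3)]

v1, v2, v3 = [0, 8, 4], [2, 2, 2], [3, 1, 1]

print(mat_vec(shift(A, 1), v1))  # [0, 0, 0]  -> (A - I)v1 = 0
print(mat_vec(shift(A, 3), v2))  # [0, 0, 0]  -> (A - 3I)v2 = 0
print(mat_vec(shift(A, 3), v3))  # [2, 2, 2]  -> (A - 3I)v3 = v2
```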

My second doubt is: why are they taking derivatives? How is taking derivatives related to repeated roots?

Best Answer

I’d not seen these techniques before, but now that I understand how they work, I’ll be adding them to my repertoire.

The first computation is pretty easy to understand. I’ll work here with the adjugate $\operatorname{adj}(M)$ of the square matrix $M$ (often called the “adjoint” in older sources), which is simply the transpose of the matrix of cofactors. A fundamental property of the adjugate is that $$M\operatorname{adj}(M) = \operatorname{adj}(M) M = \det(M)I.\tag1$$ Now, by construction $\det(\lambda I-A)=0$ when $\lambda$ is an eigenvalue of $A$, so in that case equation (1) says that every column of $\operatorname{adj}(\lambda I-A)$ lies in the null space of $\lambda I-A$; hence every nonzero column of this adjugate matrix is an eigenvector of $A$ with eigenvalue $\lambda$. Since the adjugate is the transpose of the cofactor matrix, the rows of cofactors computed in the book are exactly the columns of $\operatorname{adj}(\lambda I-A)$.
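Here is a small Python sketch of this fact for the matrix in the question (the helper names are my own): it builds $\operatorname{adj}(\lambda I-A)$ from cofactors and checks that $\lambda I - A$ annihilates each nonzero column.

```python
# For each eigenvalue lam of A, every nonzero column of
# adj(lam*I - A) lies in the null space of lam*I - A.

A = [[4, 1, -2],
     [1, 0,  2],
     [1, -1, 3]]

def adjugate3(M):
    # Transpose of the cofactor matrix of a 3x3 matrix.
    def cof(i, j):
        r = [k for k in range(3) if k != i]
        c = [k for k in range(3) if k != j]
        minor = M[r[0]][c[0]] * M[r[1]][c[1]] - M[r[0]][c[1]] * M[r[1]][c[0]]
        return (-1) ** (i + j) * minor
    return [[cof(j, i) for j in range(3)] for i in range(3)]

def shift(lam):
    # lam*I - A
    return [[lam * (i == j) - A[i][j] for j in range(3)] for i in range(3)]

for lam in (1, 3):
    B = adjugate3(shift(lam))
    for j in range(3):
        col = [B[i][j] for i in range(3)]
        if any(col):
            # (lam*I - A) @ col should be the zero vector
            res = [sum(shift(lam)[i][k] * col[k] for k in range(3))
                   for i in range(3)]
            print(lam, col, res)
```

Running this shows the nonzero columns for $\lambda=1$ are multiples of $(0,8,4)^T$ and those for $\lambda=3$ are multiples of $(2,2,2)^T$, matching the question.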

The computation that involves differentiation to find generalized eigenvectors is a bit trickier. It looks like it’s covered in a 1967 paper by Dzoković. I don’t have access to the entire paper, but the one page of the preview on the Springer site gives us most of what we need. Following Dzoković, let $f(\lambda)=\det(\lambda I-A)=(\lambda-\lambda_1)^{r_1}(\lambda-\lambda_2)^{r_2}\cdots$ be the characteristic polynomial of $A$ and $B(\lambda)=\operatorname{adj}(\lambda I-A)$, so that by (1), $$(\lambda I-A)B(\lambda) = f(\lambda)I, \tag2$$ which again says that every nonzero column of $B(\lambda_i)$ is an eigenvector of $A$.

Now view a column of $B(\lambda)$, say the first, $B_1(\lambda)$, as a vector of polynomials in $\lambda$. Dzoković shows that if $k_1\lt r_1$ is the highest power of $(\lambda-\lambda_1)$ that divides $B_1(\lambda)$, then differentiating $B_1(\lambda)$ $k_1$ times and setting $\lambda=\lambda_1$ produces $(\lambda_1I-A)B_1^{(k_1)}(\lambda_1)=0$, i.e., $B_1^{(k_1)}(\lambda_1)$ is an eigenvector of $A$. He then continues to differentiate: $$(\lambda_1I-A)B_1^{(k_1)}(\lambda_1)=0 \\ (\lambda_1 I-A)B_1^{(k_1+1)}(\lambda_1)+(k_1+1)B_1^{(k_1)}(\lambda_1)=0 \\ \vdots \\ (\lambda_1 I-A)B_1^{(r_1-1)}(\lambda_1)+(r_1-1)B_1^{(r_1-2)}(\lambda_1) = 0.$$ Setting $x_i = \frac1{(k_1+i-1)!}B_1^{(k_1+i-1)}(\lambda_1),$ this becomes $$(A-\lambda_1 I)x_1=0 \\ (A-\lambda_1 I)x_2=x_1 \\ \vdots \\ (A-\lambda_1 I)x_{r_1-k_1} = x_{r_1-k_1-1},$$ but this is none other than the definition of a chain of generalized eigenvectors for $\lambda_1$. In the question’s example, $r_1=2$ and $k_1=0$, so the chain is just $x_1=B_1(3)=v_2$ and $x_2=B_1'(3)=v_3$.
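To illustrate, here is a sketch of my own (with polynomials represented as plain coefficient lists rather than any particular library): it builds the first column of $B(\lambda)=\operatorname{adj}(\lambda I-A)$ symbolically, then evaluates it and its derivative at $\lambda=3$, reproducing the chain $v_2, v_3$ from the question.

```python
# Sketch of the differentiation trick. Polynomials in lambda are
# coefficient lists [c0, c1, ...]; we build the first column of
# B(lambda) = adj(lambda*I - A) symbolically, then x1 = B1(3) and
# x2 = B1'(3) form a Jordan chain for the eigenvalue 3.

A = [[4, 1, -2],
     [1, 0,  2],
     [1, -1, 3]]

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def psub(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def peval(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def pderiv(p):
    return [k * c for k, c in enumerate(p)][1:]

# Entries of M(lambda) = lambda*I - A as polynomials in lambda.
M = [[[-A[i][j]] + ([1] if i == j else []) for j in range(3)]
     for i in range(3)]

def cof(i, j):
    # Polynomial cofactor of entry (i, j) of M(lambda).
    r = [k for k in range(3) if k != i]
    c = [k for k in range(3) if k != j]
    minor = psub(pmul(M[r[0]][c[0]], M[r[1]][c[1]]),
                 pmul(M[r[0]][c[1]], M[r[1]][c[0]]))
    return minor if (i + j) % 2 == 0 else [-x for x in minor]

B1 = [cof(0, j) for j in range(3)]      # first column of adj(M(lambda))
x1 = [peval(p, 3) for p in B1]          # eigenvector:            [2, 2, 2]
x2 = [peval(pderiv(p), 3) for p in B1]  # generalized eigenvector: [3, 1, 1]
print(x1, x2)
```

Here $k_1=0$ (the column does not vanish at $\lambda=3$), so no preliminary differentiation is needed before starting the chain.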