Linear Algebra – Why Diagonalization of Matrix B with Basis of Commuting Matrix A Gives Block Diagonal Matrix

Tags: block matrices, diagonalization, eigenvalues-eigenvectors, linear algebra, matrices

I am trying to understand a proof concerning commuting matrices and their simultaneous diagonalization.

It seems to be a well-known result that when you take a basis of eigenvectors of $A$ and write $B$ in that basis, you get a block diagonal matrix:

$$B=
\begin{pmatrix}
B_{1} & 0 & \cdots & 0 \\
0 & B_{2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & B_{m}
\end{pmatrix},$$

where each $B_{i}$ is an $m_{g}(\lambda_{i}) \times m_{g}(\lambda_{i})$ block ($m_{g}(\lambda_{i})$ being the geometric multiplicity of $\lambda_{i}$).
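
For instance, here is a quick numerical sketch of the phenomenon (all matrices below are invented for illustration; $A$ and $B$ are built from the same orthogonal $P$, so they commute by construction, and $A$ is made symmetric only so that numpy's `eigh` returns a clean, real eigenbasis):

```python
import numpy as np

# Invented example: A has eigenvalue 2 (multiplicity 2) and 5 (multiplicity 1).
# C is block diagonal, so it commutes with D, hence B commutes with A.
rng = np.random.default_rng(1)
P, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # random orthogonal matrix
D = np.diag([2.0, 2.0, 5.0])
C = np.array([[1.0, 3.0, 0.0],
              [4.0, 1.0, 0.0],
              [0.0, 0.0, 7.0]])
A = P @ D @ P.T                          # symmetric, eigenvalues 2, 2, 5
B = P @ C @ P.T
assert np.allclose(A @ B, B @ A)         # A and B commute

w, V = np.linalg.eigh(A)                 # eigenvalues ascending: 2, 2, 5
print(np.round(V.T @ B @ V, 8))          # a 2x2 block, then a 1x1 block
```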

My question
Why is this so? I calculated an example and, lo and behold, it really works 🙂 But I don't understand how it works out so neatly.

Can you please explain this result to me in an intuitive, step-by-step manner? Thank you!

Best Answer

Suppose that $A$ and $B$ are matrices that commute. Let $\lambda$ be an eigenvalue for $A$, and let $E_{\lambda}$ be the eigenspace of $A$ corresponding to $\lambda$. Let $\mathbf{v}_1,\ldots,\mathbf{v}_k$ be a basis for $E_{\lambda}$.

I claim that $B$ maps $E_{\lambda}$ to itself; in particular, $B\mathbf{v}_i$ can be expressed as a linear combination of $\mathbf{v}_1,\ldots,\mathbf{v}_k$, for $i=1,\ldots,k$.

To show that $B$ maps $E_{\lambda}$ to itself, it is enough to show that $B\mathbf{v}_i$ lies in $E_{\lambda}$; that is, that if we apply $A$ to $B\mathbf{v}_i$, the result will be $\lambda(B\mathbf{v}_i)$. This is where the fact that $A$ and $B$ commute comes in. We have: $$A\Bigl(B\mathbf{v}_i\Bigr) = (AB)\mathbf{v}_i = (BA)\mathbf{v}_i = B\Bigl(A\mathbf{v}_i\Bigr) = B(\lambda\mathbf{v}_i) = \lambda(B\mathbf{v}_i).$$ Therefore, $B\mathbf{v}_i\in E_{\lambda}$, as claimed.
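
A quick numerical sanity check of this chain of equalities (a sketch with an arbitrary invented $A$; here $B$ is taken to be a polynomial in $A$, which is just one easy way to manufacture a matrix that commutes with $A$):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
B = A @ A + 3.0 * A                      # any polynomial in A commutes with A
assert np.allclose(A @ B, B @ A)

lam, V = np.linalg.eig(A)
v = V[:, 0]                              # eigenvector of A with eigenvalue lam[0]
# B v stays in the same eigenspace: applying A scales it by lam[0].
assert np.allclose(A @ (B @ v), lam[0] * (B @ v))
```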

So, now take the basis $\mathbf{v}_1,\ldots,\mathbf{v}_k$, and extend it to a basis for $\mathbf{V}$, $\beta=[\mathbf{v}_1,\ldots,\mathbf{v}_k,\mathbf{v}_{k+1},\ldots,\mathbf{v}_n]$. To find the coordinate matrix of $B$ relative to $\beta$, we compute $B\mathbf{v}_i$ for each $i$, write $B\mathbf{v}_i$ as a linear combination of the vectors in $\beta$, and then place the corresponding coefficients in the $i$th column of the matrix.

When we compute $B\mathbf{v}_1,\ldots,B\mathbf{v}_k$, each of these will lie in $E_{\lambda}$. Therefore, each of these can be expressed as a linear combination of $\mathbf{v}_1,\ldots,\mathbf{v}_k$ (since they form a basis for $E_{\lambda}$). So, to express them as linear combinations of $\beta$, we just add $0$s; we will have: $$\begin{align*} B\mathbf{v}_1 &= b_{11}\mathbf{v}_1 + b_{21}\mathbf{v}_2+\cdots+b_{k1}\mathbf{v}_k + 0\mathbf{v}_{k+1}+\cdots + 0\mathbf{v}_n\\ B\mathbf{v}_2 &= b_{12}\mathbf{v}_1 + b_{22}\mathbf{v}_2 + \cdots +b_{k2}\mathbf{v}_k + 0\mathbf{v}_{k+1}+\cdots + 0\mathbf{v}_n\\ &\vdots\\ B\mathbf{v}_k &= b_{1k}\mathbf{v}_1 + b_{2k}\mathbf{v}_2 + \cdots + b_{kk}\mathbf{v}_k + 0\mathbf{v}_{k+1}+\cdots + 0\mathbf{v}_n \end{align*}$$ where $b_{ij}$ are some scalars (some possibly equal to $0$). So the matrix of $B$ relative to $\beta$ would start off something like: $$\left(\begin{array}{ccccccc} b_{11} & b_{12} & \cdots & b_{1k} & * & \cdots & *\\ b_{21} & b_{22} & \cdots & b_{2k} & * & \cdots & *\\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots\\ b_{k1} & b_{k2} & \cdots & b_{kk} & * & \cdots & *\\ 0 & 0 & \cdots & 0 & * & \cdots & *\\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & * & \cdots & * \end{array}\right).$$
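
To see this "one eigenspace at a time" shape concretely, here is a numpy sketch (the matrices are invented; the columns of $P$ are eigenvectors of $A$ by construction, and the extension vector is deliberately not an eigenvector):

```python
import numpy as np

P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])          # invertible; first two columns span E_2
D = np.diag([2.0, 2.0, 5.0])
C = np.array([[1.0, 3.0, 0.0],
              [4.0, 1.0, 0.0],
              [0.0, 0.0, 7.0]])          # block diagonal, so C commutes with D
Pinv = np.linalg.inv(P)
A, B = P @ D @ Pinv, P @ C @ Pinv

v1, v2 = P[:, 0], P[:, 1]                # basis of the eigenspace E_2
v3 = np.array([1.0, 0.0, 0.0])           # arbitrary extension vector, not an eigenvector
beta = np.column_stack([v1, v2, v3])
print(np.round(np.linalg.inv(beta) @ B @ beta, 8))
# The first two columns are zero below row 2 (B maps E_2 into E_2);
# the third column, the * entries, is in general not zero anywhere.
```

Note that only the lower-left block is forced to vanish at this stage; the starred columns are unconstrained because $\mathbf{v}_{k+1},\ldots,\mathbf{v}_n$ were chosen arbitrarily.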

So, now suppose that you have a basis for $\mathbf{V}$ that consists entirely of eigenvectors of $A$; let $\beta=[\mathbf{v}_1,\ldots,\mathbf{v}_n]$ be this basis, with $\mathbf{v}_1,\ldots,\mathbf{v}_{m_1}$ corresponding to $\lambda_1$ (with $m_1$ the algebraic multiplicity of $\lambda_1$, which equals the geometric multiplicity of $\lambda_1$ since $A$ is diagonalizable); $\mathbf{v}_{m_1+1},\ldots,\mathbf{v}_{m_1+m_2}$ the eigenvectors corresponding to $\lambda_2$, and so on until we get to $\mathbf{v}_{m_1+\cdots+m_{k-1}+1},\ldots,\mathbf{v}_{m_1+\cdots+m_k}$ corresponding to $\lambda_k$. Note that $\mathbf{v}_{1},\ldots,\mathbf{v}_{m_1}$ are a basis for $E_{\lambda_1}$; that $\mathbf{v}_{m_1+1},\ldots,\mathbf{v}_{m_1+m_2}$ are a basis for $E_{\lambda_2}$, etc.

By what we just saw, each of $B\mathbf{v}_1,\ldots,B\mathbf{v}_{m_1}$ lies in $E_{\lambda_1}$, and so when we express them as linear combinations of vectors in $\beta$, the only vectors with nonzero coefficients are $\mathbf{v}_1,\ldots,\mathbf{v}_{m_1}$, because they are a basis for $E_{\lambda_1}$. So in the first $m_1$ columns of $[B]_{\beta}^{\beta}$ (the coordinate matrix of $B$ relative to $\beta$), the only nonzero entries occur in the first $m_1$ rows.

Likewise, each of $B\mathbf{v}_{m_1+1},\ldots,B\mathbf{v}_{m_1+m_2}$ lies in $E_{\lambda_2}$, so when we express them as linear combinations of $\beta$, the only places where you can have nonzero coefficients are the coefficients of $\mathbf{v}_{m_1+1},\ldots,\mathbf{v}_{m_1+m_2}$. So the $(m_1+1)$st through $(m_1+m_2)$th columns of $[B]_{\beta}^{\beta}$ can only have nonzero entries in the $(m_1+1)$st through $(m_1+m_2)$th rows. And so on.

That means that $[B]_{\beta}^{\beta}$ is in fact block-diagonal, with the blocks corresponding to the eigenspaces $E_{\lambda_i}$ of $A$, exactly as described.
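
Here is a final numpy sketch of the whole statement (again with invented commuting matrices; $A$ is built symmetric purely so that `numpy.linalg.eigh` returns a real eigenbasis already grouped by eigenvalue):

```python
import numpy as np

# Invented example: eigenvalues 2 and 5, each with geometric multiplicity 2,
# so [B]_beta should have two 2x2 diagonal blocks.
rng = np.random.default_rng(0)
P, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal matrix
D = np.diag([2.0, 2.0, 5.0, 5.0])
C = np.zeros((4, 4))
C[:2, :2] = rng.standard_normal((2, 2))  # block diagonal => C commutes with D
C[2:, 2:] = rng.standard_normal((2, 2))
A, B = P @ D @ P.T, P @ C @ P.T
assert np.allclose(A @ B, B @ A)         # A and B commute

w, V = np.linalg.eigh(A)                 # ascending: basis of E_2, then E_5
B_beta = V.T @ B @ V                     # coordinate matrix of B relative to beta
assert np.allclose(B_beta[2:, :2], 0.0)  # lower-left block vanishes
assert np.allclose(B_beta[:2, 2:], 0.0)  # upper-right block vanishes
print(np.round(B_beta, 8))
```

The two `assert`s at the end are exactly the content of the result: in a basis of eigenvectors of $A$ grouped by eigenvalue, the off-diagonal blocks of $[B]_{\beta}^{\beta}$ vanish.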
