Suppose that $A$ and $B$ are matrices that commute. Let $\lambda$ be an eigenvalue for $A$, and let $E_{\lambda}$ be the eigenspace of $A$ corresponding to $\lambda$. Let $\mathbf{v}_1,\ldots,\mathbf{v}_k$ be a basis for $E_{\lambda}$.
I claim that $B$ maps $E_{\lambda}$ to itself; in particular, $B\mathbf{v}_i$ can be expressed as a linear combination of $\mathbf{v}_1,\ldots,\mathbf{v}_k$, for $i=1,\ldots,k$.
To show that $B$ maps $E_{\lambda}$ to itself, it is enough to show that $B\mathbf{v}_i$ lies in $E_{\lambda}$; that is, that if we apply $A$ to $B\mathbf{v}_i$, the result will be $\lambda(B\mathbf{v}_i)$. This is where the fact that $A$ and $B$ commute comes in. We have:
$$A\Bigl(B\mathbf{v}_i\Bigr) = (AB)\mathbf{v}_i = (BA)\mathbf{v}_i = B\Bigl(A\mathbf{v}_i\Bigr) = B(\lambda\mathbf{v}_i) = \lambda(B\mathbf{v}_i).$$
Therefore, $B\mathbf{v}_i\in E_{\lambda}$, as claimed.
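A quick numerical sanity check of this claim, using hypothetical example matrices (taking $B$ to be a polynomial in $A$, which guarantees $AB = BA$):

```python
import numpy as np

# Two commuting matrices: any polynomial in A commutes with A.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
B = A @ A + 4 * np.eye(3)          # B = A^2 + 4I, so AB = BA

assert np.allclose(A @ B, B @ A)

# The eigenspace of A for lambda = 3 is spanned by e_3.
lam = 3.0
v = np.array([0.0, 0.0, 1.0])
assert np.allclose(A @ v, lam * v)

# B v stays in E_lambda: A(Bv) = lambda (Bv).
w = B @ v
assert np.allclose(A @ w, lam * w)
print("B maps E_lambda into itself")
```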
So, now take the basis $\mathbf{v}_1,\ldots,\mathbf{v}_k$, and extend it to a basis for $\mathbf{V}$, $\beta=[\mathbf{v}_1,\ldots,\mathbf{v}_k,\mathbf{v}_{k+1},\ldots,\mathbf{v}_n]$. To find the coordinate matrix of $B$ relative to $\beta$, we compute $B\mathbf{v}_i$ for each $i$, write $B\mathbf{v}_i$ as a linear combination of the vectors in $\beta$, and then place the corresponding coefficients in the $i$th column of the matrix.
When we compute $B\mathbf{v}_1,\ldots,B\mathbf{v}_k$, each of these will lie in $E_{\lambda}$. Therefore, each of these can be expressed as a linear combination of $\mathbf{v}_1,\ldots,\mathbf{v}_k$ (since they form a basis for $E_{\lambda}$). So, to express them as linear combinations of $\beta$, we just add $0$s; we will have:
$$\begin{align*}
B\mathbf{v}_1 &= b_{11}\mathbf{v}_1 + b_{21}\mathbf{v}_2+\cdots+b_{k1}\mathbf{v}_k + 0\mathbf{v}_{k+1}+\cdots + 0\mathbf{v}_n\\
B\mathbf{v}_2 &= b_{12}\mathbf{v}_1 + b_{22}\mathbf{v}_2 + \cdots +b_{k2}\mathbf{v}_k + 0\mathbf{v}_{k+1}+\cdots + 0\mathbf{v}_n\\
&\vdots\\
B\mathbf{v}_k &= b_{1k}\mathbf{v}_1 + b_{2k}\mathbf{v}_2 + \cdots + b_{kk}\mathbf{v}_k + 0\mathbf{v}_{k+1}+\cdots + 0\mathbf{v}_n
\end{align*}$$
where $b_{ij}$ are some scalars (some possibly equal to $0$). So the matrix of $B$ relative to $\beta$ would start off something like:
$$\left(\begin{array}{ccccccc}
b_{11} & b_{12} & \cdots & b_{1k} & * & \cdots & *\\
b_{21} & b_{22} & \cdots & b_{2k} & * & \cdots & *\\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots\\
b_{k1} & b_{k2} & \cdots & b_{kk} & * & \cdots & *\\
0 & 0 & \cdots & 0 & * & \cdots & *\\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & * & \cdots & *
\end{array}\right).$$
So, now suppose that you have a basis for $\mathbf{V}$ that consists entirely of eigenvectors of $A$; let $\beta=[\mathbf{v}_1,\ldots,\mathbf{v}_n]$ be this basis, with $\mathbf{v}_1,\ldots,\mathbf{v}_{m_1}$ corresponding to $\lambda_1$ (with $m_1$ the algebraic multiplicity of $\lambda_1$, which equals the geometric multiplicity of $\lambda_1$); $\mathbf{v}_{m_1+1},\ldots,\mathbf{v}_{m_1+m_2}$ the eigenvectors corresponding to $\lambda_2$, and so on until we get to $\mathbf{v}_{m_1+\cdots+m_{k-1}+1},\ldots,\mathbf{v}_{m_1+\cdots+m_k}$ corresponding to $\lambda_k$. Note that $\mathbf{v}_{1},\ldots,\mathbf{v}_{m_1}$ are a basis for $E_{\lambda_1}$; that $\mathbf{v}_{m_1+1},\ldots,\mathbf{v}_{m_1+m_2}$ are a basis for $E_{\lambda_2}$, etc.
By what we just saw, each of $B\mathbf{v}_1,\ldots,B\mathbf{v}_{m_1}$ lies in $E_{\lambda_1}$, and so when we express it as a linear combination of vectors in $\beta$, the only vectors with nonzero coefficients are $\mathbf{v}_1,\ldots,\mathbf{v}_{m_1}$, because they are a basis for $E_{\lambda_1}$. So in the first $m_1$ columns of $[B]_{\beta}^{\beta}$ (the coordinate matrix of $B$ relative to $\beta$), the only nonzero entries occur in the first $m_1$ rows.
Likewise, each of $B\mathbf{v}_{m_1+1},\ldots,B\mathbf{v}_{m_1+m_2}$ lies in $E_{\lambda_2}$, so when we express them as linear combinations of $\beta$, the only places where you can have nonzero coefficients are in the coefficients of $\mathbf{v}_{m_1+1},\ldots,\mathbf{v}_{m_1+m_2}$. So the $(m_1+1)$st through $(m_1+m_2)$th columns of $[B]_{\beta}^{\beta}$ can only have nonzero entries in the $(m_1+1)$st through $(m_1+m_2)$th rows. And so on.
That means that $[B]_{\beta}^{\beta}$ is in fact block-diagonal, with the blocks corresponding to the eigenspaces $E_{\lambda_i}$ of $A$, exactly as described.
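This block-diagonal structure is easy to see numerically. A small sketch with hypothetical matrices (not from the original): build commuting $A$, $B$ in a hidden basis, take a basis of eigenvectors of $A$, and check that $[B]_{\beta}^{\beta}$ is block-diagonal along the eigenspaces of $A$:

```python
import numpy as np

rng = np.random.default_rng(0)

# A0 has eigenvalue 1 (multiplicity 2) and eigenvalue 2; B0 preserves
# those eigenspaces, so A0 and B0 commute.
A0 = np.diag([1.0, 1.0, 2.0])
B0 = np.array([[1.0, 2.0, 0.0],
               [3.0, 4.0, 0.0],
               [0.0, 0.0, 5.0]])

# Hide the structure with a random change of basis.
S = rng.normal(size=(3, 3))
A = S @ A0 @ np.linalg.inv(S)
B = S @ B0 @ np.linalg.inv(S)
assert np.allclose(A @ B, B @ A)

# Basis beta of eigenvectors of A, grouped by eigenvalue.
evals, P = np.linalg.eig(A)
order = np.argsort(evals.real)
P = P[:, order]

# In this basis, B is block-diagonal: a 2x2 block and a 1x1 block.
M = np.linalg.inv(P) @ B @ P
assert np.max(np.abs(M[2, :2])) < 1e-8
assert np.max(np.abs(M[:2, 2])) < 1e-8
print("[B]_beta is block-diagonal along the eigenspaces of A")
```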
I will answer your question just for the cases $N = 2$ and $N = 3$:
Let
$$ B_2 = \left(
\begin{array}{cccc}
0 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 \\
0 & 1 & 1 & 0
\end{array}
\right), \quad B_3 = \left(
\begin{array}{cccccc}
0 & 1 & 1 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 0 & 1 & 1 \\
1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 1 \\
0 & 1 & 1 & 1 & 1 & 0
\end{array}
\right), $$
then $\operatorname{Spec}(B_2) = (-2,2,0,0)$ and $\operatorname{Spec}(B_3) = (-2,-2,0,0,0,4)$.
With the help of numerics, I've been able to verify (for all the values of $N$ I tested) that the characteristic polynomial is given by:
$$\color{blue}{p(\lambda) =\det{(B_N - \lambda I_{2N})} = (\lambda - 2N +2)(\lambda+2)^{N-1} \lambda^N }$$
which tells you that the only eigenvalues of matrices of this kind are $-2$, $0$, and $2N-2$, with the corresponding multiplicities given by $p(\lambda)$.
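The displayed $B_2$ and $B_3$ suggest (this is my reading of the pattern, not stated explicitly above) that $B_N = J_{2N} - I_{2N} - P_{2N}$, where $J$ is the all-ones matrix and $P$ is the exchange (anti-identity) matrix. Under that assumption, the conjectured spectrum checks out numerically:

```python
import numpy as np

def b_matrix(n):
    """Assumed form B_N = J - I - P (all-ones minus identity minus
    exchange matrix), inferred from the displayed B_2 and B_3."""
    m = 2 * n
    return np.ones((m, m)) - np.eye(m) - np.fliplr(np.eye(m))

for n in range(2, 31):
    evals = np.sort(np.linalg.eigvalsh(b_matrix(n)))
    expected = np.sort(np.concatenate([
        np.full(n - 1, -2.0),      # (lambda+2)^(N-1)
        np.zeros(n),               # lambda^N
        [2.0 * n - 2.0],           # (lambda - 2N + 2)
    ]))
    assert np.allclose(evals, expected, atol=1e-9)
print("spectrum matches (lambda-2N+2)(lambda+2)^(N-1) lambda^N for N=2..30")
```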
Here is an animation showing the spectrum of the matrices $B_N$ for $N = 2, \ldots, 30$:
Here's the same approach for the case where the $B_N$ matrices are defined as:
$$B_N = \begin{bmatrix} C_N & A_N \\ A_N & C_N \end{bmatrix},$$
then:
$$\color{blue}{p(\lambda) =\det{(B_N - \lambda I_{2N})} = \left\{ \begin{array}{ll}
\left(\lambda - 2N + 2\right) (\lambda-2)^{N/2}(\lambda+2)^{(N-2)/2} \lambda^{N} & N \text{ even} \\
\left(\lambda - 2N + 2\right) (\lambda-2)^{(N-1)/2}(\lambda+2)^{(N-1)/2} \lambda^{N} & N \text{ odd}
\end{array}\right.}$$
which tells you that the only eigenvalues of matrices of this kind are $-2$, $0$, $2$, and $2N-2$, with the corresponding multiplicities given by $p(\lambda)$.
Here is another animation showing the spectrum of the matrices $B_N$ for $N = 2, \ldots, 30$:
Pretty cool!
Hope somebody can shed some light on these results.
Cheers!
Best Answer
First, let's review the problem where $A,B$ are merely scalars. Then we can express our matrix as $$M=(a-b)I_N+buu^\top$$ where $u$ is a vector of 1s. We take advantage of this by noting that the eigenbasis of the rank-one matrix $uu^\top$ is simply $u_1:=u$ itself (with eigenvalue $N$) and the $N-1$ eigenvectors $u_2,u_3,\ldots u_N$ where $u_k:=e_{k}-e_{1}$ (with eigenvalue $0$). Note that the latter vectors are all orthogonal to $u$.
To make this more explicit, we introduce $$U=\begin{pmatrix} u_1 & u_2 & \cdots & u_N\end{pmatrix} = \begin{pmatrix} 1 & -1 & -1 & \cdots & -1 \\ 1 & 1 & 0 &\cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & 0 & \cdots & 1 \end{pmatrix}$$ and observe that $$uu^\top U=\begin{pmatrix} u(u^\top u_1) & u(u^\top u_2)& \cdots& u(u^\top u_N)\end{pmatrix}=\begin{pmatrix} Nu & 0 & \cdots & 0\end{pmatrix}=U \Lambda$$ where $\Lambda=\operatorname{diag}(N,0,\ldots,0)$. Hence $uu^\top$ is diagonalized as $uu^\top=U\Lambda U^{-1}$, and therefore $M$ is as well:
\begin{align} U^{-1}MU &=(a-b)I_N+ b U^{-1}uu^\top U \\ &=(a-b)I_N+b U^{-1}U\Lambda \\ &= (a-b)I_N+ b \Lambda\\ &= \operatorname{diag}(a+(N-1)b,a-b,\ldots, a-b) \end{align} in agreement with Mathematica's proposition.
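This scalar-case diagonalization can be checked numerically; a minimal sketch with illustrative values of $a$, $b$, $N$ (my choice, not from the original):

```python
import numpy as np

N, a, b = 5, 3.0, 0.5          # illustrative values

u = np.ones((N, 1))
M = (a - b) * np.eye(N) + b * (u @ u.T)

# U has u as its first column and e_k - e_1 (k = 2..N) as the rest.
U = np.eye(N)
U[:, 0] = 1.0
U[0, 1:] = -1.0

D = np.linalg.inv(U) @ M @ U
expected = np.diag([a + (N - 1) * b] + [a - b] * (N - 1))
assert np.allclose(D, expected)
print("U^{-1} M U = diag(a+(N-1)b, a-b, ..., a-b)")
```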
That's fine for the case of scalar $A,B$, but what about 2-by-2 matrices? To deal with this, we write $M$ using the Kronecker product $\otimes$ as $$M=I_N\otimes (A-B)+ uu^\top\otimes B.$$ We can now diagonalize $M$ as before, but now using the matrix
$$U\otimes I_2= \begin{pmatrix} I_2 & -I_2 & -I_2 & \cdots & -I_2 \\ I_2 & I_2 & 0_2 &\cdots & 0_2\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ I_2 & 0_2 & 0_2 & \cdots & I_2 \end{pmatrix}$$
where $0_2$ is a 2-by-2 matrix of zeros. We have
\begin{align} (U\otimes I_2)^{-1}M(U\otimes I_2) &=(U^{-1} \otimes I_2)\Big[I_N\otimes (A-B)+uu^\top \otimes B\Big ](U \otimes I_2)\\ &=(U^{-1}I_N U)\otimes(I_2(A-B)I_2)+(U^{-1}uu^\top U)\otimes(I_2 B I_2)\\ &=I_N\otimes (A-B)+\Lambda \otimes B \end{align} which is exactly the block-diagonal form given by Mathematica. So we can indeed block-diagonalize $M$ in the way proposed by Mathematica, in a way that's entirely analogous to the construction for scalar $A,B$. (Indeed, it's straightforward to do this for square matrices $A,B$ of arbitrary dimension.)
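The Kronecker-product computation above can also be verified numerically; a sketch with random hypothetical 2-by-2 blocks $A$, $B$:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4
A = rng.normal(size=(2, 2))     # hypothetical 2x2 blocks
B = rng.normal(size=(2, 2))

u = np.ones((N, 1))
M = np.kron(np.eye(N), A - B) + np.kron(u @ u.T, B)

# Same U as in the scalar case, tensored with I_2.
U = np.eye(N)
U[:, 0] = 1.0
U[0, 1:] = -1.0
W = np.kron(U, np.eye(2))       # U tensor I_2

Lam = np.zeros((N, N))
Lam[0, 0] = N                   # Lambda = diag(N, 0, ..., 0)
block_diag = np.kron(np.eye(N), A - B) + np.kron(Lam, B)

assert np.allclose(np.linalg.inv(W) @ M @ W, block_diag)
print("(U kron I2)^{-1} M (U kron I2) = I_N kron (A-B) + Lambda kron B")
```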
A closing observation: While the first column of $U$ was determined explicitly (up to a multiplicative constant), the remaining columns were chosen to be $N-1$ vectors orthogonal to $u$. These were far from unique: Not only would any permutation suffice, but we could also have picked, for instance, the vectors $$e_1-e_2,e_2-e_3,\ldots,e_{N-1}-e_N.$$ As such, there's no one way to diagonalize $M$ in the desired form. The form chosen above is simply what seemed easiest to write out explicitly.