[Math] Fastest way to solve linear system with block symmetric banded/Toeplitz matrix

linear algebramatricesnumerical linear algebranumerical methods

I have a matrix of the following form:

enter image description here

The size of the matrix may grow to be large, but the general pattern of being blockwise symmetric and banded (with 5 bands) will always hold. What is the fastest way to go about solving a system with a matrix like this?

Edit: Based on Carl's suggestion, I treated the matrix as a $2\times2$ five-banded block-diagonal matrix:

$$
A = \begin{pmatrix}A_1 & 0 \\ 0 & A_2 \end{pmatrix} \,.
$$

Applying the reverse symmetric Cuthill-McKee algorithm reorganizes each banded submatrix into a block-diagonal matrix, each with bandwidth 5 and dense within the bands. Example for $A_1$ from the above matrix:

block diagonal banded submatrix

I'm not familiar with solving banded matrices efficiently. It seems like there is now a lot of structure here that I can leverage, but I'm not entirely sure how.

Best Answer

For each diagonal block $A$, it seems that you have a fixed small distance between your nonzero diagonals. You have a nonzero diagonal, a n nonzero 5th super/subdiagonal and a nonzero 10th super/subdiagonal. The fact that the spacing between the nonzero diagonals is critial.

You will be able to do a symmetric permutation $PAP^T$ of each diagonal block $A$ into a block diagonal matrix with many small blocks on the main diagonal. Here $P$ is a permutation matrix. These small blocks will be banded and dense within the band. I recommend the symmetric reverse Cuthill-McKee algorithm for this kind of problem. There is an implementation in MATLAB (symrcm) and there are free C implementations available online.

At first it may seem astonishing that your diagonal blocks decouple into smaller disjoint blocks. To gain an understanding of this, I recommend that you first consider a 10 by 10 matrix with a nonzero diagonal, a nonzero 2nd superdiagonal and a nonzero 2nd subdiagonal. Applying the permutation $q = (9,7,5,3,1,10,8,6,4,2)$ to the rows and columns will give you two diagonal blocks of dimension 5 which are tridiagonal. The permutation is written in MATLAB format and you should interpret it as follows: put row 9 as row 1, put row 7 as row 2, put row 5 as row 3, etc. and similarly for the columns.

This is not a bad place to start (as it will dramatically simplify your problem and expose massive parallelism and give you small banded systems to solve), but you may want to look into graph reorderings and the elimination game for sparse solvers in general.

I hope this helps.

EDIT: Once you have obtained small banded diagonal blocks, then we can begin to discuss which algorithm is the fastest as it depends very much on the system which you are programming for, but obtaining good data locality is critical on any architecture. It is possible that your blocks will be so small that you can not do vector operations (SIMD) in the usual manner and that you will have to interleave the data representing different blocks, but that is doable.

Related Question