[Math] How to compute amount of floating point operations for LU-decomposition of banded matrix

floating pointgaussian eliminationlu decompositionnumerical methodssparse matrices

I want to compute the amount of floating point operations, flops, needed for the LU-decomposition/factorization of a banded matrix A consisting of 5 nonzero diagonals.

Matrix $A\in\mathbb{R}^{n \times n}$ has nonzero diagonals $-p, -1, 0, 1$ and $p$, thus the highest and lowest bandwidth are $p$ and finally $p$ equals $\sqrt{n}$. In the literature, the amount of flops for a full matrix $A$ is determined as following:
$$
2\sum^{n-1}_{k=1}(n-k)(n-k-1)=2\sum^{n-1}_{l=1}l(l-1)=\frac{2}{3}n^{3}+\mathcal{O}(n^{2})~\mbox{flops}.
$$

Here, the $n-k$ flops for all $n-1$ Gaussian transformations are summed. However, if we know that the maximum bandwidth is $p$ we can take in account only $2p$ flops for the first $n-p$ Gaussian transformations and after that $n-p-1$, $n-p-2$, $…$ flops for the last $p$ Gaussian transformations. This way the above equation can be altered into
$$
2\sum^{n-p}_{k=1}(p)(n-k-1)+2\sum^{p-1}_{k=1}(p-k)(p-k-1)~\mbox{flops}
$$

This would equal (using the first equation for the second summation over $p$)
$$
2p\big(\sum^{n-p}_{k=1}(n-1)-\sum_{k=1}^{n-p}k\big)+\frac{2}{3}p^{3}+\mathcal{O}(p^{2})=2p(n-p)(n-1)-2p(n-p)+\frac{2}{3}p^{3}+\mathcal{O}(p^{2})~\mbox{flops}.
$$

Now, here I get confused, because from the internet I find very simple expressions for the amount of flops for banded matrices with highest and lowest bandwidth equal to $p$, namely $np(4p+3)$ flops, which is even a linear dependency.

Finally, I use that $p=\sqrt{n}$ and find an expression, which is just slightly less dependent on the size of matrix $A\in\mathbb{R}^{n\times n}$
$$
2n^{2}\sqrt{n}+\mathcal{O}(n^{2})~\mbox{flops}.
$$

Now, the expression I find does not at all correspond to other findings on the internet, but I cannot find, where my reasoning is wrong, can anybody help me?

Best Answer

Each pivot (except the last $p$) updates the next $p$ rows, which have lengths $p+2$ to $2p+1$, and each of these elements is updated once, so the number of flops is, for one pivot (counting only fused multiply-adds):

$$\sum_{k=1}^p p+k+1=p^2+\frac{p(p+1)}{2}+p=\frac32p(p+1)$$

The last $k$th pivot for $k\le p$ has shorter rows and less rows to update. It has precisely $k-1$ rows to update, with lengths $k$ each, so the number of flops for each of these pivots is $k(k-1)$

The total number of flops is then

$$\frac23(n-p-1)p(p+1)+\sum_{k=1}^{p}k(k-1)\\ =\frac23(n-p-1)p(p+1)+2\sum_{k=1}^{p}{k\choose 2}\\ =\frac23(n-p-1)p(p+1)+2{p+1\choose 3}\\ =\frac23(n-p-1)p(p+1)+\frac13p(p+1)(p-1)\\ =\frac13p(p+1)(2n-2p-2+p-1)\\ =\frac13p(p+1)(2n-p-3)$$

With $p=\sqrt{n}$, that's $O(n^2)$.

The exact number of flops is actually slightly smaller if you do carefully the first $p$ pivots, which have some rows with zero entries. I have also counted the elements that are zeroed by each pivot (the column below the pivot), and usually they are not computed.

Related Question