A numerical note in my linear algebra text states the following: "In general, the forward phase of row reduction takes much longer than the backward phase. An algorithm for solving a system is usually measured in flops (or floating point operations). A flop is one arithmetic operation $(+,-,*,/)$ on two real floating point numbers. For an $n\times(n+1)$ matrix, the reduction to echelon form can take
$$
\frac{2}{3}n^3+\frac{1}{2}n^2-\frac{7}{6}n\tag{1}
$$
flops (which is approximately $2n^3/3$ flops when $n$ is moderately large–say, $n\geq 30$). In contrast, further reduction to reduced echelon form needs at most $n^2$ flops."
There is no explanation at all how the author came up with the expression in $(1)$. There is obviously a reason for that expression and the $n^2$ mentioned at the end of the quote. Can someone with knowledge of linear algebra or computer architecture, etc., explain why these expressions are the way there? It seems like the author just pulled them out of nowhere.
Best Answer
The operations involved at the $k$th step of the transformation to REF, where $k=1,\ldots,n-1$, are: for each row $l=k+1,\ldots,n$,
(We do Item 1 and Item 2 for rows $k+1,\ldots,n$, so the numbers for both items must be multiplied by $n-k$.) This gives the number of flops at step $k$ $$ (n-k)[1+2(n-k+1)]=2k^2-(4n+3)k+n(2n+3) $$ and the total number of flops $$ \sum_{k=1}^{n-1}[2k^2-(4n+3)k+n(2n+3)]=\frac{2}{3}n^3+\frac{1}{2}n^2-\frac{7}{6}n. $$
The operations involved at the $k$th step of the transformation of REF to RREF, where $k=n,n-1,\ldots,1$, are:
Hence, for the step $k$, we have $$ 1+2(k-1)=2k-1 $$ operations and total $$ \sum_{k=1}^n(2k-1)=n^2. $$
NOTE It is useful to recall the sums $$ \sum_{k=1}^n k^2 = \frac{1}{6}n(n+1)(2n+1), \quad \sum_{k=1}^n k = \frac{1}{2}n(n+1). $$