[Math] Conceptual explanation of Strassen’s trick for matrix multiplication

algorithms, noncommutative-rings

Algorithms for fast multiplication of polynomials and integers have well-known conceptual explanations. A good survey paper is Daniel J. Bernstein's Fast Multidigit Multiplication for Mathematicians.

For example, Karatsuba's trick and FFT-based multiplication both fit into the evaluate-and-interpolate scheme. These rely on the fact that evaluation at a point is a ring homomorphism. Karatsuba's trick cheaply evaluates the product of two linear polynomials at 0, 1 and $\infty$ without multiplying them out explicitly and then interpolates the values to get the polynomial product. Similarly, the n-point FFT efficiently evaluates a polynomial at the nth roots of unity, and the inverse FFT efficiently interpolates a polynomial from its values at those same points.
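To make the evaluate-and-interpolate view concrete, here is a minimal sketch (the function name is mine, not from any reference) of Karatsuba's trick for two linear polynomials, using three multiplications instead of four:

```python
def karatsuba_linear(a0, a1, b0, b1):
    """Multiply (a0 + a1*x) * (b0 + b1*x) with 3 multiplications,
    by evaluating the product at 0, 1, and infinity and interpolating."""
    p0 = a0 * b0                # product evaluated at x = 0
    p1 = (a0 + a1) * (b0 + b1)  # product evaluated at x = 1
    pinf = a1 * b1              # "value at infinity": the leading coefficient
    # Interpolate c0 + c1*x + c2*x^2 from the three values.
    return (p0, p1 - p0 - pinf, pinf)
```

For example, `karatsuba_linear(2, 3, 4, 5)` returns `(8, 22, 15)`, the coefficients of $(2+3x)(4+5x)=8+22x+15x^2$.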

You can also go beyond evaluation and use any convenient ring homomorphisms.

My question is whether Strassen's algorithm for fast matrix multiplication can be explained in conceptual terms along these lines. The core of the algorithm is Strassen's trick for computing products of 2×2 matrices over noncommutative rings. Looking at Strassen's formula, it's hard to shake the feeling that some related approach should work.
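For reference, here is Strassen's trick written out as code (a direct transcription of the standard seven-product formulas; the variable names are mine). Note that the entries are never commuted, so it works over any noncommutative ring, which is what makes the recursion on block matrices possible:

```python
def strassen_2x2(A, B):
    """Multiply 2x2 matrices (given as nested tuples) with 7 multiplications
    instead of 8, using Strassen's formulas. No commutativity of the
    entries is used, so the entries may themselves be matrix blocks."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return ((m1 + m4 - m5 + m7, m3 + m5),
            (m2 + m4,           m1 - m2 + m3 + m6))
```

For instance, `strassen_2x2(((1, 2), (3, 4)), ((5, 6), (7, 8)))` returns `((19, 22), (43, 50))`, matching the ordinary product.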

Best Answer

To my poor knowledge, fast matrix multiplication is not based on ring homomorphisms, but on the notion of tensorial rank. More precisely, the trilinear form $(A,B,C)\mapsto{\rm Tr}(ABC)$ extends as a linear form $L$ over $$M_{n\times m}\otimes M_{m\times p}\otimes M_{p\times n}.$$

The tensorial rank of $L$ is the minimal number $r$ for which you can write $$L=\sum_{j=1}^re_j\otimes f_j\otimes g_j,$$ with $e_j,f_j,g_j$ linear forms. Then the calculation of $AB$ needs only $r$ multiplications and a certain number of additions. The latter is of little importance, because when you apply the formula recursively to (say square) matrices of size $n^N$, the number of multiplications grows like $r^N$. This is easily seen in the case $n=2$, where $r=7$ and Strassen's formulas involve as many as $18$ additions. For $2^N\times2^N$ matrices, the number of operations is bounded by $C7^N=C(2^N)^\alpha$ with $\alpha=\frac{\log7}{\log2}<3$.
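One can check this rank-7 decomposition numerically. Below is an illustrative transcription (the coefficient tables are read off from Strassen's formulas; the layout and names are mine) of the linear forms $e_j,f_j,g_j$ acting on flattened $2\times2$ matrices, verifying that $\sum_j e_j(A)f_j(B)g_j(C)={\rm Tr}(ABC)$:

```python
# Coefficients of the linear forms e_j, f_j, g_j on the flattened
# entries (x11, x12, x21, x22) of 2x2 matrices, transcribed from
# Strassen's seven products (an illustrative, hand-derived table).
E = [(1,0,0,1), (0,0,1,1), (1,0,0,0), (0,0,0,1), (1,1,0,0), (-1,0,1,0), (0,1,0,-1)]
F = [(1,0,0,1), (1,0,0,0), (0,1,0,-1), (-1,0,1,0), (0,0,0,1), (1,1,0,0), (0,0,1,1)]
G = [(1,0,0,1), (0,1,0,-1), (0,0,1,1), (1,1,0,0), (-1,0,1,0), (0,0,0,1), (1,0,0,0)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def trace_form(a, b, c):
    """Tr(ABC) computed via the rank-7 decomposition L = sum e_j⊗f_j⊗g_j."""
    return sum(dot(e, a) * dot(f, b) * dot(g, c) for e, f, g in zip(E, F, G))

def mul2(x, y):
    """Ordinary 2x2 matrix product on flattened tuples, for comparison."""
    return (x[0]*y[0] + x[1]*y[2], x[0]*y[1] + x[1]*y[3],
            x[2]*y[0] + x[3]*y[2], x[2]*y[1] + x[3]*y[3])

def trace_direct(a, b, c):
    p = mul2(mul2(a, b), c)
    return p[0] + p[3]
```

For example, with $A=\begin{pmatrix}1&2\\3&4\end{pmatrix}$, $B=\begin{pmatrix}5&6\\7&8\end{pmatrix}$, $C=I$, both `trace_form` and `trace_direct` give $69$.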

Notice that if you had only two spaces (instead of $3$), then the tensorial rank of an element of $E\otimes F$ would be just the ordinary rank from linear algebra.
