Computational Complexity – Complexity of Rectangular Matrix Multiplication

Tags: algorithms, computational complexity, matrices, numerical linear algebra

I am interested in the complexity of multiplying two matrices $A$ and $B$, i.e. of computing $AB$.

From [Le Gall and Urrutia], I know that:

  • if $A$ and $B$ are square-matrices of size $n$, then this can be done in $O(n^{\omega})$ where $\omega\approx 2.372$.
  • if $A$ has size $n\times n^{k}$ and $B$ has size $n^k \times n$ this can be done in $O(n^{\omega(k)})$ with $\omega(k)<\omega$ for $k<1$ (typically, $\omega(k)=2$ for $k\le0.3$).

I am wondering whether there is an efficient algorithm when $A$ has size $n^k\times n$ and $B$ has size $n\times n$ (or when $B$ has size $n\times n^{k}$), for $k\in(0,1)$.

Remarks:

  • when I say "efficient" algorithm, I mean an algorithm whose complexity is smaller than that of the naïve $O(n^{2+k})$ algorithm. I am not referring to an actual implementation, just to the existence of such an algorithm.
  • I suppose that adding or multiplying two entries of a matrix takes $O(1)$ time.

[Le Gall and Urrutia] François Le Gall and Florent Urrutia, Improved Rectangular Matrix Multiplication using Powers of the Coppersmith–Winograd Tensor.

Best Answer

Assuming that efficient means better than the naive $O(n^{2+k})$ multiplication, let us review some possibilities.

Padding. For $k > \omega-2$, just pad $A$ with $n-n^k$ zero or garbage rows, perform square matrix multiplication in $O(n^\omega)$ time, and discard the extra rows from the output. This is already efficient because $\omega < 2+k$. (For $k < \omega-2$ the naive method would be better.)
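The padding transformation is easy to sketch in code. In this illustrative snippet (assumed helper name `multiply_by_padding`), NumPy's `@` stands in for an $O(n^\omega)$ square multiplication routine; the point is the shape bookkeeping, not the actual running time.

```python
import numpy as np

def multiply_by_padding(A, B):
    """Multiply an m x n matrix A (m <= n) by an n x n matrix B by padding
    A with zero rows to square shape, multiplying two n x n matrices, and
    discarding the padded rows of the output.

    np.matmul (the @ operator) stands in for a fast square matrix
    multiplication method; NumPy itself uses classical multiplication."""
    m, n = A.shape
    assert B.shape == (n, n) and m <= n
    A_padded = np.vstack([A, np.zeros((n - m, n))])  # add n - m zero rows
    C_padded = A_padded @ B                          # one n x n square product
    return C_padded[:m, :]                           # discard the padded rows
```

The garbage-row variant mentioned above works identically; only the final slicing matters, since the extra rows of the output are thrown away.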

Naive splitting. Following Knight (1995), we note that $\langle m,n,p \rangle$ matrix multiplication, with $m$ the smallest of the three dimensions, can be done in $O(m^{\omega-2}np)$ time. This is done by splitting $A_{m \times n}$ into a $1 \times (n/m)$ array of blocks of size $m \times m$, and splitting $B_{n \times p}$ into an $(n/m) \times (p/m)$ array of blocks of the same size. Then we naively perform the block matrix multiplication of size $\langle 1, \, n/m, \, p/m \rangle$ by performing $np/m^2$ square multiplications of size $m$, each in time $O(m^\omega)$, for a total of $O(m^{\omega-2}np)$. Applying this to the current question with input size $\langle n^k, n, n \rangle$, we achieve $O(n^{k(\omega-2)}n^2) = O(n^{2+k(\omega-2)})$. This is efficient, and better than padding, for all $0<k<1$.
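The block bookkeeping in the splitting argument can be sketched as follows (assumed helper name `multiply_by_splitting`, and for simplicity assuming $m$ divides both $n$ and $p$). Each `@` on an $m \times m$ block is where a fast $O(m^\omega)$ method would be plugged in; the outer double loop is the naive $\langle 1, n/m, p/m \rangle$ block multiplication.

```python
import numpy as np

def multiply_by_splitting(A, B):
    """Multiply A (m x n) by B (n x p), where m divides n and p, by tiling
    both matrices into m x m blocks and combining the blocks naively.

    The np/m^2 inner block products (np.matmul here) are where a fast
    O(m^omega) square method would be used, for O(m^(omega-2) n p) total."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2 and n % m == 0 and p % m == 0
    C = np.zeros((m, p))
    for j in range(p // m):        # block columns of the output
        for l in range(n // m):    # sum over the n/m inner block products
            C[:, j*m:(j+1)*m] += A[:, l*m:(l+1)*m] @ B[l*m:(l+1)*m, j*m:(j+1)*m]
    return C
```

When $m$ does not divide $n$ or $p$, one would pad the trailing blocks, which does not change the asymptotics.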

These are just some straightforward transformations that show that "efficient" is possible for the task given. I believe newer developments (Le Gall and others) may provide more efficient methods.

Tensor powering. One thing to note is that the splitting method above used fast MM inside the square blocks, but naive MM to combine them. In fact one can use fast methods on both levels, and the blocks need not be square. Coppersmith and Winograd (1982) note that if we have two rectangular matrix multiplication methods,

  • $\langle m,n,p \rangle$ in $K$ multiplications and
  • $\langle m',n',p' \rangle$ in $K'$ multiplications,

then we can do

  • $\langle mm', nn', pp' \rangle$ in $KK'$ multiplications.

This is a powerful tool for combining MM methods of different dimensions. I am not sure, but possibly some of Le Gall & Urrutia's "symmetric" $\langle n, n^k, n \rangle$ rectangular methods that the OP cites could be "tensored up" in this way to solve the "asymmetric" $\langle n^k, n, n\rangle$ case for the current question.
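As a concrete sanity check of the composition rule, consider tensoring Strassen's method with itself: Strassen performs $\langle 2,2,2 \rangle$ in $K = 7$ multiplications, so the rule yields

$$\langle 4,4,4 \rangle \text{ in } K K' = 7 \cdot 7 = 49 \text{ multiplications},$$

versus $64$ for the naive method. Nothing forces the two factors to have the same shape, which is exactly the freedom one would need to build up asymmetric shapes like $\langle n^k, n, n \rangle$.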

Almost quadratic. Coppersmith (1982) reports that if $k = 0.17227$, then we can perform the $\langle n^k, n, n\rangle$ multiplication in $O(n^2 \, (\log n)^2)$ time, that is, almost in quadratic time! Note that the naive splitting would give an exponent of $2 + k(\omega-2) \le 2.0643$, using $\omega \le 2.373$.

Bibliography

Knight, Philip A., Fast rectangular matrix multiplication and QR decomposition, Linear Algebra Appl. 221, 69-81 (1995). ZBL0827.65044.

Coppersmith, D., Rapid multiplication of rectangular matrices, SIAM J. Comput. 11, 467-471 (1982). ZBL0486.68031.

Coppersmith, D.; Winograd, S., On the asymptotic complexity of matrix multiplication, SIAM J. Comput. 11, 472-492 (1982). ZBL0486.68030.
