One optimization problem in matrix multiplication and related recurrence

algorithmscomputer sciencedynamic programmingmatricesoptimization

We want to calculate $A_1 \times A_2 \times \cdots \times A_n$, where $A_i$ has dimensions $d_{i-1} \times d_i$.

In the classical matrix chain multiplication problem, we wish to minimize the total number of scalar multiplications. In this problem, we consider the all intermediate matrices arising in the computation (including the final result but excluding the original matrices), and the cost of a specific order is the maximal number of entries of such an intermediate matrix. As usual, we want to minimize the cost.

As an example, if $n = 2$ the answer is simply $d_0d_2$, and if $n = 3$, then there are two orders:

  • $(A_1A_2)A_3$, in which the intermediate matrices are $A_1A_2,A_1A_2A_3$. The cost is therefore $\max(d_0d_2,d_0d_3)$.
  • $A_1(A_2A_3)$, in which the intermediate matrices are $A_2A_3,A_1A_2A_3$. The cost is therefore $\max(d_1d_3,d_0d_3)$.

Denote by $C_{ij}$ the optimal cost of multiplying $A_i,\ldots,A_j$. We can write the following recurrence for $C_{ij}$:

$$C_{ij}=\min_{i \leq k <j} \max \{C_{i,k},C_{k+1,j},d_{i-1}d_j\},$$

with base case $C_{ii} = 0$.

How do we get this recurrence?

Best Answer

The recurrence arises by considering the product $A_i \dots A_k$ that includes $A_i$. If you compute $A_i \dots A_j$ as $(A_i \dots A_k)(A_{k+1} \dots A_j)$, then the first product has cost $C_{i,k}$, the second product has cost $C_{k+1,j}$, and the cost of multiplying the resulting $d_{i-1} \times d_k$ and $d_k \times d_j$ matrices is $d_{i-1} d_j$. The maximum number of nonzero entries from this choice of $k$ is $\max(C_{i,k},C_{k+1,j},d_{i-1} d_j)$.

Because you are minimizing, choose the $k\in\{i,\dots,j-1\}$ that minimizes this cost: $$C_{i,j}=\min_{i \le k < j}\max(C_{i,k},C_{k+1,j},d_{i-1} d_j)$$

Note that the third argument of the $\max$ does not depend on $k$, so you could rewrite the recurrence as $$C_{i,j}=\max\left(\min_{i \le k < j}\max(C_{i,k},C_{k+1,j}),d_{i-1} d_j\right)$$

The $j=i$ case requires no multiplication, so $C_{i,i}=0$. You want to compute $C_{1,n}$.

For example, when $n=2$, you have $$C_{1,n} = C_{1,2} = \min_{1 \le k < 2} \max(C_{1,k},C_{k+1,2},d_0 d_2) = \max(C_{1,1},C_{2,2},d_0 d_2)=d_0 d_2.$$

For $n=3$, \begin{align} C_{1,3} &= \min_{1 \le k < 3} \max(C_{1,k},C_{k+1,3},d_0 d_3) \\ &= \min(\max(C_{1,1},C_{2,3},d_0 d_3),\max(C_{1,2},C_{3,3},d_0 d_3)) \\ &= \min(\max(0,C_{2,3},d_0 d_3),\max(d_0 d_2,0,d_0 d_3)) \\ &= \min(\max(C_{2,3},d_0 d_3),\max(d_0 d_2,d_0 d_3)) \\ \end{align} You would need to compute $C_{2,3}$ to finish the computation.