Math History – Why Matrices are Multiplied the Way They Are

linear algebra, math-history, matrices

Multiplication of matrices (taking the dot product of the $i$th row of the first matrix and the $j$th column of the second to yield the $ij$th entry of the product) is not a very intuitive operation: if you were to ask someone how to multiply two matrices, they probably would not think of that method. Of course, it turns out to be very useful: matrix multiplication is precisely the operation that represents composition of linear transformations. But it's not intuitive. So my question is where it came from. Who thought of multiplying matrices in that way, and why? (Was it perhaps multiplication of a matrix and a vector first? If so, who thought of multiplying them in that way, and why?) My question stands regardless of whether matrix multiplication was defined this way only after matrices were used to represent composition of transformations, or whether, on the contrary, matrix multiplication came first. (Again, I'm not asking about the utility of multiplying matrices as we do: this is clear to me. I'm asking a question about history.)

Best Answer

Matrix multiplication is a symbolic way of substituting one linear change of variables into another one. If $x' = ax + by$ and $y' = cx+dy$, and $x'' = a'x' + b'y'$ and $y'' = c'x' + d'y'$, then we can plug the first pair of formulas into the second to express $x''$ and $y''$ in terms of $x$ and $y$:

$$ x'' = a'x' + b'y' = a'(ax + by) + b'(cx+dy) = (a'a + b'c)x + (a'b + b'd)y $$

and

$$ y'' = c'x' + d'y' = c'(ax+by) + d'(cx+dy) = (c'a+d'c)x + (c'b+d'd)y. $$

It can be tedious to keep writing the variables, so we use arrays to track the coefficients, with the coefficients from the formulas for $x'$ and $x''$ in the first row and those for $y'$ and $y''$ in the second row. The above two linear substitutions coincide with the matrix product

$$ \left( \begin{array}{cc} a'&b'\\c'&d' \end{array} \right) \left( \begin{array}{cc} a&b\\c&d \end{array} \right) = \left( \begin{array}{cc} a'a+b'c&a'b+b'd\\c'a+d'c&c'b+d'd \end{array} \right). $$

So matrix multiplication is just a bookkeeping device for systems of linear substitutions plugged into one another (order matters). The formulas may not look intuitive, but they express nothing other than the simple idea of combining two linear changes of variables in succession.
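As a quick sanity check of the bookkeeping claim, here is a minimal Python sketch. The helper names (`apply`, `matmul`, `compose_coefficients`) and the sample coefficients are made up for illustration and are not taken from any library. It reads off the coefficients of the composite substitution by actually plugging values in, then compares them with the row-by-column product.

```python
def apply(m, v):
    """Apply the substitution whose coefficient rows are m to the pair v = (x, y)."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)


def matmul(p, q):
    """Row-by-column product of two 2x2 arrays given as tuples of rows."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def compose_coefficients(second, first):
    """Coefficients of 'first, then second', found by direct substitution."""
    # Plugging in (x, y) = (1, 0) isolates the coefficients of x;
    # plugging in (0, 1) isolates the coefficients of y.
    x_col = apply(second, apply(first, (1, 0)))
    y_col = apply(second, apply(first, (0, 1)))
    return ((x_col[0], y_col[0]), (x_col[1], y_col[1]))


# Hypothetical sample substitutions:
first = ((1, 2), (3, 4))     # x'  = x + 2y,  y'  = 3x + 4y
second = ((0, 1), (5, -1))   # x'' = y',      y'' = 5x' - y'

# Substituting first into second gives the same array as the matrix product.
assert compose_coefficients(second, first) == matmul(second, first)

# Order matters: swapping the factors gives a different array.
assert matmul(second, first) != matmul(first, second)

print(matmul(second, first))  # ((3, 4), (2, 6))
```

Note that the matrix of the substitution applied second is written on the left, matching the primed matrix appearing first in the product above.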

Matrix multiplication was first defined explicitly in print by Cayley in 1858, in order to reflect the effect of composition of linear transformations. See paragraph 3 at http://darkwing.uoregon.edu/~vitulli/441.sp04/LinAlgHistory.html. However, the idea of tracking what happens to coefficients when one linear change of variables is substituted into another (which we view as matrix multiplication) goes back further. For instance, the work of number theorists in the early 19th century on binary quadratic forms $ax^2 + bxy + cy^2$ was full of linear changes of variables plugged into each other (especially linear changes of variables that we would recognize as coming from ${\rm SL}_2({\mathbf Z})$). For more on the background, see the paper by Thomas Hawkins on matrix theory in the 1974 ICM. Google "ICM 1974 Thomas Hawkins" and you'll find his paper among the top 3 hits.