[Math] What does matrix multiplication have to do with scalar multiplication?

faq, linear algebra, math-history, matrices, notation

Why are matrix and scalar multiplication denoted the same way and treated as the same operation in standard mathematical notation? This is always a source of confusion for me because they have completely different properties (specifically commutativity). Multiplying a 1×1 matrix by an N×N matrix isn't even generally equivalent to multiplying an N×N matrix by a scalar. (The former is not even always defined.) Wouldn't it be clearer to consider these to be completely unrelated operations and use completely different notation to represent them?

Best Answer

The product of matrices is defined so that it corresponds to the composition of the corresponding linear maps. One can derive the usual formula for matrix multiplication from this fact alone. This should be covered in every good linear algebra textbook, e.g. Axler's Linear Algebra Done Right. See also Arturo Magidin's answer here.

So your question reduces to why composition of maps is denoted the same way as multiplication. One answer is that rings arise naturally as subrings of the ring of linear maps (endomorphisms) on their underlying additive groups (the left regular representation). This is a ring-theoretic analog of the Cayley representation of a group as a subgroup of permutations, obtained by letting the group act on itself by left multiplication. This allows us to view "functions" as "numbers" and to exploit operator-theoretic techniques such as factoring characteristic polynomials and factoring differential and difference operators (recurrences). The point of the common notation is to emphasize this common ring structure so that one may exploit it by reusing similar techniques where they apply.
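As a brief sketch of how the product formula falls out of composition (the symbols $A$, $B$, $T_A$, $T_B$, and $e_j$ are notation chosen here for illustration, not taken from the answer): let $A = (a_{ik})$ be $m \times n$ and $B = (b_{kj})$ be $n \times p$, with associated linear maps $T_A(x) = Ax$ and $T_B(y) = By$, and let $e_j$ denote the $j$-th standard basis vector. Column $j$ of the matrix of $T_A \circ T_B$ is the image of $e_j$:

$$
(T_A \circ T_B)(e_j) \;=\; A\Big(\sum_{k=1}^{n} b_{kj}\, e_k\Big) \;=\; \sum_{k=1}^{n} b_{kj}\, A e_k \;=\; \sum_{k=1}^{n} b_{kj} \sum_{i=1}^{m} a_{ik}\, e_i \;=\; \sum_{i=1}^{m} \Big(\sum_{k=1}^{n} a_{ik}\, b_{kj}\Big) e_i .
$$

So the $(i,j)$ entry of the matrix representing the composition is $\sum_{k=1}^{n} a_{ik} b_{kj}$, which is the usual entry of $AB$. In the same spirit, multiplying by a scalar $c$ is composition with the map $x \mapsto cx$, whose matrix is $cI$; this is one way to see scalar multiplication sitting inside the same ring structure.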

Examples of such techniques abound. For some examples of operator algebra see here, here, here. See also here, here, where the Fibonacci recurrence is recast into linear-system form, yielding an addition formula and a fast computation algorithm by repeated squaring of the shift matrix.
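To make the last remark concrete, here is a minimal Python sketch of the repeated-squaring idea. It assumes the shift matrix is taken to be $M = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$ with the convention $F_0 = 0$, $F_1 = 1$, so that $M^n = \begin{pmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{pmatrix}$ for $n \ge 1$. The function names `mat_mul`, `mat_pow`, and `fib` are illustrative choices, not something from the linked answers.

```python
def mat_mul(a, b):
    """Multiply two 2x2 integer matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def mat_pow(m, n):
    """Compute m**n for a 2x2 matrix by repeated squaring (n >= 0)."""
    result = [[1, 0], [0, 1]]  # identity matrix, i.e. m**0
    while n > 0:
        if n & 1:              # this binary digit of n is set
            result = mat_mul(result, m)
        m = mat_mul(m, m)      # square for the next binary digit
        n >>= 1
    return result


def fib(n):
    """Return F(n), read off the top-right entry of M**n (assumed convention F(0) = 0)."""
    shift = [[1, 1], [1, 0]]
    return mat_pow(shift, n)[0][1]


if __name__ == "__main__":
    print([fib(k) for k in range(11)])
```

Because matrix powers obey $M^{m+n} = M^m M^n$ exactly as powers of numbers do, comparing top-right entries on both sides gives the addition formula $F_{m+n} = F_{m+1} F_n + F_m F_{n-1}$, and the loop above needs only on the order of $\log_2 n$ matrix multiplications.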