[Math] Associativity and Distributive Property of Matrix Multiplication

associativity, linear algebra, matrices

Matrix multiplication is associative: $(AB)C=A(BC)$.

What about the case when $AB$ results in a scalar?
Consider the case where $A$ is $1 \times n$ and $B$ and $C$ are both $n \times 1$. The product $AB$ is then a scalar $k$, which can be multiplied by $C$, i.e., $(AB)C=kC$.

However, $B$ and $C$ cannot be multiplied, since both are $n \times 1$. Hence $A(BC)$ is not possible… I must be missing something here!

I came across this when checking the gradient of the least squares function… which resulted in $(y - W'\phi(x))\cdot\phi(x)$, where $y$ is a scalar and $\phi(x)$ and $W$ are both $n$-dimensional column vectors.
By the distributive property, expanding this produces the product $W'\cdot\phi(x)\cdot\phi(x)$.

$W' = \operatorname{transpose}(W)$

Best Answer

You are right that, formally speaking, a $1\times1$ matrix cannot be multiplied with a matrix of arbitrary size, because the matrix product is only defined when the inner dimensions agree. However, we have the following identifications, which respect the linear structure:

$$ \{ \text{Scalars} \} \leftrightarrow \{ 1 \times 1 \text{ matrices} \} \leftrightarrow \{ \text{Scalar matrices of any size} \}.$$

Thus, under this identification, it makes sense to multiply any matrix $C$ by a scalar $\lambda$: if $C$ has size $m \times n$, you can view $\lambda$ as the $m \times m$ scalar matrix $$\begin{pmatrix} \lambda & & &\\ & \lambda & &\\ & & \ddots & \\ & & & \lambda\end{pmatrix} = \lambda I_m$$ and the product $\lambda C$ corresponds to $(\lambda I_m) C$, which is a well-defined matrix product.
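To make this identification concrete, here is a minimal sketch in plain Python. The helper names `matmul`, `scalar_matrix` and `scale` are my own, chosen for illustration, and matrices are assumed to be stored as lists of rows. It checks on one small example that scaling every entry of $C$ by $\lambda$ gives the same result as the matrix product $(\lambda I_m) C$.

```python
def matmul(A, B):
    """Multiply matrices stored as lists of rows; reject mismatched shapes."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def scalar_matrix(lam, m):
    """Return the m x m scalar matrix lam * I_m."""
    return [[lam if i == j else 0 for j in range(m)] for i in range(m)]


def scale(lam, C):
    """Scalar-matrix multiplication: multiply every entry of C by lam."""
    return [[lam * entry for entry in row] for row in C]


lam = 3
C = [[1, 2], [4, 5], [7, 8]]  # a 3 x 2 matrix, so m = 3

via_matrix_product = matmul(scalar_matrix(lam, 3), C)  # (lam * I_3) C
via_scaling = scale(lam, C)                            # lam C

print(via_matrix_product == via_scaling)  # True
```

This is only a check on one example, not a proof, but the general argument is the same computation: row $i$ of $\lambda I_m$ has its single nonzero entry $\lambda$ in column $i$, so it picks out $\lambda$ times row $i$ of $C$.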


In your particular problem, you are considering $(y-W'\phi)\cdot \phi$, where $\phi$ is $n \times 1$. According to what I said above, if you want to interpret the scalar multiplication "$\cdot$" as matrix multiplication, you need to view $y$ and $W'\phi$ as $n \times n$ scalar matrices. But really in either case all that you are doing is multiplying all the entries of $\phi$ by the scalar $(y - W'\phi)$. I think you are confused because you think the operation between $W'$ and $\phi$ is the same as the operation $\cdot$ between $W'\phi$ and $\phi$. But the first one is matrix multiplication, whereas the second one is scalar-matrix multiplication.
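The distinction between the two operations can be seen in a small sketch in plain Python. The numbers for $W$, $\phi$ and $y$ are made up for illustration, and `matmul` and `is_defined` are hypothetical helpers of my own that work on matrices stored as lists of rows. The sketch computes $W'\phi$ as a genuine matrix product, applies the resulting scalar to $\phi$ entry by entry, and confirms that neither $\phi\,\phi$ nor a strict $1\times1$ times $n\times1$ product is defined as a matrix product.

```python
def matmul(A, B):
    """Multiply matrices stored as lists of rows; reject mismatched shapes."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions do not match")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def is_defined(A, B):
    """Report whether the matrix product A B is defined."""
    try:
        matmul(A, B)
        return True
    except ValueError:
        return False


W = [[1], [2], [3]]      # n x 1 column vector (illustrative values)
phi = [[4], [0], [-1]]   # n x 1 column vector (illustrative values)
y = 5                    # scalar (illustrative value)
n = len(phi)

W_t = [[row[0] for row in W]]     # W' is 1 x n
inner = matmul(W_t, phi)          # matrix product, a 1 x 1 matrix: [[1]]
residual = y - inner[0][0]        # now treated as a plain scalar: 4

# Scalar-matrix multiplication: scale every entry of phi.
gradient_term = [[residual * row[0]] for row in phi]

# The same result as a matrix product with the n x n scalar matrix.
residual_as_matrix = [[residual if i == j else 0 for j in range(n)]
                      for i in range(n)]
via_scalar_matrix = matmul(residual_as_matrix, phi)

print(gradient_term)                       # [[16], [0], [-4]]
print(via_scalar_matrix == gradient_term)  # True
print(is_defined(inner, phi))              # False: (1 x 1)(n x 1)
print(is_defined(phi, phi))                # False: (n x 1)(n x 1)
```

So the expression $W'\phi\,\phi$ only makes sense when read as $(W'\phi)\,\phi$, a scalar times a vector. Associativity of matrix multiplication never applies to it, because the second operation is not a matrix product in the first place.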
