[Math] An intuitive way to understand the dot product in the context of matrix multiplication

Tags: intuition, linear-algebra, linear-transformations

I was trying to understand where it comes from that each row in a matrix-vector multiplication contributes a dot product, as in:

$$
Ax = \left(
\begin{array}{ccc}
a_{1}^T \\
\vdots \\
a_m^T \end{array}
\right)x = \left(
\begin{array}{ccc}
a_{1}^Tx \\
\vdots \\
a_m^T x \end{array}
\right)
$$

What is an intuitive explanation or interpretation of the fact that each entry is a dot product of a row of $A$ with the vector $x$?
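As a quick numerical sanity check of this identity (my own illustrative sketch, using NumPy; the matrix and vector values are arbitrary), each entry of $Ax$ does indeed match the dot product of the corresponding row of $A$ with $x$:

```python
import numpy as np

# Arbitrary illustrative matrix and vector (any compatible shapes work)
A = np.array([[1.0, 2.0, 0.0],
              [3.0, -1.0, 4.0]])   # 2x3: rows are a_1^T and a_2^T
x = np.array([2.0, 1.0, -1.0])

Ax = A @ x                                                   # matrix-vector product
row_dots = np.array([A[i] @ x for i in range(A.shape[0])])   # one dot product per row

assert np.allclose(Ax, row_dots)
print(Ax, row_dots)   # [4. 1.] [4. 1.]
```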

What I do understand is that $Ax$ encodes a linear transformation $T$. Consider a super simple example in 2 dimensions to explain what I do understand. I understand that $Ax = A\,[x_1\ x_2]^T = T(x) = T(x_1 \hat i + x_2 \hat j) = x_1 T(\hat i) + x_2 T( \hat j)$. This lets me interpret, intuitively, that multiplication by a matrix gives me a new vector composed of the same linear combination of the transformed basis vectors (or of whatever vectors $x$ is composed of) [source]. Furthermore, one can easily see from this view where the multiplication of a matrix by a vector comes from:

$$Ax = \left[
\begin{array}{ccc}
a_{11} & a_{12} \\
a_{21} & a_{22} \\
\end{array}
\right]
x = \left[
\begin{array}{ccc}
T(\hat i)_1 & T(\hat j)_1\\
T(\hat i)_2 & T(\hat j)_2\\
\end{array}
\right]
\left[
\begin{array}{ccc}
x_{1} \\
x_{2} \\
\end{array}
\right]
=
x_1\left[
\begin{array}{ccc}
T(\hat i)_1 \\
T(\hat i)_2 \\
\end{array}
\right]
+
x_2
\left[
\begin{array}{ccc}
T(\hat j)_1\\
T(\hat j)_2\\
\end{array}
\right]
=
\left[
\begin{array}{ccc}
T(\hat i)_1 x_1 + T(\hat j)_1 x_2\\
T(\hat i)_2 x_1 + T(\hat j)_2 x_2\\
\end{array}
\right]
$$

where it is now obvious why matrix multiplication is defined the way it is (because of linear transformations). Notice that the nice thing about this view is that one can interpret each column of the matrix as telling us how each basis vector changes, i.e. each column specifies how $\hat i$ and $\hat j$ are transformed. Furthermore, the amount of $\hat i$ that used to be in the old vector is retained, but it now points in the new direction $T(\hat i)$ for the first coordinate (and similarly for $\hat j$). This for me is really intuitive and explains a lot of where matrix multiplication comes from.
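To make the two pictures concrete, here is a minimal sketch (an arbitrary $2\times 2$ matrix of my own choosing, using NumPy) showing that the column view, $x_1 T(\hat i) + x_2 T(\hat j)$, and the row view, one dot product per entry, give the same result:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])   # columns: T(i-hat) and T(j-hat)
x = np.array([4.0, -1.0])

# Column picture: linear combination of the transformed basis vectors
column_view = x[0] * A[:, 0] + x[1] * A[:, 1]

# Row picture: one dot product per output coordinate
row_view = np.array([A[0] @ x, A[1] @ x])

assert np.allclose(column_view, row_view)
assert np.allclose(column_view, A @ x)
print(column_view)   # [ 7. -3.]
```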

However, notice that this view also reveals that each entry $(Ax)_i = a_i^T x$ is a dot product of a row of $A$ with the original coordinate vector $x$. This seems to me not to be a coincidence, and something deeper has to be going on. Usually dot products are related to projections, so I was trying to understand whether each coordinate $(Ax)_i$ might actually encode how much the original $x$ is being projected onto each row vector of $A$ (or possibly something to do with the row space of $A$, i.e. $C(A^T)$). In an attempt to understand this I considered what each row means:

$$ \left[ a_{i,1} \dots a_{i,n} \right] \left[ \begin{array}{ccc}
x_1\\
\vdots\\
x_n \\
\end{array} \right] = \sum^n_{j=1} a_{ij} x_j$$

In the old interpretation I had of what a column of a matrix is (this time the matrix is $1 \times n$), it seems that each entry $a_{i,j}$ specifies how much the basis vector $e_j$ is transformed (here into a scalar). However, I've had difficulty understanding, beyond that, what the significance of the dot product of $x$ with the rows of $A$ is. Does someone know how to interpret this, or how to understand it at a conceptual level, similar to the interpretation I gave of what the columns of a matrix mean? Are we doing some transformation to the row space of $A$, or something like that?
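To probe the projection idea numerically, here is a small sketch of my own (arbitrary values, using NumPy): since $a_i^T x = \lVert a_i\rVert \cdot \frac{a_i^T x}{\lVert a_i\rVert}$, each entry $(Ax)_i$ is the scalar projection of $x$ onto the $i$-th row of $A$, scaled by that row's length:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 0.0]])
x = np.array([2.0, 1.0])

Ax = A @ x
for i, a_i in enumerate(A):
    # scalar projection of x onto the i-th row, times that row's length
    scalar_proj = (a_i @ x) / np.linalg.norm(a_i)
    assert np.isclose(Ax[i], np.linalg.norm(a_i) * scalar_proj)
    print(f"(Ax)_{i} = {Ax[i]:.4f} = |a_{i}| * (scalar projection of x onto a_{i})")
```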

Best Answer

We have $x = x_i e_i = x'_i e'_i$ where $e_i$ and $e'_i$ are bases related by a nonsingular linear transformation. Note that ${e'}_i^T e'_j = g_{ij}$, where $g$ is invertible. Thus, ${e'}_i^T e_j x_j = {e'}_i^T e'_j x'_j = g_{ij}x'_j$ or $$x'_i = (g^{-1})_{ij}{e'}_j^T e_k x_k = (g^{-1})_{ij}{e'}_j^T x.$$ This gives us two good pieces of intuition. First, for a nonsingular linear transformation $A$ we can think of the elements of $A$ as being given by $$a_{ij} = (g^{-1})_{ik}{e'}_k^T e_j,$$ that is, by the dot product of a certain linear combination of the transformed basis vectors with the untransformed basis vectors. Second, to find the result of applying $A$ to $x$ we simply dot the same linear combination of the transformed basis vectors with the vector $x$.
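A small numerical check of this formula (the particular skewed basis below is my own choice, purely for illustration; NumPy is assumed): form the Gram matrix $g_{ij} = {e'}_i^T e'_j$ for a non-orthogonal basis and confirm that $x'_i = (g^{-1})_{ij}{e'}_j^T x$ recovers the coordinates of $x$ in that basis.

```python
import numpy as np

# Non-orthogonal but nonsingular "new" basis vectors, stored as columns of E
E = np.array([[1.0, 1.0],
              [0.0, 2.0]])              # e'_1 = (1, 0), e'_2 = (1, 2)
x = np.array([3.0, 4.0])

g = E.T @ E                              # Gram matrix g_ij = e'_i . e'_j
x_prime = np.linalg.inv(g) @ (E.T @ x)   # x'_i = (g^-1)_ij (e'_j . x)

# x'_i really are the coordinates of x in the new basis: x = sum_i x'_i e'_i
assert np.allclose(E @ x_prime, x)
print(x_prime)   # [1. 2.]
```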

For orthogonal transformations we find $g_{ij} = \delta_{ij}$ and so $$x'_i = {e'}_i^T e_j x_j = {e'}_i^T x \hspace{5ex}\textrm{and}\hspace{5ex} a_{ij} = {e'}_i^T e_j.$$

Note: We use Einstein's summation convention, $x = x^i e_i \equiv \sum_i x^i e_i$. For this problem the dual basis is $e^i = e_i^T$. The dual of $x$ is $x^T$, so $x_i e^i = x^i e_i^T$. We need not distinguish between $x_i$ and $x^i$ and so we write $x = x_i e_i$.

Example

Let $$A = \left(\begin{array}{cc}\cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{array}\right).$$ Then $$\left(\begin{array}{cc}\cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{array}\right) \left(\begin{array}{c}x \\ y\end{array}\right)$$ will give the components of $x$ in the new basis $e'_i$, where $[e_i]_j = \delta_{ij}$ is the standard basis. (This is a passive, rather than active, transformation.) It is straightforward to show that $e'_i = A^{-1}e_i = A^T e_i,$ so $$e'_1 = \left(\begin{array}{c}\cos\theta \\ \sin\theta\end{array}\right) \hspace{5ex}\textrm{and}\hspace{5ex} e'_2 = \left(\begin{array}{c}-\sin\theta \\ \cos\theta\end{array}\right).$$ One can then easily check that the elements of $A$ are given by $a_{ij} = {e'}_i^T e_j$. Note that, $$x'_1 = {e'}_1^T e_j x_j = \left(\begin{array}{cc}\cos\theta & \sin\theta\end{array}\right) \left(\begin{array}{c}x \\ y\end{array}\right)$$ and $$x'_2 = {e'}_2^T e_j x_j = \left(\begin{array}{cc}-\sin\theta & \cos\theta\end{array}\right) \left(\begin{array}{c}x \\ y\end{array}\right),$$ as expected.
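For completeness, a quick numerical verification of this example (the angle $\theta$ and the test vector are arbitrary choices; NumPy is assumed):

```python
import numpy as np

theta = 0.3                        # arbitrary angle
c, s = np.cos(theta), np.sin(theta)
A = np.array([[ c, s],
              [-s, c]])

# Rotated basis vectors e'_i = A^T e_i, i.e. the rows of A
e1p, e2p = np.array([c, s]), np.array([-s, c])

# a_ij = e'_i . e_j reproduces the matrix A
e = np.eye(2)
A_rebuilt = np.array([[e1p @ e[:, 0], e1p @ e[:, 1]],
                      [e2p @ e[:, 0], e2p @ e[:, 1]]])
assert np.allclose(A, A_rebuilt)

# Each component of Ax is the dot product of x with a rotated basis vector
x = np.array([2.0, 1.0])
assert np.allclose(A @ x, [e1p @ x, e2p @ x])
print(A @ x)
```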