Matrices – Confusion Regarding Vector/Matrix Multiplication in Index Notation

intuition, matrices, solution-verification, tensors, vectors

I came across the following question and answer (sorry, I don't have an electronic source for it, only a paper copy). Reading the answer left me questioning the notation one uses to denote row/column vectors in tensor expressions:

Consider the following contravariant vectors: $a^\mu=(1,1,0,0)$, $b^\mu=(0,1,0,0)$ and $c^\mu=(0,0,0,1)$.

Derive the following quantities: $\phi=a_\mu b^\mu$, $\psi=a_\mu a^\mu$, $V_\mu=a_\nu b_\mu a^\nu$, $W^{\nu\mu}=c^\nu a^\mu$, $P_{\mu\nu}=a_\mu b_\nu$ and $Q_\mu^{\,\,\,\nu}=b_\mu c^\nu$.


Here are the solutions:

First we can construct the covariant versions of these vectors, $a_\mu=(-1,1,0,0)$, $b_\mu=(0,1,0,0)$ and $c_\mu=(0,0,0,1)$, then we have
$$\phi=a_\mu b^\mu=a_0b^0+a_1b^1+a_2b^2 +a_3b^3=1\tag{a}$$
$$\psi=a_\mu a^\mu =a_0a^0+a_1a^1+a_2a^2+a_3a^3=-1+1=0\tag{b}$$
$$V_\mu=a_\nu b_\mu a^\nu=\left(a_\nu a^\nu\right)b_\mu=0\qquad \text{(using the result (b))}\tag{c}$$

$$W^{\nu\mu}=c^\nu a^\mu = \begin{pmatrix}0&0&0&0\\0&0&0&0\\0&0&0&0\\1&1&0&0\end{pmatrix}\tag{d}$$
$$P_{\mu\nu}=a_\mu b_\nu = \begin{pmatrix}0&-1&0&0\\0&1&0&0\\0&0&0&0\\0&0&0&0\end{pmatrix}\tag{e}$$
$$Q_\mu^{\,\,\,\nu}=b_\mu c^\nu=\begin{pmatrix}0&0&0&0\\0&0&0&1\\0&0&0&0\\0&0&0&0\end{pmatrix}\tag{f}$$

In the above extract the metric signature used is $\eta_{\mu\nu}=\text{diag}(-1,1,1,1)$ and for consistency I will follow this convention in the rest of this post.
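For readers who want a numerical check, here is a small NumPy sketch (my own addition, not from the source). `np.einsum` performs the index bookkeeping directly, without committing any vector to a row or a column, and it reproduces all six results:

```python
import numpy as np

# Metric with signature (-,+,+,+) and the three contravariant vectors
eta = np.diag([-1.0, 1.0, 1.0, 1.0])
a = np.array([1.0, 1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0, 0.0])
c = np.array([0.0, 0.0, 0.0, 1.0])

# Lower the index: a_mu = eta_{mu nu} a^nu, and likewise for b and c
a_lo = np.einsum('mn,n->m', eta, a)  # [-1, 1, 0, 0]
b_lo = np.einsum('mn,n->m', eta, b)
c_lo = np.einsum('mn,n->m', eta, c)

phi = np.einsum('m,m->', a_lo, b)           # (a):  1
psi = np.einsum('m,m->', a_lo, a)           # (b):  0
V   = np.einsum('n,m,n->m', a_lo, b_lo, a)  # (c):  [0, 0, 0, 0]
W   = np.einsum('n,m->nm', c, a)            # (d):  only row 3 is nonzero
P   = np.einsum('m,n->mn', a_lo, b_lo)      # (e)
Q   = np.einsum('m,n->mn', b_lo, c)         # (f)
```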


That's the end of the solutions. My problem is that I just don't understand why both the $a^\mu,\,b^\mu,\,c^\mu$ and the $a_\mu,\,b_\mu,\,c_\mu$ are written as row vectors. I think either the contravariant or the covariant vectors must be written as column vectors for matrix multiplication to even make sense. For example, to compute $a_\mu$ I could write
$$a_\mu=\eta_{\nu\mu}a^\nu=\eta_{\mu\nu}a^\nu=\begin{pmatrix}-1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}\begin{pmatrix}1\\1\\0\\0\end{pmatrix}=\begin{pmatrix}-1\\1\\0\\0\end{pmatrix}=\begin{pmatrix}-1&1&0&0\end{pmatrix}^T\tag{1}$$
which is in the right form for matrix multiplication to apply, a $(4\times 4)\times(4\times 1)$ matrix product. In the second equality I used the fact that the Minkowski metric is symmetric, $\eta_{\mu\nu}=\eta_{\nu\mu}$.

Now this is the part that really confuses me. I am given to understand that the factors in tensor expressions commute, since all the information is captured in the repeated (contracted) indices, just as in expression $(\mathrm{c})$, where the $b_\mu$ was commuted past the $a^\nu$. So if this is really true, I may also write
$$a_\mu=a^\nu\eta_{\nu\mu}=\begin{pmatrix}1&1&0&0\end{pmatrix}\begin{pmatrix}-1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}=\begin{pmatrix}-1&1&0&0\end{pmatrix}\tag{2}$$
which is also in the right form for matrix multiplication to apply, a $(1\times 4)\times(4\times 4)$ matrix product. This also turns out to be the same expression the author got for $a_\mu$.

But $(1)\ne(2)$, so why don't I get the same result?
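To make the mismatch concrete, here is the same pair of products in NumPy (a sketch of my own): both contain the same four numbers, but $(1)$ produces a $4\times 1$ column while $(2)$ produces a $1\times 4$ row.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
a_col = np.array([[1.0], [1.0], [0.0], [0.0]])  # a^mu written as a 4x1 column
a_row = a_col.T                                 # a^mu written as a 1x4 row

lowered_1 = eta @ a_col   # eqn (1): a (4x4)x(4x1) product, shape (4, 1)
lowered_2 = a_row @ eta   # eqn (2): a (1x4)x(4x4) product, shape (1, 4)

print(lowered_1.ravel(), lowered_2.ravel())  # same numbers: [-1. 1. 0. 0.]
print(lowered_1.shape, lowered_2.shape)      # different shapes: (4, 1) (1, 4)
```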


Remark:

At first sight this question may look so trivial and stupid that it should just be downvoted without mercy. But I ask you to please bear with me; this is not obvious to me at all.


Update in response to comments by @JackozeeHakkiuz and @Kurt G.

In what follows I'm going to write out every little tedious step in the calculation, this may be painful or incredibly annoying to read for those of you well-acquainted with tensor expression manipulations. But I am a beginner to this, and it is clear that I do not understand how to work with tensors, so I feel I must labour the point. I'm sorry about this.

First, let me address the commutativity of $a_\mu=a^\nu\eta_{\mu\nu}=\eta_{\mu\nu}a^\nu$.

Now I mentioned in the comments that I do not wish to compute these four-vectors using matrix multiplication (as that was what led to the confusion to begin with). Instead I will use the Einstein summation convention and then try to explain what the problem is:

So, starting with $a^\nu\eta_{\mu\nu}$, $$a_\mu=a^\nu\eta_{\mu\nu}=a^\nu\eta_{\nu\mu}=\begin{pmatrix}a^0\eta_{00} + a^1 \eta_{10} + a^2\eta_{20}+ a^3\eta_{30}\\a^0\eta_{01} + a^1 \eta_{11} + a^2\eta_{21}+ a^3\eta_{31}\\a^0\eta_{02} + a^1 \eta_{12} + a^2\eta_{22}+ a^3\eta_{32}\\a^0\eta_{03} + a^1 \eta_{13} + a^2\eta_{23}+ a^3\eta_{33}\end{pmatrix}$$

$$=\begin{pmatrix}a^0\eta_{00} + 0 + 0 + 0\\0 + a^1 \eta_{11} + 0 + 0\\0 + 0 + a^2\eta_{22}+ 0\\0 + 0 + 0+ a^3\eta_{33}\end{pmatrix}=\begin{pmatrix}a^0\eta_{00}\\a^1 \eta_{11} \\a^2\eta_{22}\\a^3\eta_{33}\end{pmatrix}=\begin{pmatrix}1\times (-1)\\ 1\times (1) \\0 \times (1)\\ 0\times (1)\end{pmatrix}=\begin{pmatrix}-1\\ 1 \\ 0 \\ 0\end{pmatrix}$$

Where I have used the four-vector $a^\mu=(1,1,0,0)$ and the fact that $\eta$ is diagonal: $\eta_{00}=-1$, $\eta_{11}=\eta_{22}=\eta_{33}=1$ which indeed matches the expression for $a_\mu$ in the author's solution above.

Now for the reverse order, $\eta_{\mu\nu}a^\nu$,

$$a_\mu=\eta_{\mu\nu}a^\nu=\begin{pmatrix}\eta_{00}a^0 + \eta_{01}a^1 + \eta_{02}a^2+ \eta_{03}a^3\\\eta_{10}a^0 + \eta_{11}a^1 + \eta_{12}a^2+ \eta_{13}a^3\\\eta_{20}a^0 + \eta_{21}a^1 + \eta_{22}a^2+ \eta_{23}a^3\\\eta_{30}a^0 + \eta_{31}a^1 + \eta_{32}a^2+ \eta_{33}a^3\end{pmatrix}$$

$$=\begin{pmatrix}\eta_{00}a^0 + 0 + 0 + 0\\0 + \eta_{11}a^1 + 0 + 0\\0 + 0 + \eta_{22}a^2+ 0\\0 + 0 + 0+ \eta_{33}a^3\end{pmatrix}=\begin{pmatrix}\eta_{00}a^0\\ \eta_{11}a^1 \\\eta_{22}a^2\\\eta_{33}a^3\end{pmatrix}=\begin{pmatrix}(-1)\times 1\\ (1)\times 1 \\(1) \times 0\\ (1)\times 0\end{pmatrix}=\begin{pmatrix}-1\\ 1 \\ 0 \\ 0\end{pmatrix}$$
which shows that $a_\mu=a^\nu\eta_{\mu\nu}=\eta_{\mu\nu}a^\nu$ (the tensor factors commute).
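This can also be checked mechanically (my own sketch): since each component is a finite sum of products of ordinary numbers, the order in which the factors are written cannot matter, and an index-aware tool such as NumPy's `einsum` confirms it.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
a = np.array([1.0, 1.0, 0.0, 0.0])

# a_mu = eta_{mu nu} a^nu, with the factors written in either order
first  = np.einsum('mn,n->m', eta, a)  # eta_{mu nu} a^nu
second = np.einsum('n,mn->m', a, eta)  # a^nu eta_{mu nu}

print(np.allclose(first, second))      # True: both give [-1, 1, 0, 0]
```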


But do factors in tensor expressions always commute?

As an illustration, I am going to use eqn. $(\mathrm{d})$ from the author's solution.

I could compute $W^{\nu\mu}$ in two different ways:

Firstly, I could write
$$W^{\nu\mu}=c^\nu \color{red}{a^\mu} = \begin{pmatrix}c^0a^0&c^0a^1&c^0a^2&c^0a^3\\c^1a^0&c^1a^1&c^1a^2&c^1a^3\\c^2a^0&c^2a^1&c^2a^2&c^2a^3\\c^3a^0&c^3a^1&c^3a^2&c^3a^3\end{pmatrix}$$

$$=\begin{pmatrix}0\times (1)& 0\times(1)&0\times (0)&0\times (0)\\0\times (1)& 0\times(1)&0\times (0)&0\times (0)\\0\times (1)& 0\times(1)&0\times (0)&0\times (0)\\1\times (1)& 1\times(1)&1\times (0)&1\times (0)\end{pmatrix}=\begin{pmatrix}0&0&0&0\\0&0&0&0\\0&0&0&0\\1& 1&0&0\end{pmatrix}$$
Great, that matches the expression $(\mathrm{d})$ as written by the author, where I used $c^\mu=(0,0,0,1)$ and $a^\mu=(1,1,0,0)$.

But I am given to understand that the factors in tensor expressions commute, so what is to stop me from writing
$$W^{\nu\mu}\stackrel{\color{blue}{?}}{=}\color{red}{a^\mu} c^\nu = \begin{pmatrix}a^0c^0&a^0c^1&a^0c^2&a^0c^3\\a^1c^0&a^1c^1&a^1c^2&a^1c^3\\a^2c^0&a^2c^1&a^2c^2&a^2c^3\\a^3c^0&a^3c^1&a^3c^2&a^3c^3\end{pmatrix}$$

$$=\begin{pmatrix}(1)\times 0& (1)\times 0&(1)\times 0&(1)\times 1\\(1)\times 0& (1)\times 0&(1)\times 0&(1)\times 1\\(0)\times 0& (0)\times 0&(0)\times 0&(0)\times 1\\(0)\times 0& (0)\times 0&(0)\times 0&(0)\times 1\end{pmatrix}=\begin{pmatrix}0&0&0&1\\0&0&0&1\\0&0&0&0\\0& 0&0&0\end{pmatrix}?$$

So $$\bbox[5px,border:2px solid darkgreen]
{W^{\nu\mu}\stackrel{\color{blue}{?}}{=}c^\nu \color{red}{a^\mu}\ne \color{red}{a^\mu} c^\nu}$$

I think I'm misunderstanding the comments regarding commutativity, but from what was written my impression was that all the information is captured in the tensor indices, and hence that I can commute tensor factors at will in any tensor expression. This was stated for the case $$a_\mu=a^\nu\eta_{\mu\nu}=\eta_{\mu\nu}a^\nu$$ and I assumed it generalized to any tensor expression. Clearly that is not the case for $W^{\nu\mu}$, so there is something else going on here that I don't understand.
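For what it's worth, an index-aware computation makes the distinction explicit (again a sketch of my own): as long as the output indices are named, the order of the arguments is irrelevant, whereas a bare outer product silently commits to "first factor = row index".

```python
import numpy as np

a = np.array([1.0, 1.0, 0.0, 0.0])
c = np.array([0.0, 0.0, 0.0, 1.0])

# W^{nu mu} = c^nu a^mu, with the factors written in either order
W1 = np.einsum('n,m->nm', c, a)   # c^nu a^mu
W2 = np.einsum('m,n->nm', a, c)   # a^mu c^nu -- the same tensor
print(np.array_equal(W1, W2))     # True

# np.outer, by contrast, always uses its first argument as the row index,
# so swapping the arguments transposes the result
print(np.array_equal(np.outer(c, a), np.outer(a, c)))  # False
```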



CORRECTIONS:

Sorry for the late edit; I've just realised I made a typo in the second expression involving $W^{\nu\mu}$: in the previous version I (mistakenly) switched the order of the indices $\nu$ and $\mu$ instead of the order of the four-vectors $c^\nu$ and $a^\mu$. I have marked the relevant areas in red, and it is the equation in the green box that I am questioning. Apologies for the excessively long post.

Best Answer

All of your equalities are correct except for the ones marked in red:

\begin{align} a_\mu &=\eta_{\nu\mu}a^\nu \\ &=\eta_{\mu\nu}a^\nu \\ &\mathbf{\color{red}{=}} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} \\ &= \begin{pmatrix} -1 \\ 1 \\ 0 \\ 0 \end{pmatrix} \\ &=\begin{pmatrix} -1&1&0&0 \end{pmatrix}^T \tag{1} \end{align}

\begin{align} a_\mu &=a^\nu\eta_{\nu\mu} \\ &\mathbf{\color{red}{=}} \begin{pmatrix}1&1&0&0\end{pmatrix}\begin{pmatrix}-1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{pmatrix} \\ &=\begin{pmatrix}-1&1&0&0\end{pmatrix} .\tag{2} \end{align} You should be suspicious of the two equalities marked in red: in both cases you have a number on the left-hand side and a matrix on the right-hand side. Strictly speaking they make no sense; more precisely, each is an abuse of notation. The problem arises because you are abusing notation in two different, inconsistent ways.

E.g. take the vector $v$ with components $(1,1,2,8)$. It is not entirely correct to write either $$v^\mu = \begin{pmatrix} 1 & 1 & 2 & 8 \end{pmatrix}$$ or $$v^\mu = \begin{pmatrix} 1 \\ 1 \\ 2 \\ 8 \end{pmatrix}$$ because the left-hand sides are just numbers, agnostic about whether you put them in a row or in a column, while on the right-hand sides you have a row or a column matrix (respectively); there the choice has already been made, and no freedom remains. The problem is that there is no way to make consistent choices everywhere at once.
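If it helps, the same point can be phrased in NumPy (a minimal sketch, not part of the original question): a plain one-dimensional array is neither a row nor a column until you reshape it.

```python
import numpy as np

v = np.array([1.0, 1.0, 2.0, 8.0])  # just four numbers, shape (4,)

row = v.reshape(1, 4)  # now a 1x4 row matrix: the choice has been made
col = v.reshape(4, 1)  # now a 4x1 column matrix: a different choice

print(row.shape, col.shape)        # (1, 4) (4, 1)
print(np.array_equal(row, col.T))  # True: same components either way
```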

As I said above, matrices are just a tool to compute several numbers at once. You can write the same numbers in a row or in a column at will, and both the row and the column matrices will contain the components $v^\mu$ of $v$. If you want, you could introduce notation to make this choice explicit: $\DeclareMathOperator{\row}{row}\DeclareMathOperator{\col}{col}\DeclareMathOperator{\matr}{matr}$ so that $$\row(v^\mu) = \begin{pmatrix} 1 & 1 & 2 & 8 \end{pmatrix}$$ and $$\col(v^\mu) = \begin{pmatrix} 1 \\ 1 \\ 2 \\ 8 \end{pmatrix}$$ Moreover, if $b_{\mu\nu}$ are sixteen numbers, also write $$\matr(b_{\mu\nu}) = \begin{pmatrix} b_{00} & b_{01} & b_{02} & b_{03} \\ b_{10} & b_{11} & b_{12} & b_{13} \\ b_{20} & b_{21} & b_{22} & b_{23} \\ b_{30} & b_{31} & b_{32} & b_{33} \\ \end{pmatrix}$$ so that your calculations become correct (now marked in green):

\begin{align} \col(a_\mu) &=\col(\eta_{\nu\mu}a^\nu) \\ &=\col(\eta_{\mu\nu}a^\nu) \\ &=\mathbf{\matr(\eta_{\mu\nu})}\col(a^\nu) \\ &\color{green}{=} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} \\ &= \begin{pmatrix} -1 \\ 1 \\ 0 \\ 0 \end{pmatrix} \\ &=\begin{pmatrix} -1&1&0&0 \end{pmatrix}^T \tag{1} \end{align}

\begin{align} \row(a_\mu) &=\row(a^\nu\eta_{\nu\mu}) \\ &=\row(a^\nu)\matr(\eta_{\nu\mu}) \\ &\mathbf{\color{green}{=}} \begin{pmatrix}1&1&0&0\end{pmatrix}\begin{pmatrix}-1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{pmatrix} \\ &=\begin{pmatrix}-1&1&0&0\end{pmatrix} \tag{2} \end{align}

So, in general, for any four numbers $a^{\mu}$ and any sixteen numbers $b_{\mu\nu}$, you have

\begin{align} \row(b_{\mu\nu}a^\mu) &=\row(a^\mu)\matr(b_{\mu\nu}) \\ \row(b_{\nu\mu}a^\mu) &=\row(a^\mu)\matr(b_{\nu\mu})^T \\ \col(b_{\mu\nu}a^\mu) &=\matr(b_{\mu\nu})^T\col(a^\mu) \\ \col(b_{\nu\mu}a^\mu) &=\matr(b_{\nu\mu})\col(a^\mu) \\ \end{align} etc.
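These identities are easy to verify numerically; here is a minimal NumPy sketch using a random matrix and vector, with $\matr$, $\row$ and $\col$ realized as shapes `(4, 4)`, `(1, 4)` and `(4, 1)`.

```python
import numpy as np

rng = np.random.default_rng(0)
b = rng.normal(size=(4, 4))  # matr(b_{mu nu}): entry [mu, nu] is b_{mu nu}
a = rng.normal(size=4)       # the four numbers a^mu

row = lambda x: np.reshape(x, (1, 4))
col = lambda x: np.reshape(x, (4, 1))

# row(b_{mu nu} a^mu) = row(a^mu) matr(b_{mu nu})
print(np.allclose(row(np.einsum('mn,m->n', b, a)), row(a) @ b))    # True
# row(b_{nu mu} a^mu) = row(a^mu) matr(b_{nu mu})^T
print(np.allclose(row(np.einsum('nm,m->n', b, a)), row(a) @ b.T))  # True
# col(b_{mu nu} a^mu) = matr(b_{mu nu})^T col(a^mu)
print(np.allclose(col(np.einsum('mn,m->n', b, a)), b.T @ col(a)))  # True
# col(b_{nu mu} a^mu) = matr(b_{nu mu}) col(a^mu)
print(np.allclose(col(np.einsum('nm,m->n', b, a)), b @ col(a)))    # True
```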

I hope this made it clear that the commutativity of the expressions is not the issue at all. The problem was that in one derivation you were implicitly converting a tuple of four numbers into a row, and in the other derivation you were converting that same tuple of numbers into a column. Obviously you would get different objects.