Indices of the Minkowski metric and Lorentz transformation

Tags: matrix-elements, metric-tensor, special-relativity

I am currently studying special relativity and I keep stumbling over a concept I can't make consistent for myself: which index of a Lorentz transformation and of the Minkowski metric denotes a column, and which one denotes a row.

My thoughts so far:
If I look at the matrix multiplication $\mathbf{x}^{'\nu}=\mathbf{\Lambda}^\nu\,_\mu\,\mathbf{x}^\mu$, then the upper index $\mu$ must indicate a row, since $\mathbf{x}$ is a column vector and its index therefore specifies the entry (row). The same holds for the upper index $\nu$ of $\mathbf{x}^{'\nu}$. Additionally, I know that the product describes a matrix multiplied with a vector and that, by the Einstein summation convention, the expression is summed over $\mu$. So if I am interested in the first entry of $\mathbf{x}^{'}$, namely $\mathbf{x}^{'0}$, I have to multiply the first row of $\mathbf{\Lambda}$, namely $\mathbf{\Lambda}^0\,_\mu$, with the column vector $\mathbf{x}$. Therefore the upper index $\nu$ of the Lorentz transformation labels the row of the matrix and the lower index $\mu$ the column. So far so clear.
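To make this concrete for myself, here is a minimal numpy sketch (the boost matrix and the four-vector are just made-up illustrative values): storing $\mathbf{\Lambda}^\nu\,_\mu$ with $\nu$ as the row and $\mu$ as the column makes the index sum exactly an ordinary matrix-vector product.

```python
import numpy as np

# A hypothetical boost along x with beta = 0.5, just for illustration
beta = 0.5
gamma = 1.0 / np.sqrt(1.0 - beta**2)

# Lam[nu, mu]: the upper index nu labels the row, the lower index mu the column
Lam = np.array([
    [ gamma,       -gamma*beta, 0.0, 0.0],
    [-gamma*beta,   gamma,      0.0, 0.0],
    [ 0.0,          0.0,        1.0, 0.0],
    [ 0.0,          0.0,        0.0, 1.0],
])

x = np.array([1.0, 2.0, 3.0, 4.0])   # contravariant components x^mu

# x'^nu = Lambda^nu_mu x^mu: summing over mu is exactly a matrix-vector product
x_prime = np.einsum('nm,m->n', Lam, x)
assert np.allclose(x_prime, Lam @ x)
```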

Now I encounter some difficulties. For example, our professor writes the following: $x_\nu=\eta_{\mu\nu}x^\mu$. (First question: strictly speaking, and as also written here https://en.wikipedia.org/wiki/Raising_and_lowering_indices (example in Minkowski spacetime), is $x_\nu$ a row vector?) The same reasoning as above cannot be used here, since $\eta$ carries no upper/lower index pair to distinguish. But following the same logic, $\mu$ in $\mathbf{x}^\mu$ labels a row, and therefore in $\eta_{\mu\nu}$ the index $\mu$ must label a column and $\nu$ a row. So now the latter index is the row index (as I understand it). Now the inconsistencies start: in the book here, equations (5.12) and (5.13), the authors say that $\Lambda^\mu\,_\alpha \eta_{\mu\nu}\Lambda^\nu\,_\beta$ is not a matrix multiplication, because the index $\mu$ is a column index in the first Lorentz transformation as well as in the Minkowski metric.
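Again only as a sketch (same illustrative boost as above, and assuming the signature $(-,+,+,+)$ for $\eta$), I tried to spell the two expressions out numerically. Under the convention that the first index labels the row, the contraction $\Lambda^\mu\,_\alpha\eta_{\mu\nu}\Lambda^\nu\,_\beta$ agrees with the matrix product $\Lambda^T\eta\Lambda$, i.e. a transpose appears on the first factor:

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric, signature (-,+,+,+) assumed

beta = 0.5
gamma = 1.0 / np.sqrt(1.0 - beta**2)
Lam = np.array([
    [ gamma,       -gamma*beta, 0.0, 0.0],
    [-gamma*beta,   gamma,      0.0, 0.0],
    [ 0.0,          0.0,        1.0, 0.0],
    [ 0.0,          0.0,        0.0, 1.0],
])

x = np.array([1.0, 2.0, 3.0, 4.0])

# x_nu = eta_{mu nu} x^mu: einsum only matches index labels, it never asks
# which axis of eta is the "row"
x_lower = np.einsum('mn,m->n', eta, x)

# Lambda^mu_alpha eta_{mu nu} Lambda^nu_beta: the summed index mu sits in the
# first slot of both Lam and eta, so as a matrix product this is Lam.T @ eta @ Lam
lhs = np.einsum('ma,mn,nb->ab', Lam, eta, Lam)
assert np.allclose(lhs, Lam.T @ eta @ Lam)
assert np.allclose(lhs, eta)   # a Lorentz transformation preserves the metric
```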

I also know (very little, to be honest) about tensors and that they play an important role in the raising and lowering of indices, but there has to be a self-consistent answer to my index problem somewhere. I have yet to find a satisfying answer and would be grateful for any help you can provide.

Edit: I have found a post in which one answer references another question whose author says the left-most index indicates the row. That would at least support my claim about the index labels of the Lorentz transformation, yet the problem with the Minkowski metric remains.

Best Answer

When tensor notation is first introduced, it can help the beginner student to show matrix equations that do the same thing, so that they can see it is just a linear transformation. A common convention is to have column matrices represent contravariant vectors (the ones with upper indices), and to write a chain of matrix multiplications in tensor notation as $x^\alpha=A^\alpha_\beta B^\beta_\gamma C^\gamma_\delta\dots X^\mu_\nu x^\nu$, sorted into an order where the lower index of each term is the same as the upper index of the following term. But matrices can only handle a handful of cases: those where you have vectors or 1-forms, written $x^\mu$ or $x_\mu$ respectively, and $\left(\begin{smallmatrix}1\\1\end{smallmatrix}\right)$-tensors $X^\mu_\nu$. For any other sort of tensor, the analogy breaks. And with matrices you have to get the order right, or again it breaks. A row vector times a column vector is not the same thing as a column vector times a row vector.
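To see that last point concretely, here is a throwaway numpy sketch with made-up numbers; the same two arrays give a scalar in one order and a full matrix in the other:

```python
import numpy as np

u = np.array([[1.0, 2.0, 3.0]])      # a 1x3 row vector
v = np.array([[4.0], [5.0], [6.0]])  # a 3x1 column vector

print(u @ v)   # 1x1: the inner product, [[32.]]
print(v @ u)   # 3x3: the outer product, a completely different object
```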

So the best thing to do, once the idea has been introduced, is to emphasise to the student that tensors are not the same thing as matrices. After the first week, don't try to interpret everything as matrices or row/column vectors, because in a lot of cases it doesn't work, and it's likely to mislead you.

A tensor, in coordinate form, is better thought of as an $n$-dimensional array of numbers, without worrying about what direction each axis is in. The dimensions are better thought of as first, second, third, ... (upper or lower) index rather than row, column, rising-up-out-of-the-page, ... so we don't get stuck once we get beyond two or maybe three dimensions. The rule for combining them is the Einstein summation convention, which says you sum over each repeated pair of upper/lower indices. That means you can write them in any order, because the index labels will tell you which dimensions to combine. It means you can have arrays of three or four or five dimensions, and not have to worry about how to extend the convention "rows in the first matrix are multiplied by columns in the second matrix" to something that won't fit into a flat 2D page. It means you can combine dimensions in tensors that are not sat next to one another. It's far more powerful and general, and in some ways simpler.
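That label-driven bookkeeping is exactly what numpy's einsum implements, which gives a quick way to convince yourself that the order of the factors doesn't matter and that higher-rank arrays pose no problem. The arrays below are stand-ins filled with made-up values, purely to illustrate the convention:

```python
import numpy as np

rng = np.random.default_rng(0)
Lam = rng.normal(size=(4, 4))         # stand-in for Lambda^mu_nu
eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # stand-in for eta_{mu nu}
T = rng.normal(size=(4, 4, 4))        # a rank-3 array T^{mu nu rho}, no matrix analogue

# The index labels, not the order of the arguments, decide which axes are summed:
a = np.einsum('ma,mn,nb->ab', Lam, eta, Lam)
b = np.einsum('mn,nb,ma->ab', eta, Lam, Lam)
assert np.allclose(a, b)

# Contracting the second index of T with eta -- something with no neat
# row/column picture -- is just another einsum:
T_lowered = np.einsum('abc,bn->anc', T, eta)
```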
