Difference between upper and lower indices in Einstein notation

general-relativity, metric-tensor, notation, special-relativity, tensor-calculus

Consider a $(2,0)$ tensor $X^{\mu \nu}$ that can be represented in matrix form by:

$$X^{\mu \nu} =
\begin{pmatrix}
a & b & c & d \\
e & f & g & h \\
i & j & k & l \\
m & n & o & p
\end{pmatrix}\tag{1}$$

Here $a, b, c, \ldots, p \in \mathbb{R}$ are scalars.

We can obtain ${X_{\mu}}^{\nu}$ and ${X^{\mu}}_{\nu}$ by using the Minkowski metric $\eta_{\mu \nu}$:
$${X^{\mu}}_{\nu} = X^{\mu\sigma}\eta_{\sigma\nu} = \eta_{\nu \sigma}X^{\mu\sigma}$$
and
$${X_{\mu}}^{\nu} = \eta_{\mu \sigma}X^{\sigma\nu} = X^{\sigma\nu}\eta_{\sigma \mu},$$ where we have used $\eta_{\sigma \mu} = \eta_{\mu \sigma}$, since the Minkowski metric is symmetric.

${X^{\mu}}_{\nu}$ and ${X_{\mu}}^{\nu}$ can be represented in matrix form by combining the matrix representations of $X^{\mu \nu}$ and $\eta_{\mu \sigma}$. If $\mu$ always labels rows and $\nu$ always labels columns, then ${X_{\nu}}^{\mu}$ and ${X_{\mu}}^{\nu}$ are transposes of each other. My question is: what is the matrix representation of each one, and how can we uniquely distinguish the two matrices so that they cannot be interchanged?

${X_{\nu}}^{\mu} \neq {X_{\mu}}^{\nu}$ in general (unless the matrix is symmetric), but what is the representation of each matrix in terms of the elements $a, b, c, d$, etc. of $X^{\mu \nu}$?
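As a concrete sketch of these contractions (assuming NumPy, the $(-,+,+,+)$ Minkowski metric, and the convention that the first index labels rows; the entries of $X$ are arbitrary):

```python
import numpy as np

# Minkowski metric in the (-,+,+,+) signature
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# A generic (2,0) tensor X^{mu nu}; entries play the role of a..p in equation (1)
X_uu = np.arange(1.0, 17.0).reshape(4, 4)

# X^mu_nu = X^{mu sigma} eta_{sigma nu}  ->  X @ eta (flips the sign of column 0)
X_ud = X_uu @ eta

# X_mu^nu = eta_{mu sigma} X^{sigma nu}  ->  eta @ X (flips the sign of row 0)
X_du = eta @ X_uu

print(X_ud)
print(X_du)
```

Right-multiplication by $\eta$ lowers the second index (negating the time column), while left-multiplication lowers the first index (negating the time row); for an antisymmetric $X$ this is exactly the sign pattern seen in the electromagnetic-tensor example below.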

The context of my question is that textbooks give the electromagnetic tensor (in the $(-,+,+,+)$ metric signature) as

$$F^{\mu \nu} =
\begin{pmatrix}
0 & E_x & E_y & E_z \\
-E_x & 0 & B_z & -B_y \\
-E_y & -B_z & 0 & B_x \\
-E_z & B_y & -B_x & 0
\end{pmatrix}\tag{2}$$

and then, using the Minkowski metric, lower one index at a time to get

$${F^{\mu}}_{\nu} =
\begin{pmatrix}
0 & E_x & E_y & E_z \\
E_x & 0 & B_z & -B_y \\
E_y & -B_z & 0 & B_x \\
E_z & B_y & -B_x & 0
\end{pmatrix}\tag{3}$$

and

$${F_{\mu}}^{\nu} =
\begin{pmatrix}
0 & -E_x & -E_y & -E_z \\
-E_x & 0 & B_z & -B_y \\
-E_y & -B_z & 0 & B_x \\
-E_z & B_y & -B_x & 0
\end{pmatrix}\tag{4}$$

My question is how to distinguish the two. Surely, once $F^{\mu \nu}$ is given, the components of ${F^{\mu}}_{\nu}$ and ${F_{\mu}}^{\nu}$ can be found in a universally accepted way, so that we know which one is which. Is it that the upper index always numbers the different columns, and the lower index always numbers the different rows?
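The passage from (2) to (3) and (4) can be checked numerically. The following sketch (assuming NumPy, with arbitrary sample values for the field components) builds $F^{\mu\nu}$ from equation (2) and recovers the matrices in (3) and (4) by right- and left-multiplication with $\eta$:

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # eta_{mu nu}, signature (-,+,+,+)
Ex, Ey, Ez = 1.0, 2.0, 3.0             # arbitrary sample field values
Bx, By, Bz = 4.0, 5.0, 6.0

# F^{mu nu} as in equation (2); the first index labels rows
F_uu = np.array([
    [0.0,  Ex,  Ey,  Ez],
    [-Ex, 0.0,  Bz, -By],
    [-Ey, -Bz, 0.0,  Bx],
    [-Ez,  By, -Bx, 0.0],
])

# F^mu_nu = F^{mu sigma} eta_{sigma nu}: right multiplication flips the time column
F_ud = F_uu @ eta
# F_mu^nu = eta_{mu sigma} F^{sigma nu}: left multiplication flips the time row
F_du = eta @ F_uu

print(F_ud)   # matches the sign pattern of equation (3)
print(F_du)   # matches the sign pattern of equation (4)
```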

Best Answer

If $\mathbf X$ is a $(2,0)$-tensor, then that means that $\mathbf X$ is a map which eats two covectors $\boldsymbol \alpha$ and $\boldsymbol \beta$ and spits out a scalar $\mathbf X(\boldsymbol \alpha,\boldsymbol \beta)$. If we choose a basis $\hat e_\mu$ for the vector space and the corresponding basis $\hat \epsilon^\mu$ for the dual space (where $\hat \epsilon^\mu(\hat e_\nu) = \delta^\mu_{\nu}$), then the components $X^{\mu\nu}$ are defined by $$X^{\mu\nu} \equiv \mathbf X(\hat \epsilon^\mu, \hat \epsilon^\nu)$$


Given any $(0,2)$-tensor $\boldsymbol \eta$, we can define a map from the space of vectors to the space of covectors. A vector $\mathbf v$ is mapped to the covector $\mathbf v^\flat$ which acts on a vector $\mathbf q$ as $$\mathbf v^\flat(\mathbf q) = \boldsymbol \eta(\mathbf q,\mathbf v)$$ The components of $\mathbf v^\flat$ in a particular basis are given by $$(\mathbf v^\flat)_\mu \equiv \mathbf v^\flat(\hat e_\mu) = \boldsymbol \eta(\hat e_\mu, \mathbf v)$$ $$= \boldsymbol \eta(\hat e_\mu, v^\nu \hat e_\nu) = v^\nu \boldsymbol \eta(\hat e_\mu,\hat e_\nu) \equiv \eta_{\mu\nu} v^\nu$$
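A minimal numerical illustration of the lowering map (assuming NumPy and the Minkowski $\boldsymbol\eta$; the vector components are arbitrary):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # eta_{mu nu}
v = np.array([1.0, 2.0, 3.0, 4.0])     # components v^mu in the chosen basis

# (v^flat)_mu = eta_{mu nu} v^nu : only the time component changes sign
v_flat = eta @ v
print(v_flat)   # [-1.  2.  3.  4.]
```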

If $\boldsymbol \eta$ happens to be non-degenerate, then this mapping is invertible; we define the $(2,0)$-tensor $\boldsymbol \eta^\uparrow$ whose components $(\boldsymbol \eta^\uparrow)^{\mu\nu}$ are the matrix inverse of the components $\eta_{\mu\nu}$, so $$(\boldsymbol \eta^\uparrow)^{\mu\nu} \eta_{\nu \rho} = \delta^\mu_\rho$$

With this structure in place, we can map any vector $\mathbf v$ to a covector $\mathbf v^\flat$, and any covector $\boldsymbol \alpha$ to a vector $\boldsymbol \alpha^\sharp$ which is defined as you might expect: $$\boldsymbol \alpha^\sharp(\boldsymbol \beta) = \boldsymbol \eta^\uparrow(\boldsymbol \beta,\boldsymbol \alpha)$$ $$(\boldsymbol \alpha^\sharp)^\mu = (\boldsymbol \eta^\uparrow)^{\mu\nu} \alpha_\nu$$
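Continuing the sketch (again assuming NumPy), the inverse-metric contraction of the previous paragraph and the round trip $\sharp \circ \flat = \mathrm{id}$ can both be checked directly:

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # eta_{mu nu}
eta_up = np.linalg.inv(eta)            # components of the inverse metric; for Minkowski, equal to eta itself

# (eta-inverse)^{mu nu} eta_{nu rho} = delta^mu_rho
assert np.allclose(eta_up @ eta, np.eye(4))

# Lower with flat, then raise with sharp: the original vector is recovered
v = np.array([1.0, 2.0, 3.0, 4.0])
v_flat = eta @ v
v_back = eta_up @ v_flat
assert np.allclose(v_back, v)
```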


Having defined this so-called musical isomorphism between vectors and covectors, we can address the remainder of your question. Given a $(2,0)$-tensor $\mathbf X$ and a special choice of non-degenerate $(0,2)$-tensor $\boldsymbol \eta$, we can define a $(1,1)$-tensor $\tilde {\mathbf X}$ which takes a covector $\boldsymbol \alpha$ in its first slot and a vector $\mathbf v$ in its second slot and returns the value $$\tilde{\mathbf X}(\boldsymbol \alpha,\mathbf v) = \mathbf X(\boldsymbol \alpha, \mathbf v^\flat) \qquad (\tilde {\mathbf X})^\mu_{\ \ \nu} \equiv \tilde {\mathbf X}(\hat \epsilon^\mu,\hat e_\nu) = X^{\mu\rho}\eta_{\nu\rho}$$ We could also define a $(1,1)$-tensor $\tilde {\mathbf X}^T$ which takes a vector $\mathbf v$ in its first slot and a covector $\boldsymbol \alpha$ in its second slot and returns the value $$\tilde {\mathbf X}^T(\mathbf v,\boldsymbol \alpha)= \mathbf X(\mathbf v^\flat,\boldsymbol \alpha) \qquad (\tilde {\mathbf X}^T)_\mu^{\ \ \nu} \equiv \tilde {\mathbf X}^T(\hat e_\mu, \hat \epsilon^\nu) = \eta_{\mu\rho} X^{\rho\nu}$$
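In components, these two contractions can be written with `np.einsum`, which mirrors the index placement exactly (a sketch assuming NumPy; the entries of $X$ are arbitrary):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
X = np.arange(1.0, 17.0).reshape(4, 4)       # components X^{mu nu}

# (tilde X)^mu_nu = X^{mu rho} eta_{nu rho}
X_tilde = np.einsum('mr,nr->mn', X, eta)
# (tilde X^T)_mu^nu = eta_{mu rho} X^{rho nu}
X_tildeT = np.einsum('mr,rn->mn', eta, X)

# In matrix language these are X @ eta and eta @ X, since eta is symmetric
assert np.allclose(X_tilde, X @ eta)
assert np.allclose(X_tildeT, eta @ X)
```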


Much of this notation may appear foreign, but that's because it's traditional to drop the symbols $\sharp,\flat,\tilde{\ }, ^T$ and $\uparrow$ and distinguish e.g. $\boldsymbol \eta$ from $\boldsymbol \eta^\uparrow$ or $\mathbf v$ from $\mathbf v^\flat$ by the placement of the indices on their components. This saves a lot of writing and makes the expressions look neater, but comes at the cost of conceptual clarity. That's why I prefer to keep the extra symbols until a student gets so comfortable with them that they feel cumbersome and annoying, and only then drop them to yield expressions like $X_\mu^{\ \ \nu} = \eta_{\mu\rho} X^{\rho\nu}$, which (naively) would seem to suggest that the $X$ on the left is the same as the $X$ on the right when in fact they are different objects.

My question is: what is the matrix representation of each one, and how can we uniquely distinguish the two matrices so that they cannot be interchanged?

If you write down a list of numbers, there's no way for me to tell whether those numbers are the components of a covector or the components of a vector. You have to tell me in order for me to be able to interpret what you're saying correctly. Similarly, if you write down a square grid of numbers, I have no idea whether they're the components of a $(2,0)$-tensor, or a $(0,2)$-tensor, or a $(1,1)$-tensor, or of a linear transformation. You must provide this context.

Is it that the upper index always numbers the different columns, and the lower index always numbers the different rows?

Tradition dictates that the first (i.e. left-most) index labels the rows and the second index labels the columns, but that's just a convention. Whether the index is upstairs or downstairs is determined by whether the corresponding slot of the tensor eats a covector or vector, respectively. This in turn has implications for how the components of the tensor transform under a generic change of basis.
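As a final sketch of that last point (assuming NumPy; the boost rapidity is an arbitrary choice): a Lorentz boost $\Lambda$ preserves $\boldsymbol\eta$, the $(2,0)$ components transform as $\Lambda X \Lambda^T$, and the mixed $(1,1)$ components consequently transform by conjugation, $\Lambda X \Lambda^{-1}$:

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
phi = 0.5                                   # arbitrary boost rapidity along x
ch, sh = np.cosh(phi), np.sinh(phi)
L = np.array([                              # Lambda^mu_nu; first index labels rows
    [ch, sh, 0.0, 0.0],
    [sh, ch, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

# A Lorentz transformation preserves the metric: Lambda^T eta Lambda = eta
assert np.allclose(L.T @ eta @ L, eta)

X = np.arange(1.0, 17.0).reshape(4, 4)      # components X^{mu nu}
X_ud = X @ eta                              # mixed components X^mu_nu

# (2,0) components transform as Lambda X Lambda^T; lowering the second index
# of the transformed tensor matches conjugating the mixed components
X_new = L @ X @ L.T
X_ud_new = X_new @ eta
assert np.allclose(X_ud_new, L @ X_ud @ np.linalg.inv(L))
```

This is why the slot type matters: the upper-upper, mixed, and lower-lower component arrays of the "same" object obey different transformation laws under a change of basis.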