I understand that in "normal" index notation the indexes can be thought of as coordinates of scalar values inside a tabular data structure, while in abstract index notation they cannot. However, I am not clear on what practical difference this makes when actually doing math. If you are doing numerical calculations then you need to plug actual components into your tensors, so that is not abstract; but is there any difference if you are doing symbolic/algebraic computations? The notations look identical, and even though the interpretation is different, expressions in both cases ultimately denote tensors. As far as I know the algebraic laws are the same. Are there manipulations that are valid in one but not in the other? If you see some tensor calculations, how can you tell whether abstract index notation is being used? If you are doing differential geometry with indexes, do you need to decide whether your indexes are abstract or not? Or am I just missing something?
[Math] the practical difference between abstract index notation and “ordinary” index notation
differential-geometry, tensors
Related Solutions
Generally speaking, if you have a tensor $T$ on a manifold, and a collection of (usually coordinate) vector fields $e_1, \cdots, e_n$, the "index notation" for $T$ is (let's assume for a moment that $T$ is bilinear):
$$T_{ij} = T(e_i,e_j)$$
meaning $T_{ij}$ is a real-valued function for each $i$ and $j$. $T_{ij}$ is defined wherever the vector fields $\{ e_i : i = 1,2,\cdots, n\}$ are defined. On a manifold with a metric (meaning an inner product on every tangent space), it is typical to define
$$g_{ij} = \langle e_i, e_j \rangle$$
where $\langle \cdot, \cdot \rangle$ is the inner product on the tangent spaces.
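As a concrete illustration of $g_{ij} = \langle e_i, e_j \rangle$, here is a small Python sketch (the function names are mine, not standard) computing the components of the Euclidean metric on $\mathbb R^2$ in polar coordinates, with the coordinate fields $\partial_r$ and $\partial_\theta$ written out in Cartesian components:

```python
import math

def polar_frame(r, t):
    # coordinate vector fields of polar coordinates, in Cartesian components:
    # e_r = d/dr = (cos t, sin t),  e_t = d/dt = (-r sin t, r cos t)
    e_r = (math.cos(t), math.sin(t))
    e_t = (-r * math.sin(t), r * math.cos(t))
    return [e_r, e_t]

def metric_components(r, t):
    # g_ij = <e_i, e_j> using the standard Euclidean inner product
    frame = polar_frame(r, t)
    def dot(u, v):
        return u[0] * v[0] + u[1] * v[1]
    return [[dot(ei, ej) for ej in frame] for ei in frame]
```

At every point this gives the familiar $g_{rr} = 1$, $g_{r\theta} = 0$, $g_{\theta\theta} = r^2$.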
The same idea applies if the tensor takes something other than two vectors as input. For example, the Riemann curvature tensor is sometimes thought of as a bilinear function from pairs of tangent vectors to the space of skew-adjoint linear transformations of the tangent space, i.e. at every point $p$ of the manifold it is a bilinear map $T_p N \times T_p N \to \operatorname{Hom}(T_p N, T_p N)$ taking values in the skew-adjoint maps (with respect to the inner product). So given $e_i, e_j \in T_p N$, $R(e_i,e_j)$ is a linear transformation of the tangent space, and $R(e_i,e_j)(e_k)$ is again a tangent vector, which you can express as a linear combination of the basis vectors: $R(e_i,e_j)(e_k) = \sum_l R^l_{ijk}e_l$. Equivalently, $R^l_{ijk} = e^*_l\big(R(e_i,e_j)(e_k)\big)$, where $e_1^*, \cdots, e_n^*$ is the basis of the dual space $T^*_p N$ corresponding to the collection $\{e_i\}$. One calls $R^l_{ijk}$ the Riemann tensor "in coordinates".
In case any of this is unfamiliar, $e^*_j(e_i) = 1$ only when $i=j$ and $e^*_j(e_i) = 0$ otherwise. Or "in coordinates" $e^*_j(e_i) = \delta_{ij}$.
I think many intro general relativity textbooks explain this fairly well nowadays. When I was an undergraduate I liked:
- *A First Course in General Relativity*, Second Edition, by Bernard F. Schutz.
The word tensor is often abused. Firstly, a tensor is simply an element of the tensor product of some vector spaces or bimodules or something. In this sense, of course there are non-square tensors. For example an element of $V\otimes_k W$ would be called a tensor, for any $k$-vector spaces $V$ and $W$. But the words covariant and contravariant don't have any meaning here.
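To make this first sense concrete: a decomposable tensor $v \otimes w \in V\otimes_k W$ with $\dim V = 2$ and $\dim W = 3$ is represented, in chosen bases, by a $2\times 3$ component array. A minimal sketch in plain Python (the variable names are purely illustrative):

```python
# components of v in a basis of V, and of w in a basis of W
v = [1, 2]
w = [3, 4, 5]

# the components of the simple (decomposable) tensor v ⊗ w form the
# outer product: a 2x3, hence "non-square", array
v_tensor_w = [[vi * wj for wj in w] for vi in v]
# -> [[3, 4, 5], [6, 8, 10]]
```

A general element of $V\otimes_k W$ is a sum of such decomposable tensors, so it is still represented by a $2\times 3$ array.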
Secondly (and this is more closely aligned with the topic of your question), tensor might also mean a tensor (in the first sense above) valued function on a manifold. For example, let $T_p M$ denote the tangent space to a smooth manifold at the point $p\in M$. A tensor can mean a choice of element $Z_p\in T_pM\otimes \cdots \otimes T_pM \otimes (T_pM)^\ast\otimes \cdots\otimes (T_pM)^\ast$ for each point $p\in M$, which depends differentiably on $p$. For example vector fields are tensors in this sense. The words covariant and contravariant have their origins here in how the coordinates of $Z$ behave with respect to coordinate transformations on $M$.
For a "non-square" tensor of this type, one especially important example is the second fundamental form. If $M^k$ is a Riemannian manifold isometrically immersed in some Riemannian manifold $N^{k+n}$, then the second fundamental form is roughly this: for a point $p\in M$ and a pair of tangent vectors $v,w\in T_pM\subset T_pN$, there is a normal vector $S_p(v,w)\in (T_pM)^\perp$ which is something like a second derivative (hence measures curvature). Since $S_p$ chews on two tangent vectors and spits out a normal vector, we can think of $S_p$ as an element of $(T_pM)^\perp\otimes (T_pM)^\ast\otimes (T_pM)^\ast$. This $S$ is a very important non-square tensor (dimensions $n\times k\times k$)!
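A hedged, concrete instance of this: for the graph $z = f(x,y)$ in $\mathbb R^3$ with $f(x,y) = \tfrac12(ax^2 + 2bxy + cy^2)$ (so $df(0) = 0$), the tangent plane at the origin is the $xy$-plane, the normal is $e_z$, and the second fundamental form at the origin is the Hessian of $f$ times $e_z$; here $k = 2$ and $n = 1$. A small Python sketch with arbitrarily chosen coefficients:

```python
# f(x, y) = (a x^2 + 2 b x y + c y^2) / 2; coefficients are illustrative
a, b, c = 2.0, 0.5, -1.0
hess = [[a, b],
        [b, c]]  # Hessian of f at the origin

def S0(v, w):
    # chews on two tangent vectors v, w in the xy-plane,
    # spits out a normal vector: S_0(v, w) = Hess f(0)(v, w) * e_z
    normal_part = sum(hess[i][j] * v[i] * w[j]
                      for i in range(2) for j in range(2))
    return (0.0, 0.0, normal_part)
```

In components $S_0$ is a $1\times 2\times 2$ array, the smallest interesting "non-square" case.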
For a more precise response to your questions:
Not really. Tensors don't really act on anything. However $\operatorname{End}(V)\cong V\otimes V^\ast$, so operators can be thought of as tensors, but not usually vice versa. A tensor usually just means an element of a tensor product of vector spaces (to a mathematician) or a tensor-valued function (to a physicist).
I would say this is right. Without any context there's no reason to call an element of $V\otimes W^\ast$ a tensor of type $(1,1)$ or $(2,0)$, or whatever. These notions are undefined in general.
No.
Yes! See above.
Best Answer
First I'd like to share my understanding of abstract index notation. I think this understanding is simpler and more intuitive than Penrose and Rindler's original definition. Your question will be answered later with an example.
Abstract index notation is merely a labelling of the "slots" of the tensor. For example, $T_{ab}{}^c$ is just an abbreviation of $$T(-_a, -_b, -^c).$$ Each "slot" is a parameter as the tensor is viewed as a multilinear map $T:V\times V \times V^* \to \mathbb R$.
You may already be familiar with the labelling-slots interpretation. But what exactly does "labelling" a slot mean? Here is my understanding: it means we can fill a specific slot with a vector (or dual vector) by specifying the label of the slot. For example, if we fill the slot labelled $a$ with a vector $u$, fill the slot labelled $b$ with a vector $v$, and fill the slot labelled $c$ with a dual vector $\omega$, we get $T(u, v, \omega)$; that is, $$ T(-_a, -_b, -^c)(\{a=u, \; b=v, \; c=\omega\}) = T(u, v, \omega). $$ Note that a set-like notation $\{a=u, \; b=v, \; c=\omega\}$ is used here, meaning that the order is irrelevant: $\{b=v, \; a=u, \; c=\omega\}$ and $\{a=u, \; b=v, \; c=\omega\}$ are the same.
There are two observations from this definition of filling slots:
The position order of the slots is significant. $$S_{ab} \neq S_{ba}$$ in the sense that $$ S(-_a, -_b)(\{a=u, \; b=v\}) = S(u, v) \neq S(v, u) = S(-_b, -_a)(\{a=u, \; b=v\}). $$
An index letter can be substituted with any Latin letter, since it's just a label for the slot. For example, $T_{ab} = S_{ab}$ implies $T_{cd} = S_{cd}$, because $T_{ab} = S_{ab}$ means that for any vectors $u$ and $v$ $$ T(-_a, -_b)(\{a=u, \; b=v\}) = S(-_a, -_b)(\{a=u, \; b=v\}), $$ that is, $$ T(u, v) = S(u, v), $$ and $T_{cd}=S_{cd}$ is equivalent to $T(u, v) = S(u, v)$ too. Note that this index substitution is different from the index reordering in observation 1: a substitution must be applied on both sides of an equation. We can't exchange $a$ and $b$ on only one side of the equation $S_{ab}=S_{ab}$ to get $S_{ab}=S_{ba}$.
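The two observations can be played with in code. Here is a minimal sketch (this `Tensor` class is a toy of my own, not a standard library): a tensor stores its multilinear map together with the ordered labels of its slots, and `fill` matches arguments to slots by label, so the order of the named arguments is irrelevant while the order of the slots is not:

```python
class Tensor:
    def __init__(self, func, labels):
        self.func = func      # underlying multilinear map, positional args
        self.labels = labels  # ordered slot labels, e.g. ('a', 'b')

    def fill(self, **by_label):
        # fill slots by label: arguments are matched to slots by name,
        # then passed to the underlying map in slot order
        args = [by_label[label] for label in self.labels]
        return self.func(*args)

# an asymmetric bilinear form on R^2: S(u, v) = u_0 * v_1
S = lambda u, v: u[0] * v[1]
S_ab = Tensor(S, ('a', 'b'))  # S_ab
S_ba = Tensor(S, ('b', 'a'))  # S_ba: same map, slots relabelled

u, v = (1, 2), (3, 4)
# observation 1: S_ab filled with {a=u, b=v} is S(u, v) = 1*4 = 4,
# while S_ba filled with {a=u, b=v} is S(v, u) = 3*2 = 6
```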
Now we can use abstract index notation to denote the tensor product and contraction operations in a coordinate-free way. For example, $U_{abcd} = T_{ab}S_{cd}$ denotes the tensor product $$ U(-_a, -_b, -_c, -_d) = T(-_a, -_b) \cdot S(-_c, -_d), $$ and $T_{ae}{}^{ed}$ denotes the contraction with respect to slots $b$ and $c$ of $T_{ab}{}^{cd}$: $$ T_{ae}{}^{ed} = C_b{}^c(T_{ab}{}^{cd}) = \sum_{\sigma}T(-_a, \frac{\partial}{\partial x^\sigma}, \mathrm dx^\sigma, -^d). $$ Another important operation is (partial) application, but since it's equivalent to a tensor product followed by a contraction, there is no need to introduce a new notation. For example, applying a vector $u$ to the slot $a$ of $T_{ab}$ is $$ T(u, -_b) = T_{ab}u^a, $$ where $T_{ab}u^a$ is a tensor product of $T$ and $u$ followed by a contraction: $C_a{}^c(T(-_a, -_b)\cdot u(-^c))$. The result has one free slot left, so it's a $(0, 1)$ tensor. That is, a $(0, 2)$ tensor partially applied with a vector is a $(0, 1)$ tensor.
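As a numeric sanity check of the contraction formula (the setup and names are my own), view a $(1,1)$ tensor on $\mathbb R^3$ as $T(v, \omega) = \omega(Mv)$ for a matrix $M$; contracting its two slots against the standard basis and its dual basis recovers the trace of $M$:

```python
# a (1,1) tensor on R^3 given by a matrix M: T(v, w_star) = w_star(M v)
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
n = len(M)

# standard basis e_sigma; its dual basis has the same components
basis = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def T(v, w_star):
    # T(-_a, -^b): one vector slot, one dual-vector slot
    return sum(w_star[i] * M[i][j] * v[j]
               for i in range(n) for j in range(n))

# the contraction sum_sigma T(e_sigma, e^sigma)
contraction = sum(T(basis[s], basis[s]) for s in range(n))
# equals the trace 1 + 5 + 9 = 15
```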
Example
Consider the following problem: suppose $$T_{abc} = \mathrm dx^\sigma_a \mathrm dx^\mu_b \mathrm dx^\nu_c,$$ where $\sigma$, $\mu$, and $\nu$ are 3 concrete numbers, then what is $T_{abc} + T_{bca}$?
This example more or less answers your question: what is the practical difference between abstract index notation and “ordinary” index notation. Abstract index notation is easier to read and understand especially when abstract indices and concrete indices are mixed.
In abstract index notation, there is a convention that abstract indices use Latin letters while concrete indices use Greek letters. So in this example we can easily see that $a$, $b$, and $c$ are abstract indices while $\sigma$, $\mu$, and $\nu$ are concrete indices.
The notation $\mathrm dx^\sigma_a$ is not very common, but it makes sense. The dual vector $\mathrm dx^\sigma$ is naturally a function that can act on a vector. The equation $T_{abc} = \mathrm dx^\sigma_a \mathrm dx^\mu_b \mathrm dx^\nu_c$, which is just an abbreviation of $$ T(-_a, -_b, -_c) = \mathrm dx^\sigma(-_a) \cdot \mathrm dx^\mu(-_b) \cdot \mathrm dx^\nu(-_c), $$ means that when slot $a$ is filled with a vector $u$, $\mathrm dx^\sigma$ acts on that $u$.
To solve the problem, use index substitution, we can get $T_{bca} = \mathrm dx^\sigma_b \mathrm dx^\mu_c \mathrm dx^\nu_a$. So $$ \begin{aligned} T_{abc} + T_{bca} &= \mathrm dx^\sigma_a \mathrm dx^\mu_b \mathrm dx^\nu_c + \mathrm dx^\sigma_b \mathrm dx^\mu_c \mathrm dx^\nu_a \\ &= \mathrm dx^\sigma_a \mathrm dx^\mu_b \mathrm dx^\nu_c + \mathrm dx^\nu_a \mathrm dx^\sigma_b \mathrm dx^\mu_c \\ &= (\mathrm dx^\sigma \otimes \mathrm dx^\mu \otimes \mathrm dx^\nu + \mathrm dx^\nu \otimes \mathrm dx^\sigma \otimes \mathrm dx^\mu)(-_a, -_b, -_c). \end{aligned} $$
This is quite straightforward. On the other hand, if you use concrete index notation to solve this problem, first you need to figure out that the components of $T$ are all zero except $$T_{\xi\eta\zeta} = 1, \text{when}\; \xi=\sigma, \eta=\mu, \zeta=\nu.$$ Or $T_{\sigma\mu\nu}=1$. But what is $T_{bca}$? $T_{\mu\nu\sigma}$? No. You need to define another tensor $S_{\xi\eta\zeta}=T_{\eta\zeta\xi}$, and figure out that its components are all zero except $$ S_{\xi\eta\zeta} = 1, \text{when}\; \xi=\nu, \eta=\sigma, \zeta=\mu. $$ Then finally find the sum $T_{\xi\eta\zeta} + S_{\xi\eta\zeta}$. This procedure is quite complex and error-prone.
You could also translate the equation $$ T_{abc} = \mathrm dx^\sigma_a \mathrm dx^\mu_b \mathrm dx^\nu_c $$ into concrete index notation: $$T_{\xi\eta\zeta} = \mathrm dx^\sigma_\xi \mathrm dx^\mu_\eta \mathrm dx^\nu_\zeta. $$ But it doesn't help much. Now $\mathrm dx^\sigma_\xi$ is a tensor whose components are all zero except $$ \mathrm dx^\sigma_\xi = 1, \text{when}\; \xi=\sigma. $$ You still need to worry about components, which is not natural. And six indices are mixed together, three of them fixed numbers, which is very confusing.
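Either way, the abstract-index result can be sanity-checked numerically. Assuming for illustration that $\sigma, \mu, \nu = 0, 1, 2$ on $\mathbb R^3$, fill the slots with arbitrary vectors; note that $T_{bca}$ filled with $\{a=u,\; b=v,\; c=w\}$ is $T(v, w, u)$:

```python
sigma, mu, nu = 0, 1, 2  # three concrete index values, chosen for illustration

def T(u, v, w):
    # T = dx^sigma ⊗ dx^mu ⊗ dx^nu
    return u[sigma] * v[mu] * w[nu]

def answer(u, v, w):
    # the computed sum: dx^sigma ⊗ dx^mu ⊗ dx^nu + dx^nu ⊗ dx^sigma ⊗ dx^mu
    return u[sigma] * v[mu] * w[nu] + u[nu] * v[sigma] * w[mu]

u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 9)
# (T_abc + T_bca) filled with {a=u, b=v, c=w} is T(u, v, w) + T(v, w, u)
lhs = T(u, v, w) + T(v, w, u)
rhs = answer(u, v, w)
# lhs == rhs == 141
```

By multilinearity, agreement on enough such inputs pins down the tensor, so the two sides agree as tensors.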