[Math] Why use Einstein Summation Notation

notation, summation

Einstein summation convention dictates that repeated indices should be summed. Thus the equation
$a_{ij} = b_{ik}c_{kj}$
is taken to mean
$a_{ij} = \sum_k b_{ik}c_{kj}$
where in both cases the range of summation is implicit.

Oftentimes when I have come across this notation, it is followed by the statement "where summation over index $k$ is implicit." This seems to defeat the point of Einstein notation (to reduce clutter in equations).

Other times, the summation is not as obvious as it may be above. For example, if asked to evaluate $F^{\mu \nu}F_{\mu \nu}$, one might think that the answer depends on the values of $\mu$ and $\nu$, but in fact summation over both is implied.

Given these ambiguities and the failure to reduce clutter (or rather, the trading of clutter in equations for clutter in text), why should one use Einstein notation?

Best Answer

What is Einstein's summation notation?

While Einstein may have taken it to be simply a convention to sum "any repeated indices", as Zev Chronocles alluded to in a comment, such a summation convention would not satisfy the "makes it impossible to write down anything that is not coordinate-independent" property that proponents of the convention often claim.

In modern geometric language, one should think of Einstein's summation convention as a very precise way to express the natural duality pairings/contractions when looking at a multilinear object.

More precisely: let $V$ be some vector space and $V^*$ its dual. There is a natural bilinear operation taking $v\in V$ and $\omega\in V^*$ to a scalar value $\omega(v)$; this could alternatively be denoted by $\omega\cdot v$ or $\langle \omega,v\rangle$. This duality pairing is also called contraction and is sometimes denoted by $\mathfrak{c}: V\otimes V^* \to \mathbb{R}$ (or into a different scalar field if your vector space is over some other field).
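In components, this pairing is exactly a repeated-index sum. As a sketch, assume $V$ is finite-dimensional with a basis $e_1,\dots,e_n$ and dual basis $e^1,\dots,e^n$ of $V^*$, so that $e^i(e_j)=\delta^i_j$ (these basis symbols are introduced here only for illustration). Writing $v = v^j e_j$ and $\omega = \omega_i e^i$, bilinearity gives

$$ \omega(v) = \omega_i v^j\, e^i(e_j) = \omega_i v^j\, \delta^i_j = \omega_i v^i, $$

where one upper and one lower index are summed over.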

Now, letting $\eta$ be an arbitrary element of $V^{p,q}:= (\otimes^p V)\otimes (\otimes^q V^*)$, as long as $p,q$ are both positive, we can take a contraction of any one factor of $V$ against any one factor of $V^*$. Each of these contractions gives a mapping $V^{p,q} \to V^{p-1,q-1}$, and it is tedious to name every one of them (you can index them by calling $\mathfrak{c}_{i,j}$ the contraction of the $i$th factor of $V$ with the $j$th factor of $V^*$).

The Einstein convention gets around this by being an index convention: $\eta$ is written as an indexed object $\eta^{i_1\cdots i_p}_{j_1\cdots j_q}$, where each index corresponds to one of the $V$ or $V^*$ factors. Then instead of writing $\mathfrak{c}_{i,j}$, we just single out the relevant pair of indices and trace over it. For example $$ \mathfrak{c}_{1,1}(\eta)^{i_1\cdots i_{p-1}}_{j_1 \cdots j_{q-1}} = \eta^{k i_1\cdots i_{p-1}}_{k j_1 \cdots j_{q-1}} $$ where the summation symbol over $k$ is suppressed. For a single contraction the advantage of this notation is not clear, but with multiple contractions it becomes apparent. In the index-free notation we have, for example,

$$ \mathfrak{c}_{1,1} \mathfrak{c}_{p,q} \eta = \mathfrak{c}_{p-1,q-1} \mathfrak{c}_{1,1} \eta $$

if $\eta \in V^{p,q}$. Basically, if you have multiple contractions on one expression, you will have to keep careful track of the level of contractions to put in the correct indices in the contraction symbol; in particular the symbols are not commutative. The same expression above in Einstein notation would only be

$$ \eta^{k i_1\cdots i_{p-2} \ell}_{k j_1\cdots j_{q-2} \ell} $$

and it is immediately clear which slots are contracted together. Furthermore, it is manifest that the "formulae" obtained thus are independent of the choice of basis of $V$ and $V^*$ (with respect to which we can write down the actual components of $\eta$).
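The bookkeeping in the identity above can be checked numerically. The following is a minimal sketch, assuming a tensor is stored as a dictionary from index tuples (upper indices first, then lower) to integers; the helper `contract` and the test tensor `eta` are hypothetical names introduced for illustration. It takes $p=q=3$ and compares both orders of $\mathfrak{c}$ with the direct double sum $\eta^{k i \ell}_{k j \ell}$.

```python
from itertools import product
import random

def contract(t, p, q, a, b, n):
    """Contract the a-th upper slot with the b-th lower slot (1-based) of a
    tensor with p upper and q lower indices, each index ranging over range(n).
    `t` maps index tuples (upper..., lower...) to numbers."""
    out = {}
    for idx in product(range(n), repeat=p + q - 2):
        up, low = list(idx[:p - 1]), list(idx[p - 1:])
        total = 0
        for k in range(n):
            # Re-insert the summed index k into the contracted slots.
            full_up = up[:a - 1] + [k] + up[a - 1:]
            full_low = low[:b - 1] + [k] + low[b - 1:]
            total += t[tuple(full_up + full_low)]
        out[idx] = total
    return out

n, p, q = 2, 3, 3
rng = random.Random(0)
eta = {idx: rng.randint(-5, 5) for idx in product(range(n), repeat=p + q)}

# c_{1,1} c_{p,q} eta: contract the last slots first, then the first slots.
lhs = contract(contract(eta, p, q, p, q, n), p - 1, q - 1, 1, 1, n)
# c_{p-1,q-1} c_{1,1} eta: first slots first; the last slots are now numbered p-1, q-1.
rhs = contract(contract(eta, p, q, 1, 1, n), p - 1, q - 1, p - 1, q - 1, n)
# Einstein form eta^{k i l}_{k j l}: the repeated letters say which slots pair up.
einstein = {
    (i, j): sum(eta[(k, i, l, k, j, l)] for k in range(n) for l in range(n))
    for i in range(n) for j in range(n)
}
print(lhs == rhs == einstein)
```

In the index form there are no slot numbers to renumber after the first contraction, which is the point being made.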

What is the correct use of Einstein's notation?

  1. Einstein's notation should only be used to denote contraction of one contravariant slot with one covariant slot. That's it. Don't sum over two covariant slots. Don't use triply-repeated indices. If you limit it to these kinds of contractions, you are using it to denote a "natural operation", and it will therefore never give you expressions that are coordinate-dependent/non-geometric.
  2. This is especially an issue in Lorentzian or other pseudo-Riemannian geometric set-ups, or in situations where you don't have a metric at all. The reason we can often get away with contracting a pair of covariant indices or a pair of contravariant indices in Riemannian geometry is that there is a natural isomorphism (given by the metric) between $V$ and $V^*$ in this situation. Furthermore, in the usual convention this isomorphism doesn't "change sign". In the situation without any metric there is no preferred isomorphism between $V$ and $V^*$, and so the bilinear map $V\otimes V\to \mathbb{R}$ would necessarily be coordinate dependent. In the Lorentzian case there can be sign issues if you are not careful.
  3. Einstein's summation convention takes advantage of the fact that the dual pairing $\omega(v)$ can be expressed as first taking the tensor product $\omega\otimes v$ then taking the contraction. So you should only use it when this procedure makes sense: don't use it to do elementwise division, for example.
  4. Einstein's summation convention should be used when there are no "coordinate dependent manipulations". In particular, if you ever find the need to speak of one particular component of a tensor when expressed in one particular coordinate system, then you should not use Einstein notation. Alternatively, you should find an invariant way of expressing that particular component (for example, by fixing a distinguished one-form or vector field and writing the component as the contraction of your tensor against that one-form or vector field).
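Points 1 and 2 can be illustrated with a small numerical sketch. It assumes a two-dimensional $V$ with no metric and a hypothetical change of basis whose matrix `A` has the new basis vectors as columns; all names and numbers below are chosen only for illustration. Contravariant components then transform with the inverse of `A` and covariant components with its transpose.

```python
def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def pair(a, b):
    """Sum of products of matching components."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical change of basis: columns of A are the new basis vectors.
A = [[1, 1], [0, 1]]
A_inv = [[1, -1], [0, 1]]  # inverse of A (det A = 1)

v = [3, 4]       # contravariant components v^i
w = [1, 0]       # contravariant components w^i
omega = [1, 2]   # covariant components omega_i

# Components with respect to the new basis.
v_new = matvec(A_inv, v)
w_new = matvec(A_inv, w)
omega_new = matvec(transpose(A), omega)

# One upper index against one lower index: omega_i v^i.
natural_old = pair(omega, v)
natural_new = pair(omega_new, v_new)

# Two upper indices summed "naively": v^i w^i.
naive_old = pair(v, w)
naive_new = pair(v_new, w_new)

print(natural_old, natural_new)  # 11 11
print(naive_old, naive_new)      # 3 -1
```

The upper-lower contraction gives the same number in both bases, while the sum over two upper indices does not, so the latter is not a statement about $v$ and $w$ alone.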

Alternatives

Einstein's summation notation is ultimately about pairings between $V$ and $V^*$, so (in spite of its likely origin) you should not think of it primarily as a notation used for decluttering computations of tensor components in local coordinates, but rather as a way to efficiently solve the problem of "which two slots are we contracting again?"

From this point of view the alternatives to Einstein's notation are "invariant notation" (don't use any index; write everything in coordinate free manner) and the "Penrose diagrammatic notation" (see e.g. https://en.wikipedia.org/wiki/Penrose_graphical_notation).
