[Math] Matrix transpose in tensor notation

Tags: convention, matrices, tensor-products, tensors

In this paper, at the end of chapter 2, the author says that in index notation a matrix is written as $A^\mu_{\;\;\nu}$ and its transpose as $A_\nu^{\;\;\mu}$.

$A^\mu_{\;\;\nu}$ looks like a (1,1)-tensor, but I thought that in a tensor the order of the indices does not matter.

  • How can this discrepancy be explained?

I am still trying to build intuition on tensors, so forgive me if the following questions are stupid.

  • If $x$ is a vector and $A$ the matrix of a linear transformation, how do I express $A^Tx$ in tensor notation?
  • The linear-algebra counterpart of a (1,1)-tensor is the matrix of a linear transformation, and that of a (0,2)-tensor is the matrix of a quadratic form. What is it for a (2,0)-tensor?

Best Answer

The short answer is that the order of the indices does matter, and that is because when you introduce a metric tensor you (or some people) are constantly raising and lowering indices.

A lot of authors say "transpose" when they really mean the adjoint. The adjoint of a map $A$ with respect to a metric $g$ is the linear transformation $A^{\text{Ad}}$ such that for any vectors $v$ and $w$ $$g(A(v),w) = g(v,A^{\text{Ad}}(w))$$

If you express $A^{\text{Ad}}$ by its components with respect to a basis, you can check that $${(A^{\text{Ad}})^{\mu}}_{\nu} = {A^{\alpha}}_{\beta}g^{\mu\beta}g_{\alpha\nu} =: {A_{\nu}}^{\mu}$$ where the $g_{\mu\nu}$ are the components of the metric tensor. That is why I don't particularly like the raising and lowering of indices: it hides the fact that there is a metric tensor involved, and it looks like you just interchanged the horizontal positions of the indices.
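To see where that formula comes from, plug basis vectors into the defining relation: taking $v = e_{\alpha}$ and $w = e_{\nu}$ (where $\{e_{\mu}\}$ is the chosen basis) gives $${A^{\beta}}_{\alpha}g_{\beta\nu} = {(A^{\text{Ad}})^{\beta}}_{\nu}g_{\alpha\beta},$$ and contracting both sides with $g^{\mu\alpha}$ (and relabelling the dummy indices) isolates ${(A^{\text{Ad}})^{\mu}}_{\nu}$ and yields exactly the expression above.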

Now, if you are working in an orthonormal basis, the components of the metric tensor are $\delta_{\mu\nu}$ (i.e., a Kronecker delta), and then you can calculate the adjoint of $A$ by simply interchanging the rows and columns of its matrix representative ${A^{\mu}}_{\nu}$. This operation of "flipping the matrix" came to be known as the transpose, but again, it only makes sense when you are using orthonormal coordinates.

The point here is that the concept you should be looking for is the adjoint of a linear transformation, and it only reduces to the "transpose" if you take its components with respect to an orthonormal frame.
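If a numerical check helps, here is a minimal NumPy sketch (the matrix and metric below are made up for illustration, not taken from the question). It builds the adjoint from the component formula above via `einsum`, verifies the defining relation $g(A(v),w) = g(v,A^{\text{Ad}}(w))$, and confirms that the adjoint only coincides with the plain transpose when the metric components form the identity:

```python
import numpy as np

rng = np.random.default_rng(0)

A = rng.standard_normal((3, 3))       # a made-up linear map
B = rng.standard_normal((3, 3))
g = B @ B.T + 3 * np.eye(3)           # a made-up non-orthonormal metric (SPD)
g_inv = np.linalg.inv(g)              # components g^{mu nu}

# Component formula: (A^Ad)^mu_nu = A^alpha_beta g^{mu beta} g_{alpha nu},
# which in matrix form is g^{-1} A^T g.
A_adj = np.einsum('ab,mb,an->mn', A, g_inv, g)

# The defining relation g(A v, w) = g(v, A^Ad w) holds for arbitrary vectors:
v, w = rng.standard_normal(3), rng.standard_normal(3)
print(np.isclose((A @ v) @ g @ w, v @ g @ (A_adj @ w)))      # True

# For this metric the adjoint is NOT the plain transpose...
print(np.allclose(A_adj, A.T))                               # False

# ...but with the identity metric (orthonormal basis) it is:
eye = np.eye(3)
print(np.allclose(np.einsum('ab,mb,an->mn', A, eye, eye), A.T))  # True
```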

For more details you can check my answer to this question, where I treat exactly this kind of issue.


Okay, by request of @MathAsFun, I will add an example.

Let $V$ be an $n$-dimensional vector space with metric $g\in T^{0,2}V$, and take a linear map $\phi:V\to V$. We now choose an orthonormal basis $\{e_{\mu}\}_{\mu\in I_{n}} \subseteq V$ (where $I_n$ stands for the set $\{1,\dots,n\}$).

We get the components $g_{\mu\nu}$ of the metric $g$ by applying it to the basis vectors pairwise (i.e., $g_{\mu\nu} := g(e_{\mu},e_{\nu})$), and since the basis is orthonormal, $$g_{\mu\nu} = \delta_{\mu\nu} := \begin{cases}1 & \mu = \nu \\ 0 & \mu \neq \nu \end{cases}$$

The components of $\phi^{\text{Ad}}$ are related to those of $\phi$ by $${(\phi^{\text{Ad}})^{\mu}}_{\nu} = {\phi^{\alpha}}_{\beta}g^{\mu\beta}g_{\alpha\nu}$$

Okaaaaaay. Now, for concreteness, let's say $n = 2$. So we proceed to calculate the components ${(\phi^{\text{Ad}})^{\mu}}_{\nu}$ $$\begin{align} {(\phi^{\text{Ad}})^{1}}_{1} &= {\phi^{\alpha}}_{\beta}g^{1\beta}g_{\alpha1}\\ &= {\phi^{1}}_{1}g^{11}g_{11}\\ &= {\phi^{1}}_{1} \end{align}$$

$$\begin{align} {(\phi^{\text{Ad}})^{1}}_{2} &= {\phi^{\alpha}}_{\beta}g^{1\beta}g_{\alpha2}\\ &= {\phi^{2}}_{1}g^{11}g_{22}\\ &= {\phi^{2}}_{1} \end{align}$$

$$\begin{align} {(\phi^{\text{Ad}})^{2}}_{1} &= {\phi^{\alpha}}_{\beta}g^{2\beta}g_{\alpha1}\\ &= {\phi^{1}}_{2}g^{22}g_{11}\\ &= {\phi^{1}}_{2} \end{align}$$

$$\begin{align} {(\phi^{\text{Ad}})^{2}}_{2} &= {\phi^{\alpha}}_{\beta}g^{2\beta}g_{\alpha2}\\ &= {\phi^{2}}_{2}g^{22}g_{22}\\ &= {\phi^{2}}_{2} \end{align}$$

If you write the components ${\phi^{\mu}}_{\nu}$ and ${(\phi^{\text{Ad}})^{\mu}}_{\nu}$ as matrices, you can see that one really is the transpose of the other. Again, as you can see from the calculations, this only holds in the case of an orthonormal basis. Otherwise, the summation would pick up non-zero off-diagonal components of the metric ($g_{12}$ and $g_{21}$), or diagonal components ($g_{11}$ and $g_{22}$) different from 1, and that would of course destroy this property.
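To make that last point concrete, here is a small follow-up sketch in the same hypothetical NumPy setup as above (the numbers are chosen purely for illustration), using a metric with non-zero off-diagonal components; the resulting adjoint visibly differs from the transpose:

```python
import numpy as np

phi = np.array([[1., 2.],
                [3., 4.]])
g = np.array([[2., 1.],        # g_12 = g_21 = 1: a non-orthonormal basis
              [1., 1.]])

# (phi^Ad)^mu_nu = phi^alpha_beta g^{mu beta} g_{alpha nu}, i.e. g^{-1} phi^T g
phi_adj = np.linalg.inv(g) @ phi.T @ g
print(phi_adj)    # [[-3. -2.] [11.  8.]]  -- not the transpose
print(phi.T)      # [[ 1.  3.] [ 2.  4.]]
```

The off-diagonal $g_{12}$ and $g_{21}$ enter the summation, so the "flip the matrix" shortcut fails, exactly as described above.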