Yes.
There is something very important (I would say it is the most important thing) to consider: if $V$ is a vector space, then there are two distinct types of objects that can be represented with a matrix.
- Bilinear forms, i.e., elements of $T^{0,2}V$ or $T^{2,0}V$. The first are maps ${V}^{\times 2} \to \mathbb{R}$, and the second are maps ${V^{*}}^{\times 2}\to\mathbb{R}$.
- Endomorphisms, i.e., elements of $T^{1,1}V$. These objects are maps $V\times V^{*}\to \mathbb{R}$. However, the important thing here is that this space is isomorphic to $\text{Hom}(V,V)$.
Well, having said this, it would be very reasonable to expect the concept of the "transpose of a matrix" to do different things on each of these objects. And that is indeed the case.
First, as someone mentioned in the comments, if $\phi\in T^{0,2}V$ (say) then we define its transpose as the operation of braiding its slots. This means that for all vectors $v,w\in V$:
$$\phi^{T}(v,w) = \phi(w,v)$$
Using abstract index notation, this can be written as:
$$(\phi^T)_{ab} := \phi_{ba}$$
The same can be done analogously for an element of $T^{2,0}V$. Note that this definition needs no additional structure: it is a canonical operation on any vector space.
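As a quick numerical check of the braiding definition (a sketch in plain Python; the matrix $M$ and the vectors are just examples, not taken from the answer):

```python
# Sketch: a bilinear form phi on R^2, stored as a matrix M with
# phi(v, w) = sum_{a,b} v[a] * M[a][b] * w[b].
# Braiding the slots, phi^T(v, w) := phi(w, v), corresponds to
# transposing M. All concrete values here are illustrative.

def phi(M, v, w):
    """Evaluate the bilinear form represented by matrix M on (v, w)."""
    return sum(v[a] * M[a][b] * w[b] for a in range(2) for b in range(2))

def transpose(M):
    return [[M[b][a] for b in range(2)] for a in range(2)]

M = [[1, 2], [3, 4]]          # components phi_{ab} in some basis
v, w = [1, -1], [2, 5]

# phi^T(v, w) = phi(w, v):
assert phi(transpose(M), v, w) == phi(M, w, v)
```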
Okay, that was easy. Now for $\phi\in T^{1,1}V$. In this case, there is no canonical identification available. However, if we introduce a metric $g$ on our vector space, we can define the adjoint of $\phi$ with respect to $g$ (denoted $\phi^{\text{Ad}_g}$) as the unique map such that for all vectors $v,w \in V$
$$g(v,\phi(w)) = g(\phi^{\text{Ad}_g}(v),w)$$
If you carry out the calculation in abstract index notation, you can see that this reduces to:
$${(\phi^{\text{Ad}_g})^{a}}_{b} = {\phi^{c}}_{d}g^{ad}g_{cb}$$
Now, it is noteworthy that a lot of confusion arises from the "raising and lowering of indices" notation (${\phi^{c}}_{d}g^{ad}g_{cb} = {\phi_{b}}^{a}$)
since the previous definition reduces to
$${(\phi^{\text{Ad}_g})^{a}}_{b} = {\phi_{b}}^{a}$$
This, obviously, is the justification for keeping the horizontal spacing of the indices: the order of the indices does matter.
However, this notation somewhat hides the fact that there is a metric involved, and it is for this reason that I am not very fond of it.
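In a basis, the defining relation $g(v,\phi(w)) = g(\phi^{\text{Ad}_g}(v),w)$ forces the matrix of the adjoint to be $G^{-1}A^{T}G$, where $G$ is the Gram matrix of $g$ and $A$ the matrix of $\phi$; this is the matrix version of the index formula above. A minimal sketch in plain Python (the metric and the matrix are illustrative choices):

```python
# Sketch (illustrative values): with G the Gram matrix of g and A the
# matrix of phi, the relation g(v, phi(w)) = g(phi^Ad(v), w) forces
#     A^Ad = G^{-1} A^T G.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(X, v):
    return [sum(X[i][k] * v[k] for k in range(2)) for i in range(2)]

def g(G, v, w):
    return sum(v[i] * G[i][j] * w[j] for i in range(2) for j in range(2))

G = [[2.0, 1.0], [1.0, 3.0]]                 # a non-orthonormal metric
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
Ginv = [[G[1][1] / det, -G[0][1] / det],
        [-G[1][0] / det, G[0][0] / det]]

A = [[1.0, 2.0], [3.0, 4.0]]                 # matrix of phi
At = [[A[j][i] for j in range(2)] for i in range(2)]
A_adj = matmul(Ginv, matmul(At, G))          # G^{-1} A^T G

v, w = [1.0, -2.0], [3.0, 0.5]
# defining relation of the adjoint holds:
assert abs(g(G, v, matvec(A, w)) - g(G, matvec(A_adj, v), w)) < 1e-9
# in a non-orthonormal basis the adjoint is NOT the plain transpose:
assert A_adj[0][1] != At[0][1]
```

Setting $G$ to the identity in this sketch collapses `A_adj` to the plain transpose, which is exactly the orthonormal-basis statement made below.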
Well, those are the abstract definitions.
If you choose an arbitrary basis for your vector space $V$ and write the components of a bilinear form, you can see that the abstract operation of braiding its entries corresponds to interchanging its rows and columns.
If you choose an orthonormal basis for your inner product space $V$ and write the components of an endomorphism, you can see that the abstract operation of taking the adjoint with respect to the metric corresponds to interchanging its rows and columns.
By using matrices, one does not see the differences between these two objects, and normally one cannot see the difference between the two concepts that give rise to the "transpose of a matrix" (namely, braiding vs adjointness).
This is a similar issue to that of the determinant: $\det(\phi)$ is only an invariant with respect to changes of basis if $\phi$ is an endomorphism. One can take the (quote) "determinant" (unquote) of a bilinear form by means of performing that very well known recursive algorithm on the entries of the representing matrix, but the resulting scalar depends on the choice of basis for the vector space.
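The determinant remark can also be checked numerically (a sketch in plain Python; the change-of-basis matrix $P$ and the array $A$ are illustrative): an endomorphism transforms as $P^{-1}AP$ and keeps its determinant, while a bilinear form transforms as $P^{T}AP$ and its "determinant" picks up a factor $\det(P)^2$.

```python
# Sketch (illustrative values): the SAME array A, read once as an
# endomorphism (transforms as P^{-1} A P) and once as a bilinear form
# (transforms as P^T A P), under a change of basis P.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

P = [[1.0, 1.0], [0.0, 2.0]]                 # change of basis, det(P) = 2
d = det2(P)
Pinv = [[P[1][1] / d, -P[0][1] / d], [-P[1][0] / d, P[0][0] / d]]
Pt = [[P[j][i] for j in range(2)] for i in range(2)]

A = [[1.0, 2.0], [3.0, 4.0]]

endo = matmul(Pinv, matmul(A, P))            # endomorphism reading
form = matmul(Pt, matmul(A, P))              # bilinear-form reading

assert abs(det2(endo) - det2(A)) < 1e-9               # invariant
assert abs(det2(form) - det2(P)**2 * det2(A)) < 1e-9  # basis-dependent
```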
The linked post comes from Physics.SE and, in physics, the distinction between indices which label the entry in a multi-dimensional array and abstract indices is not always made.
In the first case, we are only dealing with an equality between matrices, which happens to hold in any basis. This is possible because the two spaces $V^*\otimes W$ and $W\otimes V^*$ are canonically isomorphic.
To deal with this in abstract index, we take the convention that permutations of indices represent the corresponding braiding maps.
Given vector spaces $V_1, \ldots, V_n$ and a permutation $\sigma \in\mathfrak S_n$, there is a natural braiding map:
$$\tau_\sigma : V_1\otimes \ldots\otimes V_n \to V_{\sigma(1)}\otimes \ldots \otimes V_{\sigma(n)}$$
If $T \in V\otimes W$ and $R \in W\otimes V$, then $T_{ab} = R_{ba}$ means $T = \tau_{(12)}R$.
In our case, we have $A^T = \tau_{(12)}A$ with $\tau_{(12)}$ the braiding map $W\otimes V^*\to V^*\otimes W$.
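In components, the braiding $\tau_{(12)}$ is nothing but an axis swap of the representing array, which a short sketch in plain Python makes concrete (the array $R$ and the dimensions are illustrative):

```python
# Sketch (illustrative values): an element of W (x) V stored as a
# 2-index array R. The braiding tau_(12): W (x) V -> V (x) W swaps the
# two axes, so T = tau_(12)(R) means T[a][b] = R[b][a] in components.

def braid(R):
    n, m = len(R), len(R[0])
    return [[R[b][a] for b in range(n)] for a in range(m)]

R = [[1, 2, 3], [4, 5, 6]]    # dim W = 2, dim V = 3
T = braid(R)                  # element of V (x) W, a 3 x 2 array

assert all(T[a][b] == R[b][a] for a in range(3) for b in range(2))
```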
$ \def\a{\alpha}\def\b{\beta}\def\g{\gamma}\def\t{\theta} \def\l{\lambda}\def\s{\sigma}\def\e{\varepsilon} \def\o{{\tt1}}\def\p{\partial} \def\A{{\cal A}}\def\B{{\cal B}}\def\C{{\cal C}} \def\E{{\cal E}}\def\F{{\cal F}}\def\G{{\cal G}} \def\L{\left}\def\R{\right}\def\LR#1{\L(#1\R)} \def\trace#1{\operatorname{Tr}\LR{#1}} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} $Define the single and double contraction products between tensors as
$$\eqalign{
\F &= \A\cdot\B \quad&\implies\quad &\F_{ij\ell ps} &= \sum_{k=\o}^n \A_{ij\c{k}}\B_{\c{k}\ell ps} \\
\G &= \A:\B \quad&\implies\quad &\G_{i\ell ps} &= \sum_{j=\o}^m\sum_{k=\o}^n \A_{i\c{jk}}\B_{\c{jk}\ell ps} \\
}$$
Now consider a fourth-order tensor whose components (in terms of Kronecker delta symbols) are
$$\E_{ijk\ell} = \delta_{ik}\delta_{j\ell}$$
This tensor is the identity with respect to the double contraction product. Further, it can be used to rearrange ordinary matrix products, i.e.
$$\eqalign{
&A = \E:A = A:\E \\
&A\cdot B\cdot C = \LR{A\cdot\E\cdot C^T}:B \\
}$$
Applying these ideas to the product in question yields
$$\eqalign{
C &= A\cdot B \\
dC &= dA\cdot B = \LR{\E\cdot B^T}:dA \\
\grad{C}{A} &= \E\cdot B^T \\
}$$
In component form, this becomes
$$\grad{C_{ij}}{A_{k\ell}} = \sum_{p=\o}^n \E_{ijk\c{p}} B_{\c{p}\ell}^T = \sum_{p=\o}^n \delta_{ik}\delta_{j\c{p}} B_{\ell\c{p}} = \delta_{ik} B_{\ell{j}}$$
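Since $C = A\cdot B$ is linear in $A$, the component formula $\partial C_{ij}/\partial A_{k\ell} = \delta_{ik}B_{\ell j}$ can be verified exactly with one-entry perturbations. A minimal sketch in plain Python (the matrices are illustrative):

```python
# Sketch (illustrative values): because C = A.B is linear in A, the
# finite difference (C(A + h E_kl) - C(A)) / h is EXACTLY the gradient,
# and should equal delta_{ik} B_{lj}.

def matmul(X, Y):
    n, m, p = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = matmul(A, B)

h = 1.0
for k in range(2):
    for l in range(2):
        Ap = [row[:] for row in A]
        Ap[k][l] += h                          # perturb one entry of A
        Cp = matmul(Ap, B)
        for i in range(2):
            for j in range(2):
                grad = (Cp[i][j] - C[i][j]) / h    # exact: C linear in A
                assert grad == (B[l][j] if i == k else 0.0)
```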
Note that if you are working in a flat space (which is the case for most engineering and business uses of multidimensional arrays), there is no need to distinguish between covariant/contravariant components.
Therefore you can use a simplified notation wherein all indices are written as subscripts. And the Einstein convention applies to any repeated subscript, e.g. $$ A_{ij}B_{jk} \quad\implies\quad \sum_{j=\o}^n A_{i\c{j}}B_{\c{j}k} $$
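The all-subscripts convention is easy to mirror in code: a repeated subscript simply becomes a summation loop (a sketch in plain Python; the matrices are illustrative):

```python
# Sketch (illustrative values): the repeated subscript j in A_{ij}B_{jk}
# is summed over, which is ordinary matrix multiplication.

A = [[2, 0], [1, 3]]
B = [[1, 4], [2, 5]]
n = 2

C = [[sum(A[i][j] * B[j][k] for j in range(n))   # sum over repeated j
      for k in range(n)] for i in range(n)]
```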