First I'd like to share my understanding of abstract index notation. I think this understanding is simpler and more intuitive than Penrose and Rindler's original definition. Your question will be answered later with an example.
Abstract index notation is merely a labelling of the "slots" of the tensor. For example, $T_{ab}{}^c$ is just an abbreviation of
$$T(-_a, -_b, -^c).$$
Each "slot" is a parameter as the tensor is viewed as a multilinear map $T:V\times V \times V^* \to \mathbb R$.
You may already be familiar with this slot-labelling interpretation. But what exactly does "labelling" a slot mean? Here is my understanding: it means we can fill a specific slot with a vector (or dual vector) by specifying the label of the slot. For example, if we fill the slot labelled $a$ with a vector $u$, the slot labelled $b$ with a vector $v$, and the slot labelled $c$ with a dual vector $\omega$, we get $T(u, v, \omega)$; that is,
$$
T(-_a, -_b, -^c)(\{a=u, \; b=v, \; c=\omega\}) = T(u, v, \omega).
$$
Note that a set-like notation $\{a=u, \; b=v, \; c=\omega\}$ is used here, meaning that the order is irrelevant: $\{b=v, \; a=u, \; c=\omega\}$ and $\{a=u, \; b=v, \; c=\omega\}$ are the same.
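To make slot filling concrete, here is a minimal numerical sketch (assuming Python with numpy; the array and vectors are made up for illustration). In a fixed basis the slots become axes of an array, and filling every slot contracts each axis with the components of the corresponding argument:

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)

# Made-up components of a tensor T_ab^c in some fixed basis:
# axes 0 and 1 are the vector slots a and b, axis 2 is the dual-vector slot c.
T = rng.random((n, n, n))

u = rng.random(n)      # vector filling slot a
v = rng.random(n)      # vector filling slot b
omega = rng.random(n)  # dual vector filling slot c

# T(u, v, omega) = T_ab^c u^a v^b omega_c, summing over all three axes.
value = np.einsum('abc,a,b,c->', T, u, v, omega)
```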
There are two observations to make from this definition of filling slots.
Observation 1: the position order of the slots is significant:
$$S_{ab} \neq S_{ba}$$
in the sense that
$$
S(-_a, -_b)(\{a=u, \; b=v\}) = S(u, v) \neq S(v, u) = S(-_b, -_a)(\{a=u, \; b=v\}).
$$
Observation 2: an index letter can be substituted with any other Latin letter not already in use, since it is just a label for the slot. For example, $T_{ab} = S_{ab}$ implies $T_{cd} = S_{cd}$, because $T_{ab} = S_{ab}$ means that for any vectors $u$ and $v$,
$$
T(-_a, -_b)(\{a=u, \; b=v\}) = S(-_a, -_b)(\{a=u, \; b=v\}),
$$
that is
$$
T(u, v) = S(u, v).
$$
And $T_{cd}=S_{cd}$ is equivalent to $T(u, v) = S(u, v)$ too. Note that index substitution is different from the index reordering in observation 1: substitution must be applied to both sides of an equation. We can't exchange $a$ and $b$ on only one side of $S_{ab}=S_{ab}$ to get $S_{ab}=S_{ba}$.
Now we can use abstract index notation to denote the tensor product and contraction operations in a coordinate-free way. For example, $U_{abcd} = T_{ab}S_{cd}$ denotes the tensor product
$$
U(-_a, -_b, -_c, -_d) = T(-_a, -_b) \cdot S(-_c, -_d).
$$
And $T_{ae}{}^{ed}$ denotes the contraction of $T_{ab}{}^{cd}$ with respect to slots $b$ and $c$:
$$
T_{ae}{}^{ed} = C_b{}^c(T_{ab}{}^{cd}) =
\sum_{\sigma}T(-_a, \frac{\partial}{\partial x^\sigma}, \mathrm dx^\sigma, -^d).
$$
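In components, both operations are one `np.einsum` call away; here is a sketch (assuming Python with numpy, with made-up arrays standing in for the components in a fixed basis):

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)

# Tensor product U_abcd = T_ab S_cd: no repeated letter, just an outer product.
T2 = rng.random((n, n))   # components of T_ab
S2 = rng.random((n, n))   # components of S_cd
U = np.einsum('ab,cd->abcd', T2, S2)

# Contraction of T_ab^cd over slots b and c: repeating the letter e tells
# einsum to sum over basis-vector/dual-basis pairs, like the sum over sigma.
T4 = rng.random((n, n, n, n))             # components of T_ab^cd
T_contracted = np.einsum('aeed->ad', T4)  # components of T_ae^ed
```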
Another important operation is (partial) application, but since it is equivalent to a tensor product followed by a contraction, there is no need to introduce new notation. For example, applying a vector $u$ to slot $a$ of $T_{ab}$ gives
$$
T(u, -_b) = T_{ab}u^a,
$$
where $T_{ab}u^a$ is the tensor product of $T$ and $u$ followed by a contraction: $C_a{}^c(T(-_a, -_b)\cdot u(-^c))$. The result has one free slot left, so it is a $(0, 1)$ tensor. That is, a $(0, 2)$ tensor partially applied to a vector is a $(0, 1)$ tensor.
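A component-level sketch of partial application (again assuming Python with numpy, with made-up components):

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
T2 = rng.random((n, n))  # components of T_ab
u = rng.random(n)        # components of u^a

# T(u, -_b) = T_ab u^a: a tensor product followed by a contraction over a,
# leaving one free slot, i.e. the components of a (0, 1) tensor.
w = np.einsum('ab,a->b', T2, u)
```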
Example
Consider the following problem: suppose
$$T_{abc} = \mathrm dx^\sigma_a \mathrm dx^\mu_b \mathrm dx^\nu_c,$$
where $\sigma$, $\mu$, and $\nu$ are three concrete numbers. Then what is $T_{abc} + T_{bca}$?
This example more or less answers your question about the practical difference between abstract index notation and “ordinary” index notation: abstract index notation is easier to read and understand, especially when abstract and concrete indices are mixed.
In abstract index notation, there is a convention that abstract indices use Latin letters while concrete indices use Greek letters. So in this example we can easily see that $a$, $b$, and $c$ are abstract indices while $\sigma$, $\mu$, and $\nu$ are concrete indices.
The notation $\mathrm dx^\sigma_a$ is not very common, but it makes sense. The dual vector $\mathrm dx^\sigma$ is naturally a function that can act on a vector. The equation $T_{abc} = \mathrm dx^\sigma_a \mathrm dx^\mu_b \mathrm dx^\nu_c$, which is just an abbreviation of
$$
T(-_a, -_b, -_c) = \mathrm dx^\sigma(-_a) \cdot \mathrm dx^\mu(-_b) \cdot \mathrm dx^\nu(-_c)
$$
means that when slot $a$ is filled with a vector $u$, $\mathrm dx^\sigma$ acts on that $u$.
To solve the problem, use index substitution to get $T_{bca} = \mathrm dx^\sigma_b \mathrm dx^\mu_c \mathrm dx^\nu_a$. So
$$
\begin{aligned}
T_{abc} + T_{bca} &= \mathrm dx^\sigma_a \mathrm dx^\mu_b \mathrm dx^\nu_c +
\mathrm dx^\sigma_b \mathrm dx^\mu_c \mathrm dx^\nu_a \\
&= \mathrm dx^\sigma_a \mathrm dx^\mu_b \mathrm dx^\nu_c +
\mathrm dx^\nu_a \mathrm dx^\sigma_b \mathrm dx^\mu_c \\
&= (\mathrm dx^\sigma \otimes \mathrm dx^\mu \otimes \mathrm dx^\nu + \mathrm dx^\nu \otimes \mathrm dx^\sigma \otimes \mathrm dx^\mu)(-_a, -_b, -_c).
\end{aligned}
$$
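As a sanity check, this identity can be verified numerically (a sketch assuming Python with numpy; in their own coordinate basis the covectors $\mathrm dx^k$ have one-hot components, i.e. the rows of the identity matrix):

```python
import numpy as np

n = 3
sigma, mu, nu = 0, 1, 2  # three concrete index values, chosen arbitrarily
dx = np.eye(n)           # dx[k] holds the components of the covector dx^k

# T_abc = dx^sigma_a dx^mu_b dx^nu_c as an outer product of covectors.
T = np.einsum('a,b,c->abc', dx[sigma], dx[mu], dx[nu])

# T_bca relabels the slots: its (a, b, c) component is T[b, c, a].
T_bca = np.einsum('bca->abc', T)

expected = (np.einsum('a,b,c->abc', dx[sigma], dx[mu], dx[nu])
            + np.einsum('a,b,c->abc', dx[nu], dx[sigma], dx[mu]))
assert np.allclose(T + T_bca, expected)
```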
The abstract-index calculation is quite straightforward. On the other hand, if you use concrete index notation to solve this problem, you first need to figure out that the components of $T$ are all zero except
$$T_{\xi\eta\zeta} = 1 \quad \text{when} \quad \xi=\sigma,\ \eta=\mu,\ \zeta=\nu.$$
Or simply $T_{\sigma\mu\nu}=1$. But what is $T_{bca}$? Is it $T_{\mu\nu\sigma}$? No. You need to define another tensor $S_{\xi\eta\zeta}=T_{\eta\zeta\xi}$ and figure out that its components are all zero except
$$
S_{\xi\eta\zeta} = 1 \quad \text{when} \quad \xi=\nu,\ \eta=\sigma,\ \zeta=\mu.
$$
Then finally find the sum $T_{\xi\eta\zeta} + S_{\xi\eta\zeta}$. This procedure is quite complex and error-prone.
If you'd like to translate the equation
$$
T_{abc} = \mathrm dx^\sigma_a \mathrm dx^\mu_b \mathrm dx^\nu_c
$$
into concrete index notation,
$$T_{\xi\eta\zeta} = \mathrm dx^\sigma_\xi \mathrm dx^\mu_\eta \mathrm dx^\nu_\zeta,
$$
it doesn't help much. Now $\mathrm dx^\sigma_\xi$ is a tensor whose components are all zero except
$$
\mathrm dx^\sigma_\xi = 1 \quad \text{when} \quad \xi=\sigma.
$$
You still need to worry about components, which is not natural. And six indices are mixed together, three of them fixed numbers, which is very confusing.
This is a terrific question. I will try to answer all of your concerns. In general, you should be aware of a few things. First, when we say things like $g_{ij}$, $i$ and $j$ are just placeholders for integers. So it doesn't matter what letters are in the subscript, they mean the same thing. Second, not all tensors can be represented as matrices. Only tensors with two indices have a matrix representation.
The metric tensor does define a geodesic distance, but this is not its only purpose, or even its primary purpose. The main job of the metric tensor is to encode how infinitesimal changes in the coordinates translate into distances at each point of the manifold. In some manifolds, like Euclidean space or a cylinder, each point is essentially the same, so the metric tensor is constant; but in most manifolds the metric tensor is a function (called a tensor field) on the manifold, whose components depend on the coordinates.
The example you gave of the metric tensor in Euclidean space doesn't seem quite right. Specifically, I notice that the indices $k$ and $l$ appear only once in the equation (in a tensor equation, a free index should appear on both sides, and a summed index should appear exactly twice). Your equation does look like the formula for converting the metric tensor from one coordinate system to another, which is $$g_{ij}=\frac{\partial x^k}{\partial x^i}\frac{\partial x^l}{\partial x^j}g_{kl}.$$ Here, we have a coordinate system for a 2-manifold characterized by coordinates $x^k$ and $x^l$ with their corresponding metric tensor $g_{kl}$. This formula shows how to convert to the metric tensor for the coordinate system characterized by coordinates $x^i$ and $x^j$. This is like changing from Cartesian to spherical or cylindrical coordinates in $\mathbb R^3$.
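As an illustration of this transformation formula, here is a short sketch (assuming Python with sympy) that recovers the polar-coordinate metric of the Euclidean plane from the Cartesian one. In matrix form the formula reads $J^T g J$, where $J$ is the Jacobian of the old coordinates with respect to the new ones:

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)

# Cartesian coordinates expressed in terms of the new polar coordinates.
x = r * sp.cos(theta)
y = r * sp.sin(theta)

g_cart = sp.eye(2)                          # Euclidean metric in (x, y)
J = sp.Matrix([x, y]).jacobian([r, theta])  # J[k, i] = d x^k / d x'^i

# g'_{ij} = (dx^k/dx'^i)(dx^l/dx'^j) g_{kl}, i.e. J^T g J in matrix form.
g_polar = sp.simplify(J.T * g_cart * J)
print(g_polar)  # Matrix([[1, 0], [0, r**2]])
```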
Your real question in your second paragraph is how to represent a metric tensor as a matrix. This is quite easy. If your manifold has $n$ dimensions (and thus $n$ coordinates), the metric tensor can be represented by an $n\times n$ matrix whose element in the $i^{\text{th}}$ row and $j^{\text{th}}$ column is $g_{ij}$. In $\mathbb R^2$, the matrix representation is $$g_{ij}\doteq\left(\begin{array}{cc} 1 & 0\\ 0 & 1\\ \end{array}\right).$$

You may wonder where this came from. The simplest way to explicitly determine the elements of the metric tensor, and a favorite method in general relativity, is to think about the line element: the infinitesimal distance along a path with respect to the coordinate system. In Euclidean space this is easy because we have the Pythagorean theorem: $$ds^2=dx^2+dy^2.$$ The line element (called $ds^2$; think of the square as part of the symbol) is the square of the change in $x$ plus the square of the change in $y$. In general, the line element for a 2-manifold looks like $$ds^2=g_{11}dx^2+2g_{12}\,dx\,dy+g_{22}dy^2.$$ (The metric tensor is always symmetric, so $g_{12}=g_{21}$, and the two cross terms $g_{12}\,dx\,dy$ and $g_{21}\,dy\,dx$ combine into the single term $2g_{12}\,dx\,dy$.)

The terms that involve change along more than one coordinate are called off-diagonal terms because they correspond to off-diagonal elements in the matrix representation. Notice that in Euclidean space there are no off-diagonal terms, so the corresponding matrix is diagonal. Since $dx^2$ and $dy^2$ both have coefficient $1$ in the Euclidean line element, there are ones on the diagonal.
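To see how the matrix is read off from a line element in practice, here is a small sketch (assuming Python with sympy; the line element is made up for illustration). The only subtlety is the factor of $2$ on the cross term:

```python
import sympy as sp

dx, dy = sp.symbols('dx dy')

# A made-up line element on some 2-manifold.
ds2 = dx**2 + 4*dx*dy + 3*dy**2

g11 = ds2.coeff(dx, 2)
g22 = ds2.coeff(dy, 2)
# The cross term is g_12 dx dy + g_21 dy dx = 2 g_12 dx dy, so halve it.
g12 = sp.Rational(1, 2) * ds2.coeff(dx, 1).coeff(dy, 1)

g = sp.Matrix([[g11, g12], [g12, g22]])
print(g)  # Matrix([[1, 2], [2, 3]])
```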
First of all, if a tensor has more than two indices, then it cannot be represented by a matrix. Of course, we could represent it as a "higher-dimensional box of numbers," but then writing things down on a two-dimensional piece of paper gets tricky, which is why we have things like Einstein notation. Nevertheless, we shall press on. To do so, we must think of matrices as linear operators. A matrix equation like $Ax=b$ must be read as "$A$ acts on $x$ to give $b$." In this way, the matrix acts on one vector and returns another vector. But where do these vectors come from? They come from the tangent space at a point in the manifold. Whenever we have a tangent space, there is a cotangent space (or dual space) to go with it. A tensor like $A^{ij}$ has a matrix representation which acts on a covector to give a vector. A mixed-variance tensor like $R^i{}_j$ acts on a vector to give a vector, or on a covector to give a covector. As for the numbers of rows and columns, they should always be $n$, the dimension of the manifold (and $n=4$ in general relativity).
The inverse metric is represented quite literally as the inverse of the matrix representing the metric. As for the canceling, this is the same as the reduction in the number of indices when a row vector is multiplied by a matrix. The elements of an object like $a_{ijkl}$ are multiplied by elements of the inverse metric and added together to get something called $a_{ik}$ (for example, $a_{ik} = g^{jl}a_{ijkl}$). Again, $R_{ijkl}$ is not represented by a matrix.
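In components this canceling is just a finite sum, which `np.einsum` spells out directly; a sketch with made-up arrays (assuming Python with numpy):

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)

# A made-up symmetric, invertible matrix standing in for the metric g_ij.
A = rng.random((n, n))
g = A @ A.T + n * np.eye(n)
g_inv = np.linalg.inv(g)  # the inverse metric g^{ij}: literally the inverse matrix

# a_ik = g^{jl} a_ijkl: the indices j and l are summed away ("canceled").
a4 = rng.random((n, n, n, n))
a2 = np.einsum('jl,ijkl->ik', g_inv, a4)
```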
The Christoffel symbol is not a tensor (notice it is not called the Christoffel tensor), but it can still be represented by a "3D box of numbers." The matrices $g_{jl}$, $g_{il}$, and $g_{ij}$ are all the same matrix; when we assign specific values to $i$, $j$, and $l$, these symbols just reference different elements of it. Each element $g_{jl}$ is a function of the coordinates $x^i$. The derivative of $g_{jl}$ with respect to $x^i$ is the ordinary partial derivative of the $jl$ element of the metric tensor with respect to the $i^{\text{th}}$ coordinate.
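As a sketch of how such a "3D box of numbers" is computed in practice, the following (assuming Python with sympy) evaluates the standard formula $\Gamma^k{}_{ij}=\tfrac{1}{2}g^{kl}(\partial_i g_{jl}+\partial_j g_{il}-\partial_l g_{ij})$ for the polar-coordinate metric on the plane:

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)
coords = [r, theta]
g = sp.Matrix([[1, 0], [0, r**2]])  # polar-coordinate metric on the plane
g_inv = g.inv()
n = len(coords)

# Gamma[k][i][j] = (1/2) g^{kl} (d_i g_jl + d_j g_il - d_l g_ij):
# one n x n matrix of functions for each value of the upper index k.
Gamma = [[[sp.simplify(sp.Rational(1, 2) * sum(
              g_inv[k, l] * (sp.diff(g[j, l], coords[i])
                             + sp.diff(g[i, l], coords[j])
                             - sp.diff(g[i, j], coords[l]))
              for l in range(n)))
           for j in range(n)]
          for i in range(n)]
         for k in range(n)]

print(Gamma[0][1][1])  # Gamma^r_{theta theta} = -r
print(Gamma[1][0][1])  # Gamma^theta_{r theta} = 1/r
```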
Don't get too hung up on numbers and matrices. Relativists realized a long time ago that this way of thinking is not helpful. Things like Einstein notation are helpful for simplifying calculations. You ask about how elements are affected when indices are changed or cancelled. Just write down the tensor equation in Einstein notation with the appropriate sums and see what happens. It may concern you now, but soon you will not worry so much. Einstein notation is very reliable and won't lead you astray.
I hesitate to give examples in terms of numbers because such calculations are usually very long. In my first relativity class, we were given a metric tensor, and asked to calculate the Ricci tensor, so we first had to calculate Christoffels, then Riemann, then Ricci. I think I used six sheets of paper front and back. But I just did the calculations the way the Einstein notation directs, and it turned out alright. My professor said that he gave us that assignment so we would appreciate the hard work of early relativists. After that, he allowed us to use a computer algebra system.
Edit:
One more thing. If you tried to do the calculations using matrices and "higher dimensional boxes," you would essentially be doing Einstein notation anyway.
Best Answer
Yes.
There is something very important (I would say it is the most important thing) to consider: if $V$ is a vector space, then there are two distinct types of objects that can be represented with a matrix: bilinear forms (elements of $T^{0,2}V$ or $T^{2,0}V$) and endomorphisms of $V$ (elements of $T^{1,1}V$).
Well, having said this, it would be very reasonable to expect that the concept of the "transpose of a matrix" does different things to each of these objects. And that is indeed the case.
First, as someone mentioned in the comments, if $\phi\in T^{0,2}V$ (say) then we define its transpose as the operation of braiding its slots. This means that for all vectors $v,w\in V$: $$\phi^{T}(v,w) = \phi(w,v)$$ Using abstract index notation, this can be written as: $$(\phi^T)_{ab} := \phi_{ba}$$
The same can be done analogously with an element of $T^{2,0}V$. Note that this definition does not need any additional structure; it is a canonical operation on any vector space.
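A quick numerical illustration (assuming Python with numpy, with made-up components): braiding the slots of a bilinear form is literally transposing its component array.

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
phi = rng.random((n, n))  # made-up components of a bilinear form phi
v, w = rng.random(n), rng.random(n)

# phi^T(v, w) = phi(w, v): braiding the slots swaps the two array axes.
phi_T = phi.T
assert np.isclose(np.einsum('ab,a,b->', phi_T, v, w),
                  np.einsum('ab,a,b->', phi, w, v))
```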
Okay, that was easy. Now for $\phi\in T^{1,1}V$. In this case, there is no canonical identification available. However, if we introduce a metric $g$ in our vector space, we can define the adjoint of $\phi$ with respect to $g$ (denoted $\phi^{\text{Ad}_g}$) as the unique map such that for all vectors $v,w \in V$ $$g(v,\phi(w)) = g(\phi^{\text{Ad}_g}(v),w)$$ If you make the calculation in abstract index notation, you can see this reduces to: $${(\phi^{\text{Ad}_g})^{a}}_{b} = {\phi^{c}}_{d}g^{ad}g_{cb}$$
Now, it is noteworthy that a lot of confusion arises from the "raising and lowering of indices" notation (${\phi^{c}}_{d}g^{ad}g_{cb} = {\phi_{b}}^{a}$), since the previous definition reduces to $${(\phi^{\text{Ad}_g})^{a}}_{b} = {\phi_{b}}^{a}.$$ This, obviously, is the justification for keeping the horizontal spacing of the indices: the order of the indices does matter. However, this notation somewhat hides the fact that there is a metric involved, and for this reason I don't like it much.
Well, those are the abstract definitions.
If you choose an arbitrary basis for your vector space $V$ and write out the components of a bilinear form, you can see that the abstract operation of braiding its slots corresponds to interchanging the rows and columns of its matrix.
If you choose an orthonormal basis for your inner product space $V$ and write out the components of an endomorphism, you can see that the abstract operation of taking the adjoint with respect to the metric corresponds to interchanging the rows and columns of its matrix.
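Both claims are easy to check numerically; here is a sketch (assuming Python with numpy, with made-up components). In matrix form the adjoint is $g^{-1}\phi^T g$, which collapses to the plain transpose when $g$ is the identity, i.e. in an orthonormal basis:

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)

# A made-up symmetric positive-definite metric g and an endomorphism phi.
A = rng.random((n, n))
g = A @ A.T + n * np.eye(n)
g_inv = np.linalg.inv(g)
phi = rng.random((n, n))

# (phi^Ad_g)^a_b = phi^c_d g^{ad} g_{cb}, i.e. g^{-1} phi^T g in matrix form.
adj = g_inv @ phi.T @ g

# Defining property: g(v, phi(w)) = g(adj(v), w) for all v, w.
v, w = rng.random(n), rng.random(n)
assert np.isclose(v @ g @ (phi @ w), (adj @ v) @ g @ w)
# With g = I (an orthonormal basis), adj is just phi.T.
```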
By using matrices alone, one does not see the difference between these two kinds of objects, and consequently one cannot see the difference between the two concepts that give rise to the "transpose of a matrix" (namely, braiding vs. adjointness).
This is a similar issue to that of the determinant: $\det(\phi)$ is only an invariant with respect to changes of basis if $\phi$ is an endomorphism. One can take the (quote) "determinant" (unquote) of a bilinear form by means of performing that very well known recursive algorithm on the entries of the representing matrix, but the resulting scalar depends on the choice of basis for the vector space.
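A small numerical illustration of this point (assuming Python with numpy, with made-up matrices): under a change of basis $P$, an endomorphism transforms as $P^{-1}\phi P$ and its determinant is unchanged, while a bilinear form transforms as $P^T\phi P$ and its determinant picks up a factor $\det(P)^2$.

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
P = rng.random((n, n)) + n * np.eye(n)  # an invertible change-of-basis matrix
phi = rng.random((n, n))

# Endomorphism: the determinant is invariant under conjugation.
assert np.isclose(np.linalg.det(np.linalg.inv(P) @ phi @ P),
                  np.linalg.det(phi))

# Bilinear form: the "determinant" depends on the basis via det(P)^2.
assert np.isclose(np.linalg.det(P.T @ phi @ P),
                  np.linalg.det(P)**2 * np.linalg.det(phi))
```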