When a thing is parallel transported, doesn't it remain in a sense unchanged?
Intuitively the equation
$$
t^a \nabla _a v^b =0
$$
means that vector $v$ is constant (the derivative is zero!) with respect to differentiation along the velocity vector of curve $C$. This velocity vector
$$
t=\dot{C}
$$
provides a natural way of expressing the direction in which vector $v$ is being transported.
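To make this concrete, here is a minimal numerical sketch. It is not taken from Wald, and the setup is my own assumption: the unit sphere with coordinates $(\theta,\phi)$, the latitude circle $\theta=\theta_0$, $\phi=s$ as the curve $C$, and a hand-written RK4 integrator. The function names are hypothetical. It integrates $t^a\nabla_a v^b=0$ once around the circle and shows in what sense $v$ stays "unchanged": its length is preserved, although its coordinate components are not.

```python
import math

def transport_along_latitude(theta0, v0, steps=2000):
    """Integrate t^a nabla_a v^b = 0 along theta = theta0, phi = s, s in [0, 2*pi].

    Assumes the round metric on the unit sphere, whose non-zero Christoffel
    symbols are Gamma^theta_{phi phi} = -sin(theta) cos(theta) and
    Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta).
    v0 holds the coordinate components (v^theta, v^phi).
    """
    sin_cos = math.sin(theta0) * math.cos(theta0)  # equals -Gamma^theta_{phi phi}
    cot = math.cos(theta0) / math.sin(theta0)      # equals  Gamma^phi_{phi theta}

    def rhs(v):
        # dv^b/ds = -Gamma^b_{ac} t^a v^c with tangent t = (0, 1)
        v_theta, v_phi = v
        return (sin_cos * v_phi, -cot * v_theta)

    def shifted(v, k, scale):
        return (v[0] + scale * k[0], v[1] + scale * k[1])

    h = 2.0 * math.pi / steps
    v = (float(v0[0]), float(v0[1]))
    for _ in range(steps):
        # classical fourth-order Runge-Kutta step
        k1 = rhs(v)
        k2 = rhs(shifted(v, k1, 0.5 * h))
        k3 = rhs(shifted(v, k2, 0.5 * h))
        k4 = rhs(shifted(v, k3, h))
        v = (v[0] + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             v[1] + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return v

def norm_squared(theta0, v):
    """g_ab v^a v^b for the metric diag(1, sin^2 theta) at theta = theta0."""
    return v[0] ** 2 + math.sin(theta0) ** 2 * v[1] ** 2

theta0 = 1.0
v_start = (1.0, 0.0)
v_end = transport_along_latitude(theta0, v_start)
print("components:", v_start, "->", v_end)
print("squared length:", norm_squared(theta0, v_start), "->", norm_squared(theta0, v_end))
```

For this particular curve the transported vector comes back rotated by the angle $2\pi\cos\theta_0$ relative to where it started, so "unchanged" has to be read as "covariantly constant along $C$", not as "same components".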
Which vector $v$ is meant in your last paragraph? It looks to me as though there is some confusion. Wald's definition can be written in terms of derivatives along vectors as well; only the notation is slightly different. Namely,
$$
\nabla_{t} v^b:= t^a \nabla_ a v^b
$$
Notice that Wald uses the so-called abstract index notation, while the other sources you have in mind may use "invariant" or "coordinate" ways of writing down the same things.
In what sense is the connection enabling one to compare the vector field at two different points on the manifold [...], when the mapping is from the (Cartesian product of) the set of tangent vector fields to itself? I thought that the connection ∇ "connected" two neighbouring tangent spaces through the notion of parallel transport [...]
To see a connection only as a mapping $\nabla: \mathcal{X}(M)\times\mathcal{X}(M)\rightarrow\mathcal{X}(M)$ is too restrictive. Often a connection is also seen as a map $Y\mapsto\nabla Y\in\Gamma(TM\otimes T^*M)$, which highlights the derivative aspect. However, the important point is that $\nabla$ is $C^\infty(M)$-linear in the first argument, which implies that the value $\nabla_X Y|_p$ depends on $X$ only through $X_p$, in the sense that
$$
X_p=Z_p \Rightarrow \nabla_X Y|_p = \nabla_Z Y|_p.
$$
Hence, for every $v\in TM_p$, $\nabla_vY$ is well-defined. This leads directly to the definition of parallel vector fields and parallel transport (as I think you already know).
Vice versa, given parallel transport maps $\Gamma(\gamma)^t_s: TM_{\gamma(s)}\rightarrow TM_{\gamma(t)}$, one can recover the connection via
$$
\nabla_X Y|_p = \frac{d}{dt}\bigg|_{t=0}\Gamma(\gamma)_t^0Y_{\gamma(t)} \quad(\gamma \text{ is an integral curve of }X\text{ with }\gamma(0)=p).
$$
This is exactly the generalisation of directional derivatives in the sense that we vary $Y$ in direction of $X_p$ in a parallel manner.
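As a sanity check of this formula, here is a small sketch under my own assumptions (unit sphere with the round metric, $X=\partial_\phi$, $Y=\partial_\theta$, hypothetical function names). The integral curve of $X$ through $p=(\theta_0,\phi_0)$ is $\gamma(t)=(\theta_0,\phi_0+t)$; the code transports $Y_{\gamma(t)}$ back to $p$ by integrating the parallel transport equation and then takes a central difference in $t$. The result should be close to $\cot\theta_0\,\partial_\phi$.

```python
import math

def transport(theta0, v, delta_s, steps=20):
    """Parallel transport coordinate components (v^theta, v^phi) along the latitude
    theta = theta0 over a signed parameter length delta_s (RK4).

    Assumes the round unit sphere: Gamma^theta_{phi phi} = -sin cos,
    Gamma^phi_{phi theta} = cot.
    """
    sin_cos = math.sin(theta0) * math.cos(theta0)
    cot = math.cos(theta0) / math.sin(theta0)

    def rhs(w):
        return (sin_cos * w[1], -cot * w[0])

    def shifted(w, k, scale):
        return (w[0] + scale * k[0], w[1] + scale * k[1])

    h = delta_s / steps
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs(shifted(v, k1, 0.5 * h))
        k3 = rhs(shifted(v, k2, 0.5 * h))
        k4 = rhs(shifted(v, k3, h))
        v = (v[0] + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             v[1] + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return v

def covariant_derivative_dphi_dtheta(theta0, eps=1e-4):
    """Central difference of t -> Gamma(gamma)_t^0 Y_{gamma(t)} at t = 0."""
    y = (1.0, 0.0)  # coordinate components of Y = d/dtheta, the same at every point
    from_plus = transport(theta0, y, -eps)   # from gamma(+eps) back to gamma(0)
    from_minus = transport(theta0, y, +eps)  # from gamma(-eps) forward to gamma(0)
    return tuple((a - b) / (2.0 * eps) for a, b in zip(from_plus, from_minus))

theta0 = 1.0
print("via parallel transport:", covariant_derivative_dphi_dtheta(theta0))
print("expected (0, cot theta0):", (0.0, math.cos(theta0) / math.sin(theta0)))
```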
In Euclidean space this indeed reduces to the directional derivative: using the identity chart, every vector field can be written as $Y_p=(p,V(p))$ for some $V:\mathbb R^n\rightarrow \mathbb R^n$, and parallel transport is simply given by
$$
\Gamma(\gamma)_s^t (\gamma(s),v)=(\gamma(t),v).
$$
Hence, we find in Euclidean space:
$$
\frac{d}{dt}\bigg|_{t=0}\Gamma(\gamma)_t^0Y_{\gamma(t)} = \frac{d}{dt}\bigg|_{t=0}(p,V(\gamma(t))) = (p,DV(p)\cdot\gamma'(0)),
$$
which is exactly the directional derivative of $V$ in direction $v=\gamma'(0)$.
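The Euclidean case is easy to check numerically. The following sketch uses a made-up field $V$ on $\mathbb R^2$ and the straight line $\gamma(t)=p+tv$ (the integral curve of the constant field $X=v$); both choices and all function names are assumptions for illustration only. Since Euclidean parallel transport leaves the vector part untouched, differentiating the transported field amounts to differentiating $V(\gamma(t))$.

```python
import math

def V(x):
    """Hypothetical component function of the vector field Y_p = (p, V(p))."""
    return (x[0] ** 2 * x[1], math.sin(x[0]) + x[1])

def jacobian_V(x):
    """Exact Jacobian DV of the field above, written out by hand."""
    return ((2.0 * x[0] * x[1], x[0] ** 2),
            (math.cos(x[0]), 1.0))

def derivative_via_transport(p, v, h=1e-5):
    """Central difference of t -> Gamma(gamma)_t^0 Y_{gamma(t)} at t = 0."""
    def transported(t):
        q = (p[0] + t * v[0], p[1] + t * v[1])  # gamma(t) = p + t v
        return V(q)  # transporting back to p keeps the vector part as it is
    plus, minus = transported(h), transported(-h)
    return tuple((a - b) / (2.0 * h) for a, b in zip(plus, minus))

def directional_derivative(p, v):
    """DV(p) . v computed from the exact Jacobian."""
    J = jacobian_V(p)
    return tuple(J[i][0] * v[0] + J[i][1] * v[1] for i in range(2))

p, v = (1.0, 2.0), (0.5, -1.0)
print("via transport:  ", derivative_via_transport(p, v))
print("DV(p) . v:      ", directional_derivative(p, v))
```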
Back to the original question: I think it is hard to see how a connection "connects neighbouring tangent spaces" from the axioms alone. Keep in mind, however, that the contemporary formalism has passed through many layers of abstraction since its beginnings and has been reduced to its core, the axioms (for a survey see also Wikipedia). To get the whole picture, it is essential to explore the possible interpretations and consequences of the definition, since these are often what led to the definition in the first place. In my opinion, the connection is defined the way it is with the picture in mind that it is an infinitesimal version of parallel transport. From this starting point, properties such as the Leibniz rule are consequences. Conversely, having a differential operator $\nabla$ that satisfies linearity, the Leibniz rule and so on is fully equivalent to having parallel transport in the first place. In modern mathematics, these properties are thus taken as the defining axioms of a connection, mainly because they are easier to handle and easier to generalise to arbitrary vector bundles.
Given this, what does the quantity $\nabla_{e_\mu}e_\nu=\Gamma^\lambda_{\mu\nu}e_\lambda$ represent? [...]
As you wrote, the connection coefficients / Christoffel symbols $\Gamma^\lambda_{\mu\nu}$ are the components of the connection in a local frame and are needed for explicit computations. I think at this level you can't get much meaning out of these coefficients. However, they reappear in a nicer way if you restate everything in the Cartan formalism and study Cartan and/or principal connections. The Wikipedia article on connection forms tries to give an introduction to this approach.
Nakahara also gives an introduction to connections on principal bundles and their relation to gauge theory later on in his book. In my opinion, this chapter is a bit short and could be more detailed, especially towards the end. But it is a good start.
Best Answer
First, let's make sure we understand what a connection is. Let $M$ be a smooth manifold, let $\mathscr{O}(M)$ be its ring of smooth functions (scalar fields), and let $TM$ be its tangent bundle. Let $\Gamma(TM)$ denote the space of vector fields on $M$ (i.e. the $\mathscr{O}(M)$-module of smooth sections of $TM$). A connection on $TM$ is a smooth map $\nabla : \Gamma(TM) \times \Gamma(TM) \to \Gamma(TM)$ satisfying the following properties:
$\nabla$ is $\mathscr{O}(M)$-linear in the first argument: so for vector fields $X, Y, Z$ and smooth functions $f, g$, $$\nabla(f X + g Y, Z) = f \nabla(X, Z) + g \nabla(Y, Z)$$
$\nabla$ is $\mathbb{R}$-linear in the second argument, where (by abuse of notation) $\mathbb{R}$ is the subalgebra of constant functions in $\mathscr{O}(M)$; that is, for any constants $a, b$ and vector fields $X$, $Y$ and $Z$, $$\nabla(X, a Y + b Z) = a \nabla(X, Y) + b \nabla(X, Z)$$
$\nabla$ obeys the Leibniz rule for the second argument, in the sense that for vector fields $X$ and $Y$ and a smooth function $f$, $$\nabla(X, f Y) = f \nabla(X, Y) + \nabla(X, f) Y$$ where $\nabla(X, f)$ denotes the action of $X$ (as a differential operator) on $f$. (Recall that tangent vectors can be defined as derivations, i.e. first-order differential operators acting on smooth functions at a point.)
Alternatively, we might define $\nabla$ as a smooth $\mathbb{R}$-linear map $\Gamma(TM) \to \Gamma(T^*M \otimes TM)$ satisfying certain properties.
It's not hard to show that connections exist: one can be constructed by patching together coordinate differentials using a partition of unity. But since you tagged the question riemannian-geometry, I'll give a specific example of a non-trivial connection, for concreteness. A Riemannian manifold is equipped with a metric $g_{ij}$, and if we impose the additional conditions that $\nabla_k g_{ij} = 0$ (metric compatibility) and that $\nabla$ is torsion-free, i.e. $\nabla(X, Y) - \nabla(Y, X) = [X, Y]$, we obtain a unique connection $\nabla$, called the Levi–Civita connection. It is given in coordinates by the formula $$(\nabla(X, Y))^i = X^j \nabla_j Y^i = X^k \partial_k Y^i + \Gamma^i_{\phantom{i}jk} X^j Y^k$$ where $\Gamma^i_{\phantom{i}jk}$ is the Christoffel symbol, which is defined in coordinates by $$\Gamma^i_{\phantom{i}jk} =\frac{1}{2} g^{il} \left( \partial_k g_{jl} + \partial_j g_{lk} - \partial_l g_{jk} \right)$$ It is a straightforward exercise in symbol-pushing to verify that this does indeed define a connection with the desired properties.
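For a feel of the Christoffel formula, here is a rough numerical sketch. The example metric (the round unit sphere, $g = \mathrm{diag}(1, \sin^2\theta)$ in coordinates $(\theta,\phi)$), the finite-difference derivatives and all function names are my own assumptions, chosen only because the answer is known in closed form: $\Gamma^\theta_{\phantom{\theta}\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\phantom{\phi}\theta\phi} = \cot\theta$.

```python
import math

def sphere_metric(x):
    """Round metric on the unit sphere in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivatives(metric, x, h=1e-5):
    """dg[k][i][j] approximates partial_k g_ij by central differences."""
    n = len(x)
    dg = []
    for k in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += h
        x_minus[k] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[i][j] - g_minus[i][j]) / (2.0 * h) for j in range(n)]
                   for i in range(n)])
    return dg

def christoffel(metric, x):
    """gamma[i][j][k] = 1/2 g^{il} (d_k g_jl + d_j g_lk - d_l g_jk), 2D only."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = metric_derivatives(metric, x)
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                gamma[i][j][k] = 0.5 * sum(
                    g_inv[i][l] * (dg[k][j][l] + dg[j][l][k] - dg[l][j][k])
                    for l in range(n))
    return gamma

theta = 1.0
gamma = christoffel(sphere_metric, (theta, 0.3))
print("Gamma^theta_{phi phi}:", gamma[0][1][1], "vs", -math.sin(theta) * math.cos(theta))
print("Gamma^phi_{theta phi}:", gamma[1][0][1], "vs", math.cos(theta) / math.sin(theta))
```

The computed array is symmetric in its two lower indices, as it must be for a torsion-free connection.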