"Rolling without slipping" is a powerful idea, but the phrase doesn't necessarily lead one to the intended mental model. In particular,
torsion is something that is at issue only for manifolds of dimension 3 or higher. Perhaps you can imagine taking a 3-manifold, and rolling it along a hyperplane in 4-space --- but
the metaphor becomes strained, partly because most Riemannian 3-manifolds cannot be
smoothly isometrically embedded in $\mathbb E^4$.
Another way to think of it is this: suppose you have a smooth parametrized
curve in a Riemannian
3-manifold, say $\alpha: [0,T] \rightarrow M^3$. Then the claim is that there
exists a matching curve $\beta: [0,T] \rightarrow \mathbb E^3$ together with a map
$\phi$ from a neighborhood of the image of $\beta$ to a neighborhood of the image of $\alpha$
that takes the Euclidean metric to the metric of $M^3$ up to first order along the curve.
Furthermore, $\beta$ is uniquely determined up to an isometry of $\mathbb E^3$.
Basically, $\beta$ is what you get if you "roll" $M^3$ along $\mathbb E^3$ along $\alpha$
in a way
that best maintains contact between the two spaces: that is, "without slipping".
To see that the curve $\beta$ (assuming it exists) is uniquely determined by $\alpha$,
we can imagine trying to send a neighborhood of $\beta$ to a neighborhood of another
curve $\gamma: [0,T] \rightarrow \mathbb E^3$ in a way that maintains first order
contact of the metric. If you look at curves parallel to $\beta$ in $\mathbb E^3$, the first
derivative of their arc length
is negative in the direction that $\beta$ is curving. The logarithmic derivative of
arc length, for curves displaced along a normal vector field that remains as parallel to
itself as possible, is the magnitude of the curvature.
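To make the last two sentences concrete, here is a short worked computation. The symbols $T$, $N$, $U$, $\kappa$, $s$ and $\beta_s$ are notation assumed for this sketch. Parametrize $\beta$ by arc length, with unit tangent $T$, principal normal $N$ and curvature $\kappa$, so $T' = \kappa N$. Let $U$ be a unit normal field that is as parallel as possible, meaning $U'$ has no normal component; then $U' = -\kappa\,(U\cdot N)\,T$. The displaced curves $\beta_s = \beta + sU$ satisfy
$$ \beta_s' = \bigl(1 - s\,\kappa\,(U\cdot N)\bigr)\,T, \qquad \frac{\partial}{\partial s}\Big|_{s=0} \log|\beta_s'| = -\kappa\,(U\cdot N). $$
This is most negative, with magnitude exactly $\kappa$, when $U = N$, that is, when the displacement is toward the side on which $\beta$ is curving.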
If you try to twist the normal coordinate system, this corresponds to the concept
of torsion. It's easiest to visualize along a straight line: if you twist a neighborhood
of a line in space,
you distort the metric on each concentric cylinder, by changing the angles between
cross-section circles and generating lines. I.e., threads that wind around a hose
at angles $\pm \pi/4$ are effective at preventing twisting (= torsion). The same
principle holds for any curve in space: the first order behavior of the metric in a
neighborhood of the curve locks in the Frenet characterization of the curve (that is,
its curvature and torsion as functions of arc length; these
are different from, but related to, the curvature and torsion of a connection).
Why does the matching curve exist? You can check derivatives etc., but
it's better to just imagine it. Basically, you could reparametrize $\alpha$ by arc length,
then project a neighborhood of $\alpha$ back to $\alpha$ by sending each point to the closest
point of $\alpha$, and parametrize the lines of projection by their arc length.
On each concentric tube, there's a unique unit vector field orthogonal to the preimages
of projection to $\alpha$. Scale this vector field so that it commutes with
projection, to get a full set of
cylindrical coordinates for a neighborhood of $\alpha$. The only first-order
invariant for the metric that is free is the first derivative of the scaling function.
Using that, you can match the first derivative
by using the curvature and torsion of a curve in space.
This process defines the affine connection on the tangent bundle. The Levi-Civita
connection is the linear part of the affine connection, which is torsion free by
definition. The non-torsion-free connections are ones that
impart twists on little neighborhoods of curves. This is usually expressed
by translating it into a formula about covariant derivatives of two vector fields
not being as commutative as it should be.
This really calls for pictures. Any volunteers?
Interesting question. The answer is no: surfaces with bounded geometry can have normal
bundles with unbounded curvature.
To set the stage, it's worth first noting that you can have a surface with extreme geometry isometrically embedded in $\mathbb E^3$, where the normal bundle, being one-dimensional, has a trivial connection; or include this into $\mathbb E^4$ (Euclidean 4-space) where the
normal bundle is 2-dimensional, but the curvature is still 0. This at least illustrates that bounded geometry of the normal bundle and tangent bundle are decoupled.
I'll now describe an isometric embedding of $\mathbb E^2$
into $\mathbb E^6$ where the connection
on the normal bundle has unbounded curvature. The embedding has local 1-parameter groups of symmetry, which makes it easier to keep track of curvature without needing to write down equations.
Start by visualizing
a helical curve in $\mathbb E^3$. The tangent vector to a helix goes repeatedly around a circle in its spherical image. The connection on the normal bundle of the helix is induced by this
Gauss map from the connection on $S^2$, so the parallel translation of the normal bundle rotates the plane by an angle equal to the area enclosed inside this circle once every coil of the helix.
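The claim about the helix can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names, the helix parametrization $(a\cos t, a\sin t, bt)$ and the Runge-Kutta integrator are all assumptions made for illustration. It transports a normal vector around one coil using the normal connection, $V' = -(V\cdot T')\,T$, and compares the resulting rotation with the area of the spherical cap enclosed by the tangent circle (the two agree modulo $2\pi$, so cosines are compared).

```python
import math

def tangent_and_derivative(t, a, b):
    """Unit tangent T(t) and dT/dt for the helix (a cos t, a sin t, b t)."""
    c = math.hypot(a, b)
    T = (-a * math.sin(t) / c, a * math.cos(t) / c, b / c)
    dT = (-a * math.cos(t) / c, -a * math.sin(t) / c, 0.0)
    return T, dT

def transport_rhs(t, V, a, b):
    """Normal parallel transport: dV/dt = -(V . dT/dt) T."""
    T, dT = tangent_and_derivative(t, a, b)
    k = -sum(v * d for v, d in zip(V, dT))
    return tuple(k * x for x in T)

def holonomy_cosine(a, b, steps=2000):
    """Cosine of the angle by which a normal vector rotates over one coil."""
    h = 2.0 * math.pi / steps
    V0 = (1.0, 0.0, 0.0)  # orthogonal to T(0) = (0, a/c, b/c)
    V, t = V0, 0.0
    for _ in range(steps):
        # classical fourth-order Runge-Kutta step
        k1 = transport_rhs(t, V, a, b)
        k2 = transport_rhs(t + h / 2, tuple(v + h / 2 * k for v, k in zip(V, k1)), a, b)
        k3 = transport_rhs(t + h / 2, tuple(v + h / 2 * k for v, k in zip(V, k2)), a, b)
        k4 = transport_rhs(t + h, tuple(v + h * k for v, k in zip(V, k3)), a, b)
        V = tuple(v + h / 6 * (p + 2 * q + 2 * r + s)
                  for v, p, q, r, s in zip(V, k1, k2, k3, k4))
        t += h
    # T returns to its initial value after one coil, so V and V0 lie in the same plane
    return sum(v * w for v, w in zip(V, V0))

def enclosed_cap_area(a, b):
    """Area of the cap on the unit sphere bounded by the circle traced by T."""
    return 2.0 * math.pi * (1.0 - b / math.hypot(a, b))

if __name__ == "__main__":
    a, b = 3.0, 4.0
    print(holonomy_cosine(a, b), math.cos(enclosed_cap_area(a, b)))
```

The two printed numbers should agree to within the integration error. Tightening the coils relative to the pitch (larger $a/b$) enlarges the cap and hence the rotation per coil.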
Now consider a similar curve in $\mathbb E^5$, thought of as $\mathbb E \times \mathbb E^2 \times \mathbb E^2$. In the $\mathbb E$-direction, the curve makes uniform progress, while going around circles of possibly different radii at possibly different rates in the $\mathbb E^2$ directions. If the term weren't otherwise engaged, one could call this a double helix.
It is invariant by a 1-parameter group of isometries of $\mathbb E^5$ that translates in
the $\mathbb E$ direction while spinning the two perpendicular planes at their own rates.
The normal bundle splits into two $\mathbb E^2$ subbundles, its intersections with
the two 3-dimensional $\mathbb E \times \mathbb E^2$'s. The connection preserves this splitting, rotating the two $\mathbb E^2$'s independently.
Now add an extra "parameter" dimension, making the ambient space $\mathbb E^6$.
Modify the curve in $\mathbb E^5$ by increasing the radius of one helix while decreasing the
radius of the other, balancing the changes so the curve remains invariant by the same
1-parameter group, and its arc length remains constant (as measured by the time parameter of the 1-parameter group). It's easy to see, since it's
locally isometric to a surface of revolution because of the symmetry, that the
resulting surface is isometric to $\mathbb E^2$. We can make the radius of the circle in one
$\mathbb E^2$ factor go all the way to 0. After making sure it has $C^\infty$ contact to
a straight line in this projection, we can then start making this projection helical
again, but with a
steeper and tighter helix, adjusting by letting the other helical projection shrink to a line. The local symmetry group in $\mathbb E^6$ has changed, but the induced symmetry on
the surface remains the same.
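One way to write this construction down explicitly, as a sketch: the functions $p$, $r_1$, $r_2$ and the constants $v$, $\omega_1$, $\omega_2$, $L$ are notation assumed here, not part of the description above. Identify each $\mathbb E^2$ factor with $\mathbb C$ and set
$$ F(u,t) = \bigl(p(u),\; vt,\; r_1(u)\,e^{i\omega_1 t},\; r_2(u)\,e^{i\omega_2 t}\bigr) \in \mathbb E \times \mathbb E \times \mathbb E^2 \times \mathbb E^2 = \mathbb E^6. $$
Then
$$ |F_t|^2 = v^2 + \omega_1^2 r_1^2 + \omega_2^2 r_2^2, \qquad F_u\cdot F_t = 0, \qquad |F_u|^2 = p'^2 + r_1'^2 + r_2'^2. $$
If $r_1$ and $r_2$ vary so that $|F_t|^2 = L^2$ stays constant, and slowly enough that $r_1'^2 + r_2'^2 \le 1$, then $p$ can be chosen with $|F_u|^2 = 1$, and the induced metric is $du^2 + L^2\,dt^2$, which is flat. On an interval of $u$ where one radius vanishes identically, the corresponding frequency can be changed without affecting $F$, which is where the helix gets retightened.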
We can go back and forth, alternating between helical effects in the two factors,
inexorably tightening the screws without distorting the surface.
The curvature of the induced connection on the normal bundle becomes arbitrarily high,
as you can see by following the connection around a small rectangle, with two edges
in the "parameter" direction, one edge where say the first helix has become straight
and the fourth edge where the first helix is wound in very small tight coils.
Here is another way of obtaining the Christoffel symbols with the symmetry imposed by the torsion-free condition
$$ \Gamma^i_{k\ell}=\Gamma^i_{\ell k}. $$
This goes back to Riemann's Habilitation.
Suppose that $(M,g)$ is a Riemannian manifold of dimension $N$ and $p\in M$. By fixing an orthonormal frame of $T_pM$ we can find local coordinates $(x^1,\dotsc, x^N)$ near $p$ such that $\newcommand{\pa}{\partial} $
$$ x^i(p)=0, \;\; g=\sum_{i,j} g_{ij}(x) dx^i dx^j, $$
$$g_{ij}(x)= \delta_{ij} +\sum_k\pa_{x^k}g_{ij}(0)\, x^k + O(|x|^2). $$
In other words, in these coordinates,
$$ g_{ij}(x)=\delta_{ij} +O(|x|). $$
Riemann was asking whether one can find new coordinates near $p$ such that in these coordinates the metric $g$ satisfies $g_{ij}=\delta_{ij}$.
As a first step, we can ask whether we can find a new system of coordinates such that, in these coordinates the metric $g$ is described by
$$ g=\sum_{ij}\hat{g}_{ij} dy^idy^j, $$
where
$$\hat{g}_{ij}(y)=\delta_{ij}+ O(|y|^2). \tag{1} $$
The new coordinates $(y^j)$ are described in terms of the old coordinates $(x^i)$ by a family of Taylor approximations
$$y^j= x^j + \frac{1}{2}\sum_{\ell,k}\gamma^j_{\ell k} x^\ell x^k + O(|x|^3),\;\; \gamma^j_{\ell k}=\gamma^j_{k\ell}. $$
The constraint (1) implies
$$ \gamma^j_{\ell k}=\frac{1}{2}\left(\pa_{x^\ell}g_{jk}+\pa_{x^k}g_{j\ell}-\pa_{x^j}g_{\ell k}\right)_{x=0}. $$
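The step from constraint (1) to the formula for $\gamma^j_{\ell k}$ can be sketched as follows; this is the standard computation, spelled out here as a reading aid, with all derivatives of $g$ evaluated at $x=0$. Differentiating the Taylor expansion of $y$ gives $\partial y^a/\partial x^i = \delta^a_i + \sum_\ell \gamma^a_{\ell i}\,x^\ell + O(|x|^2)$. Substituting into $g_{ik}(x) = \sum_{a,b}\hat g_{ab}\,\frac{\partial y^a}{\partial x^i}\frac{\partial y^b}{\partial x^k}$ and using (1) yields
$$ g_{ik}(x) = \delta_{ik} + \sum_\ell\bigl(\gamma^i_{\ell k} + \gamma^k_{\ell i}\bigr)x^\ell + O(|x|^2), \quad\text{so}\quad \partial_{x^\ell}g_{ik} = \gamma^i_{\ell k} + \gamma^k_{\ell i}. $$
Combining three such identities and using the symmetry of $\gamma$ in its lower indices,
$$ \partial_{x^\ell}g_{jk}+\partial_{x^k}g_{j\ell}-\partial_{x^j}g_{\ell k} = \bigl(\gamma^j_{\ell k}+\gamma^k_{\ell j}\bigr) + \bigl(\gamma^j_{k\ell}+\gamma^\ell_{kj}\bigr) - \bigl(\gamma^\ell_{jk}+\gamma^k_{j\ell}\bigr) = 2\gamma^j_{\ell k}. $$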
We see that, in the $x$ coordinates
$$ \Gamma^i_{k\ell}(p)=\gamma^i_{k\ell}, $$
because $g^{ij}(p)=\delta^{ij}$.
It took people several decades after Riemann's work to realize that the coefficients $\Gamma^i_{k\ell}$ are related to parallel transport, and ultimately, to a concept of connection.
Ultimately, to my mind, the best explanation for the torsion-free requirement comes from Cartan's moving frame technique. The clincher is the following technical fact: given a connection $\nabla$ on $TM$ and a $1$-form $\alpha\in \Omega^1(M)$ then for any vector fields $X,Y$ on $M$ we have
$$d\alpha(X,Y)= X\alpha(Y)-Y\alpha(X)-\alpha([X,Y]) $$
$$= (\nabla_X\alpha)(Y)-(\nabla_Y\alpha)(X)+\alpha(\nabla_XY-\nabla_YX)-\alpha([X,Y]) $$
$$= (\nabla_X\alpha)(Y)-(\nabla_Y\alpha)(X)+\alpha\bigl(\;T_\nabla(X,Y)\;\bigr). $$
If the torsion is zero, the above equality loses a term, and one obtains rather easily Cartan's structural equations of a Riemannian manifold.