Intuition behind connection 1-forms and Ehresmann connections

connections, differential-geometry, fiber-bundles, gauge-theory, principal-bundles

I am learning about mathematical gauge theory and so far I have been able to develop an intuition behind all the objects I've read about such as principal bundles, associated bundles, and vertical/horizontal tangent spaces. However I have recently come across Ehresmann connections and connection 1-forms and I have been having trouble seeing the motivation behind them. I understand the technical definition for both but I cannot picture what idea they're trying to capture. Given that there is a one-to-one correspondence between connection 1-forms and Ehresmann connections I assume they both represent the same idea, but I am missing what that idea is. This lack of intuition has made it hard for me to understand further concepts such as curvature 2-forms and the motivation behind those since they're defined in terms of connection 1-forms.

The only motivation I have for connection 1-forms is that they somehow "pick out" the vertical component of a tangent vector in the tangent bundle of the principal bundle.

In a lecture given by Frederic Schuller (https://www.youtube.com/watch?v=jFvyeufXyW0), he says around 16:30 that

the idea of a connection is to make a choice of how to "connect" the individual points of "neighboring" fibers in a principal bundle.

and around 18:10 gives an analogy that if we were to "walk" from one point $x$ to another point $y$ in the base manifold the connection tells us how to associate to a point in the fiber at $x$ a point in the fiber at $y$. I don't see how the definition of a connection 1-form corresponds to this picture.

Can anyone give an intuitive explanation of what motivates connection 1-forms and/or Ehresmann connections? As it stands, I am not sure why picking out the vertical component of a tangent vector is important or of any geometric interest.

Best Answer

Let us take your description as our primitive idea, namely that a connection should be a method by which “a motion at the bottom induces a corresponding motion on top”. This is roughly what you’ve described as connecting the different fibers of a fiber bundle. Ok, this is already at the level of curves, but that is usually difficult to ‘prescribe’ in practice, so let us look at the ‘infinitesimal level’ first, i.e. at the level of tangent spaces. So, our new mantra is ‘a tangent vector at the bottom must induce a corresponding tangent vector on top’. See the picture I drew here (sure I talk about vector bundles there, but that’s not essential).

Slightly more precisely, given a fiber bundle $(X,\pi, M)$, what we want is a map $L:TM\times_MX\to TX$ (the $\times_M$ means consider the fiber bundle over $M$ whose fiber over a point $x\in M$ is $T_xM\times X_x$) which assigns to each tangent vector $k_x\in T_xM$ and each fiber element $\xi_x\in X_x$, a tangent vector $L(k_x,\xi_x)\in T_{\xi_x}X$ such that $T\pi$ projects $L(k_x,\xi_x)$ back down to $k_x$. It’s kind of like if you imagine a tall building, and the person $x$ on the ground floor moves a little bit in the direction $k_x$. He then tells each of his upstairs neighbours $\xi_x$ to ‘follow his lead’ and move accordingly by an amount $L(k_x,\xi_x)$. The reason I used the notation $L$ is because this is a ‘lifting map’. It lifts the tangent vector $k_x$ on the base to a tangent vector $L(k_x,\xi_x)$ at height $\xi_x$. The fact that the projection of $L(k_x,\xi_x)$ under $T\pi$ is $k_x$ says that (if $k_x$ is non-zero) the vector $L(k_x,\xi_x)$ is not in the kernel of $T\pi$, i.e. it is not tangent to the fiber $X_{x}$, or said in another way, it does not belong to the vertical space $V_{\xi_x}X$.

Now, let us also suppose that $L$ is linear in the $k_x$ slot. Then, for each $\xi_x\in X_x$, $L(\cdot,\xi_x)$ maps $T_xM$ linearly and injectively into $T_{\xi_x}X$ (injectively because $T\pi\circ L(\cdot,\xi_x)$ is the identity on $T_xM$), hence bijectively onto a subspace of $T_{\xi_x}X$, which I shall call $H_{\xi_x}X$. We then have a direct sum decomposition $T_{\xi_x}X=V_{\xi_x}X\oplus H_{\xi_x}X$ (an exercise in linear algebra; what’s the ‘correct’ generalization?). So you can think of $L(\cdot,\xi_x)$ as ‘lifting’ the tangent space $T_xM$ to the height $\xi_x$ (I visualize this process as carrying a plate up a hill).
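
To see the decomposition in the simplest case, here is a sketch for the trivial bundle $X=M\times F$ with $\pi=\text{pr}_1$ (the symbol $\Gamma$ is notation introduced only for this illustration). There $T_{(x,f)}X=T_xM\oplus T_fF$, the map $T\pi$ is the projection onto the first summand, and $V_{(x,f)}X=\{0\}\oplus T_fF$. The condition that $T\pi$ projects $L(k_x,\xi_x)$ back to $k_x$ forces a lifting map that is linear in $k_x$ to have the form

$$L\big(k_x,(x,f)\big)=\big(k_x,\ \Gamma_{(x,f)}(k_x)\big),\qquad \Gamma_{(x,f)}:T_xM\to T_fF\ \text{ linear},$$

so $H_{(x,f)}X$ is the graph of $\Gamma_{(x,f)}$, and every tangent vector $(k,w)\in T_xM\oplus T_fF$ splits uniquely as

$$(k,w)=\underbrace{\big(k,\ \Gamma_{(x,f)}(k)\big)}_{\in H_{(x,f)}X}+\underbrace{\big(0,\ w-\Gamma_{(x,f)}(k)\big)}_{\in V_{(x,f)}X}.$$

The choice $\Gamma=0$ is the ‘obvious’ connection on a product; any other choice tilts the horizontal spaces.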

So you see, by starting with the naive idea of ‘moving in the bottom must induce a corresponding movement on top’, we obtain a direct sum decomposition $TX=VX\oplus HX$ of the tangent bundle $TX$ of the fiber bundle.


Alternatively, you can forget about a choice of lifting map $L$, and directly start with a choice of a decomposition $TX=VX\oplus HX$. Now, why should a choice of complementary subbundle $HX$ intuitively convey information about ‘connections’? Well, it’s pretty obvious. Let $\zeta\in HX$ be non-zero, and for notational concreteness say $\zeta\in H_{\xi_x}X$ for some $x\in M$ and $\xi_x\in X_x$. Then by definition of being a tangent vector to $X$, it means I can find a smooth curve $\gamma(t)$ which has $\gamma(0)=\xi_x$ and $\dot{\gamma}(0)=\zeta$. The curve $\gamma(t)$ cannot entirely lie in the fiber $X_x$ because then its tangent vector would lie in the vertical space $V_{\xi_x}X$. In fact, since $\zeta$ is not vertical, $T\pi(\zeta)\neq 0$, so the projected curve $\pi\circ\gamma$ leaves $x$ with non-zero velocity. Ok, so this means for small $t\neq 0$, $\gamma(t)$ will end up in a fiber different from $X_x$ where we started initially. So you see a choice of complementary subbundle (called horizontal subbundle) gives us a way to go from a given fiber to a nearby fiber. This is all the intuition behind Ehresmann connections. But really, the tell-tale sign that this is a good idea is to look at the picture I linked: ‘clearly’ the green arrow is pointing towards a different fiber :)

To make some of the things I said above precise (namely, going from a complementary subbundle $HX$ to an actual isomorphism between different fibers), you’d write down an ODE and invoke existence and uniqueness.
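
As a rough sketch of what that ODE looks like (the names $c$, $\tilde c$, $\xi_0$ and $\Gamma$ are mine, introduced only for this illustration): given a smooth curve $c:[0,1]\to M$ and a starting point $\xi_0\in X_{c(0)}$, one looks for a curve $\tilde c$ in $X$ lying over $c$ whose velocity is always horizontal, i.e.

$$\pi\big(\tilde c(t)\big)=c(t),\qquad \dot{\tilde c}(t)=L\big(\dot c(t),\ \tilde c(t)\big),\qquad \tilde c(0)=\xi_0 .$$

In a local trivialization $X|_U\cong U\times F$, writing $\tilde c(t)=(c(t),f(t))$ and $L\big(k,(x,f)\big)=\big(k,\Gamma_{(x,f)}(k)\big)$, this becomes an honest first-order ODE for $f$:

$$\dot f(t)=\Gamma_{(c(t),\,f(t))}\big(\dot c(t)\big),\qquad f(0)=f_0 .$$

Existence and uniqueness give a solution for short time. That it exists for all $t\in[0,1]$ is an extra condition in general, but it holds automatically for principal connections. When it does, the map $X_{c(0)}\to X_{c(1)}$, $\xi_0\mapsto\tilde c(1)$, is the ‘connection between fibers’ along $c$.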

Previously, we’ve seen how specifying the ‘lifting map $L$’ gives us a horizontal subbundle. Let us now see the converse. Given a decomposition $TX=VX\oplus HX$, the rank-nullity theorem implies that for each $x\in M$ and $\xi_x\in X_x$, the tangent map $T\pi_{\xi_x}:T_{\xi_x}X\to T_xM$ restricts to a linear isomorphism $H_{\xi_x}X\to T_xM$ (the map is surjective with kernel exactly $V_{\xi_x}X$, and $H_{\xi_x}X$ is a complement to that kernel). The inverse of this isomorphism corresponds exactly to the lifting map $L(\cdot, \xi_x)$ above.

Ok, so far I talked about general fiber bundles, but for principal bundles, we would like our horizontal subspaces to ‘vary consistently’ as our group acts. Recall that the group orbits are the fibers, so if a group element $g$ moves a point $\xi_x$ in the fiber to the point $\xi_xg$, then we’d like the induced map on tangent spaces to map the horizontal space $H_{\xi_x}X$ onto $H_{\xi_xg}X$. This is just the obvious thing to require.


Let $(X,\pi,M, G)$ be a principal bundle. Hopefully you’re now happy with the idea that specifying a horizontal subbundle (i.e. a complement to $VX$) does indeed correspond to ‘infinitesimally connecting fibers’ (once you draw a picture this becomes almost tautological in hindsight). Of course for a principal connection the horizontal subbundle must in addition be invariant under the action of $G$.

First of all I said that directly talking about curves and their lifts is difficult, so we instead formulated things at the tangent space level. But now, specifying a collection of subspaces, and working with them is a little unwieldy, so we look for an alternative way to describe things. This will lead us to the connection 1-form, but first a general observation.

Fact.

Let $(X,\pi,M,G)$ be a principal bundle, $m:X\times G\to X$ the group action map, and let $VX$ be the vertical subbundle. Then, the mapping $\Phi:X\times\mathfrak{g}\to VX$ given by $\Phi(\xi,\gamma):=T\left(m(\xi,\cdot)\right)_e[\gamma]$ is smooth, fiberwise linear and makes the following diagram commute:

$\require{AMScd}$ \begin{CD} X\times\mathfrak{g} @>{\Phi}>> VX \\ @V{\text{pr}_1}VV @VV{\pi|_{VX\to X}}V \\ X @>>{\text{id}_X}> X \end{CD}

In other words, $\Phi$ provides a vector bundle isomorphism from the trivial vector bundle $X\times\mathfrak{g}$ over $X$, onto the vector bundle $VX$ over $X$.

In essence, this is saying that if a group element can act on $X$, then by taking derivatives, the Lie algebra elements can also act on $X$.

From here, it’s basically following your nose. Suppose we have a principal connection in the form of a direct sum decomposition $TX=VX\oplus HX$. Let $P_V:TX\to VX$ be the induced projection (warning: even though the notation may indicate otherwise, the definition of $P_V$ depends on $HX$!). Now, we can consider the map $\omega$ defined as the following triple composition:

$\require{AMScd}$ \begin{CD} TX @>{P_V} >> VX @>{\Phi^{-1}} >> X\times\mathfrak{g} @>{\text{pr}_2}>> \mathfrak{g}. \end{CD}

This is nothing but a Lie-algebra-valued $1$-form on $X$ (why? because for each $\xi\in X$, $\omega$ restricts to a linear map $T_{\xi}X\to \mathfrak{g}$). You can check this has all the properties of the connection 1-form. To recover the subspaces, you simply take the kernel of $\omega$. This completes the link between the two ideas.
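
For what it’s worth, here is a sketch of that check, using only the definitions above. I write $R_g:X\to X$, $\xi\mapsto\xi g$ for the right action and $\text{Ad}$ for the adjoint representation of $G$ on $\mathfrak{g}$ (standard notation, but not introduced earlier in this answer). Since $\Phi(\xi,\gamma)$ is already vertical, $P_V$ fixes it, so

$$\omega\big(\Phi(\xi,\gamma)\big)=\text{pr}_2\,\Phi^{-1}\big(\Phi(\xi,\gamma)\big)=\gamma .$$

For the behaviour under the group action, differentiate $\xi\exp(t\gamma)\,g=(\xi g)\exp\big(t\,\text{Ad}_{g^{-1}}\gamma\big)$ at $t=0$ to get $TR_g\,\Phi(\xi,\gamma)=\Phi\big(\xi g,\ \text{Ad}_{g^{-1}}\gamma\big)$. Because $TR_g$ preserves $VX$ (it maps fibers to fibers) and, by the requirement on a principal connection, also $HX$, it commutes with $P_V$, and hence

$$R_g^*\,\omega=\text{Ad}_{g^{-1}}\circ\,\omega,\qquad \ker\omega_{\xi}=\ker (P_V)_{\xi}=H_{\xi}X .$$

So $\omega$ reproduces the Lie algebra element that generates a vertical vector, transforms by the adjoint action, and vanishes exactly on the horizontal vectors.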


Of course I’ve glossed over several details, but that’s what textbooks are for. The only way to appreciate these concepts is to draw your own pictures and make sense of what it is you’re drawing (I personally don’t understand something unless I’ve drawn it myself… even if in hindsight I end up drawing what someone else has already drawn).


Edit: A Summary.

So far we have seen three different ways of describing the same thing. This comes down to the ‘trichotomy’ of descriptions (I’m omitting some details regarding how the group compatibility comes into play):

  • directly: This is the Ehresmann definition of just telling you outright which complementary subbundle $HX$ to choose.
  • ‘parametrically’: this means you look at the image of some other map. In my answer, I described this using the lifting map $L$; its image is what we define $HX$ to be.
  • ‘implicitly’: you take a level set of some map. The connection 1-form approach falls under this category because we define $HX$ to be the kernel of $\omega$. Note that another way of doing things is that we could specify a map $P:TX\to TX$ such that $P\circ P=P$ and its image equals $VX$. Then we can define $HX$ to be the kernel of $P$. (With this approach, $P$ equals what I called $P_V$ above, and $I-P$ is the projection onto $HX$; review the linear algebraic case if necessary.)

So, there are many different ways one can formally describe something, but if you think for a moment, they’re all saying the same thing, just in a different guise. The motivation for all this comes, as always, from the really basic linear algebra case, so if in doubt, one should review the various descriptions there.
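
For instance, here is the toy version in $\mathbb{R}^2$ (a sketch with a hypothetical slope $a\in\mathbb{R}$ chosen only for illustration), where $V=\text{span}\{(0,1)\}$ plays the role of the vertical space and the ‘bundle projection’ is $(x,y)\mapsto x$. The same line $H$ complementary to $V$ can be given in each of the three ways:

$$\begin{aligned} \text{directly:}&\quad H=\text{span}\{(1,a)\},\\ \text{parametrically:}&\quad H=\text{image}(L),\quad L(k)=(k,\ ak),\\ \text{implicitly:}&\quad H=\ker\omega,\quad \omega(x,y)=y-ax. \end{aligned}$$

The corresponding projection is $P(x,y)=(0,\ y-ax)$: one checks that $P\circ P=P$, $\text{image}(P)=V$, $\ker P=H$ and $(I-P)(x,y)=(x,\ ax)$. Different values of $a$ give different complements to the same $V$, which is exactly why the projection onto $V$ depends on the choice of $H$.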