Conjugate Momenta – Geometric View and Legendre Transformation

classical-mechanics, differential-geometry, hamiltonian-formalism, lagrangian-formalism, momentum

I'm familiar with the “coordinate view” of Lagrangian and Hamiltonian mechanics where if $\pmb{q}=(q^1,\dots, q^n)\in\mathbb{R}^n$ are any $n$ generalized coordinates and $L(\pmb{q},\dot{\pmb{q}})$ is the Lagrangian for whatever system we care about, then we define the conjugate momenta (coordinates) as $\pmb{p} = (p_1,\dots, p_n)= \frac{\partial L}{\partial \dot{\pmb{q}}}$ and the Hamiltonian as $H(\pmb{q},\pmb{p}) = p_i \dot{q}^i - L$ (with the RHS expressed in terms of $(\pmb{q},\pmb{p})$).

But I have recently been learning the geometric view of things, where $(\pmb{q},\dot{\pmb{q}})$ is just one of infinitely many possible coordinate representations of $(\text{x},\mathbf{v})\in TQ$, where $\text{x}\in Q$ and $\mathbf{v}_{\text{x}}= \frac{\text{d}}{\text{d}t}\text{x}\in T_{\text{x}}Q$, where $Q$ is some $n$-dimensional configuration manifold, and $L\in \mathcal{F}(TQ)$ is a smooth function on the tangent bundle. The momentum is then a 1-form defined via some fiber derivative or Frechet derivative (I don't know what those words mean) as:

$$
\mathbf{p} := \mathbf{d}_2 L(\text{x},\mathbf{v}) \; \in \Omega^1(Q)
\tag{1}
$$

where $\mathbf{d}_2$ is differential with respect to the second argument. The coordinate representation of the above is simply the familiar
$$
p_i = \frac{\partial L}{\partial \dot{q}^i}.
\tag{2}
$$

Question:
So equation (1) defines the actual momenta, a 1-form $\mathbf{p}\in\Omega^1(Q)$, and equation (2) just gives the components/coordinates, $\pmb{p}\in\mathbb{R}^n$, of this 1-form in some basis (right?). But what is this basis, exactly? If we write $\mathbf{p} = \frac{\partial L}{\partial \dot{q}^i} \pmb{\epsilon}^i$ what are the basis 1-forms, $\pmb{\epsilon}^i$? I know the coordinates $q^i$, regarded as functions, give coordinate basis vectors $\pmb{\partial}_i\in\mathfrak{X}(Q)$ and basis 1-forms $\mathbf{d}q^i \in \Omega^1(Q)$. The velocity is then $\mathbf{v}=\dot{q}^i\pmb{\partial}_i$ and I feel like the momenta should be $\mathbf{p}=p_i\mathbf{d}q^i = \frac{\partial L}{\partial \dot{q}^i}\mathbf{d}q^i $. But from equations (1) and (2) it looks like:

$$
\mathbf{p} := \mathbf{d}_2 L(\text{x},\mathbf{v}) \; \overset{?}{=} \; \frac{\partial L}{\partial \dot{q}^i} \mathbf{d} \dot{q}^i.\tag{3}
$$

Is the second equality above correct? If not, why? Perhaps my interpretation of $\mathbf{d}_2 L$ is wrong? If the above is correct, what is the relation between $\mathbf{d}\dot{q}^i$ and $\mathbf{d}q^i$? It would seem $\mathbf{d}\dot{q}^i = \frac{\text{d}}{\text{d}t} \mathbf{d}q^i$ but it doesn't seem that this would form a basis for each $T_{\text{x}}^*Q$, in general.

Best Answer

The answer to your question of whether momentum should be $\frac{\partial L}{\partial \dot{q}^i}\,dq^i$ or $\frac{\partial L}{\partial \dot{q}^i}\,d\dot{q}^i$ is that it is the first option. But you need to know exactly what you’re talking about, where each object lives, and what you mean by momentum, because “momentum” has got to be one of the most overused words. See Confusion about role of covectors in behavior of momentum for some remarks here, where I also introduce the notion of “momentum 1-form $\mu_L$” (which is intimately related to the fiber-derivative $\mathbf{F}L$, but is different), and show it is equal to $\frac{\partial L}{\partial \dot{q}^i}\,dq^i$.

There are several things you need to clarify, and since some of them are notational, I’ll try to avoid the common abuses of notation. Also, I’ve written various answers which address each of these issues separately, so you’re going to have to go down a medium-sized rabbit hole and piece everything together. For now, I’ll just point out some errors and provide the necessary links, and as far as possible I’ve tried to organize this answer so you can read the links in order (but see also the links within the links).

1. Prerequisites.

You MUST first learn about Frechet derivatives; less intimidating names include the total derivative/total differential. This is literally a formal way of stating the foundational idea of differential calculus: a function between vector spaces is said to be differentiable if changes in the function are locally approximable by a linear function. This is so foundational that in my opinion you must grok this before moving on. If you don’t properly understand this, then studying manifolds is like studying algebra before learning about integers in elementary school. Now some links (as we go down the list, they become more “intuitive”, so you can skip at your own discretion, but the first is important).

See Differentiation definition for spaces other than $\Bbb{R}^n$ for the mere definition of Frechet differentiability in full generality of Banach spaces, and a non-trivial example calculation.
See Defining differentiablity of a function of two variables for the definition in the special case of functions $\Bbb{R}^2\to\Bbb{R}$ (it literally makes no simplification compared to the Banach space case so you may as well read the link above first). In this answer, I provide the motivation for the definition.
See also Motivation for the differentiability of a function $f: D \subseteq \Bbb R^m \rightarrow \mathbb R$ for some remarks about how this relates to the 1-dimensional notion of differentiation.
Also, see How does the idea of a differential $dx$ work if derivatives are not fractions?, and Help understanding expression for the derivative of a function, and Differentials as linear maps, and Is the identification of dx as a 1-form, and $\partial/\partial x$ as a vector, arbitrary? for remarks about how the general definition specializes in various situations, and about how to get comfortable with differentiation as the process of considering linear maps.

See also the textbook by Loomis and Sternberg, which I reference in my second link. This is an absolute gem (if you want to do the “introductory” things rigorously and with sufficient generality). Chapters 1 and 2 provide the necessary linear algebra background, chapter 3 is all about differential calculus, chapter 9 is about smooth manifolds, and chapter 13 is about classical mechanics (with more emphasis on Hamiltonian mechanics) treated in the geometric formalism (it’s just an overview of course).
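To make the “local linear approximation” idea concrete, here is a minimal numerical sketch (not taken from any of the linked answers). The function `f`, the base point and the direction are arbitrary choices made up for illustration. It checks that the remainder $|f(a+h)-f(a)-Df_a(h)|/\|h\|$ shrinks as $h\to 0$, which is exactly what Frechet differentiability at $a$ demands.

```python
import math

def f(x, y):
    """Example map f: R^2 -> R (a hypothetical choice for illustration)."""
    return x * x * y + math.sin(y)

def Df(a):
    """Candidate Frechet derivative of f at a: the linear map h -> Df_a(h)."""
    x, y = a
    fx = 2 * x * y               # partial derivative with respect to x
    fy = x * x + math.cos(y)     # partial derivative with respect to y
    return lambda h: fx * h[0] + fy * h[1]

def remainder_ratio(a, h):
    """|f(a+h) - f(a) - Df_a(h)| / |h|, which should tend to 0 as h -> 0."""
    linear_part = Df(a)
    err = f(a[0] + h[0], a[1] + h[1]) - f(*a) - linear_part(h)
    return abs(err) / math.hypot(*h)

a = (1.0, 2.0)
direction = (0.6, -0.8)          # unit vector; h = t * direction
ratios = [remainder_ratio(a, (t * direction[0], t * direction[1]))
          for t in (1e-1, 1e-2, 1e-3)]
print(ratios)                    # each entry is roughly 10x smaller than the last
```

The ratios fall off linearly in $\|h\|$ because the leading error term is quadratic. Note that the derivative here is the linear map `Df(a)`, not a number or a list of partials; the partial derivatives are merely its components in one basis.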

2. Notational Remarks.

You wrote $\Omega^1(Q)$ throughout, but if anything, it should be $\Omega^1(TQ)$, because everything here is going to be a function/differential form on the tangent bundle $TQ$, NOT the base manifold $Q$ (see the discussion in section 3 below). See Clarification on notation regarding fields, forms, and exterior algebra for a more extensive discussion of notation.
In equation (1), you write $\mathbf{d}_2$ to mean the differential with respect to the second argument, but strictly speaking, this makes no sense. The Lagrangian is a smooth function $L:TQ\to\Bbb{R}$, and the tangent bundle is NOT a cartesian product of two spaces, so there’s no such thing as “differential with respect to second argument”. What the tangent bundle does have, is the structure of a vector bundle, which loosely means a bunch of vector spaces parametrized by a base manifold. What you’re really doing here is trying to calculate the Fiber derivative of the Lagrangian function, $\mathbf{F}L$, which literally means you do the Frechet derivative of $L$ fiber-by-fiber. See Why is the Legendre transform (of vector bundles) a smooth morphism $\mathbf FL:E\to E^*$? for the definition of the fiber-derivative (you first need to know Frechet-derivatives), and some formulae in local coordinates. The reason you may sometimes hear this being referred to as “differential with respect to second argument” is that if you decide to work in a coordinate system adapted to the tangent bundle (or whatever vector bundle you’re working on more generally, and using bundle coordinate charts), then the local representation of the Fiber derivative does indeed look like “differentiation with respect to the vector-arguments (i.e second set of arguments, also known as the fiber-arguments)”. But this is only because bundle-coordinates make the tangent-bundle/vector bundle locally trivial of the form $U\times \Bbb{R}^k$, where $U\subset\Bbb{R}^n$ is open and $\Bbb{R}^k$ is the typical-fiber in coordinates. If you use some wack-a-doodle coordinate chart for the vector bundle (recall, a vector bundle is still a smooth manifold, so it possesses all sorts of coordinate charts, even those which have nothing to do with the vector-bundle structure), you no longer have such a nice product structure, so all meaning is lost to “the second set of vector coordinates”.

I have already given you the link to the general definition of the Fiber-derivative (see the Legendre transform link above), but I’ll repeat once again for your special case:

Given a smooth function $L:TQ\to\Bbb{R}$, its fiber derivative is the map $\mathbf{F}L:TQ\to T^*Q$, defined such that for each tangent vector $\xi_a\in T_aQ$, we define $(\mathbf{F}L)_{\xi_a}\in T_a^*Q$ to be the Frechet derivative of the restriction of $L$ to $T_aQ$, i.e $D(L|_{T_aQ})_{\xi_a}:T_aQ\to\Bbb{R}$, so \begin{align} (\mathbf{F}L)_{\xi_a}&:= D(L|_{T_aQ})_{\xi_a}\in T_a^*Q. \end{align}
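As a rough sketch of what this definition computes, one can work in a bundle chart and differentiate $L$ numerically along a single fiber while holding the base point fixed. The Lagrangian below (a particle of mass `m` in polar coordinates with a quadratic potential) and the helper name `fiber_derivative` are assumptions chosen purely for illustration.

```python
def L(q, v, m=2.0):
    """Hypothetical Lagrangian in polar bundle coordinates.

    q = (r, theta) are base coordinates, v = (v_r, v_theta) fiber coordinates.
    """
    r, _theta = q
    kinetic = 0.5 * m * (v[0] ** 2 + (r * v[1]) ** 2)
    potential = 0.5 * r ** 2
    return kinetic - potential

def fiber_derivative(L, q, v, eps=1e-6):
    """Return the covector (FL)_xi as a linear map w -> d/dt L(q, v + t w) at t = 0.

    The base point q never moves: only the fiber T_aQ is explored.
    """
    def covector(w):
        plus = [vi + eps * wi for vi, wi in zip(v, w)]
        minus = [vi - eps * wi for vi, wi in zip(v, w)]
        return (L(q, plus) - L(q, minus)) / (2 * eps)
    return covector

q, v = (3.0, 0.4), (1.5, -0.2)
p = fiber_derivative(L, q, v)

# Components p_i: evaluate the covector on the coordinate basis vectors of T_aQ.
p_r, p_theta = p((1.0, 0.0)), p((0.0, 1.0))
print(p_r, p_theta)
```

Up to finite-difference error, the two printed components should agree with $\partial L/\partial v^r = m v^r = 3.0$ and $\partial L/\partial v^\theta = m r^2 v^\theta = -3.6$. The point of returning a function `covector` rather than a pair of numbers is that $(\mathbf{F}L)_{\xi_a}$ is a linear map on $T_aQ$; the numbers only appear once a basis is chosen.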

Finally, when talking about coordinate charts, one always abuses notation for the base coordinates; see Tautological 1-form on the cotangent bundle is intrinsic using transformation properties for some of these remarks, and also because I’ll reference the tautological 1-form later to relate $\mathbf{F}L$ and $\mu_L$ (see the first link for $\mu_L$’s definition) in a coordinate-free way. I’ll write it out once in full detail, and without abuse of notation, so you know where the subtleties lie.

Fix a smooth $n$-dimensional manifold $Q$. Let $\pi_{TQ}:TQ\to Q$ and $\pi_{T^*Q}:T^*Q\to Q$ be the tangent and cotangent bundles. We then have the following coordinate charts:

Fix a coordinate chart $(U,x=(x^1,\dots, x^n))$ on $Q$.

This gives rise to a coordinate chart $(TU, (q,v)=(q^1,\dots, q^n,v^1,\dots, v^n))$ on the tangent bundle $TQ$ as follows. We define $q^i:= (\pi_{TQ})^*x^i:= x^i\circ \pi_{TQ}$. Often people neglect mentioning the composition with the projection $\pi_{TQ}$, because notationally it is cumbersome, and people know what they’re talking about. Next, we define $v^i:TU\to\Bbb{R}$ such that for each tangent vector $\xi\in TU$, the numbers $v^i(\xi)$ are the unique coefficients such that $\xi=\sum_{i=1}^nv^i(\xi)\frac{\partial}{\partial x^i}\big\rvert_{\pi_{TQ}(\xi)}$. This just means $v^i(\xi)= (dx^i)_{\pi_{TQ}(\xi)}(\xi)$. Hence, if $\xi$ belongs to the tangent space $T_aQ$, where $a\in U$, then its coordinates are \begin{align} \left(q^1(\xi),\dots, q^n(\xi),v^1(\xi),\dots, v^n(\xi)\right)&:=\left(x^1(a),\dots, x^n(a),(dx^1)_a(\xi),\dots, (dx^n)_a(\xi)\right). \end{align}

This also gives rise to a coordinate chart $(T^*U, (z,p)=(z^1,\dots, z^n,p_1,\dots, p_n))$ on the cotangent bundle $T^*Q$ as follows. We define $z^i:= (\pi_{T^*Q})^*x^i:= x^i\circ \pi_{T^*Q}$, and $p_i:T^*U\to\Bbb{R}$ such that for each covector $\lambda\in T^*U$, we set $p_i(\lambda):=\lambda\left(\frac{\partial}{\partial x^i}\big\rvert_{\pi_{T^*Q}(\lambda)}\right)$. Hence, if $\lambda$ belongs to the cotangent space $T_a^*Q$, where $a\in U$, then its coordinates are \begin{align} \left(z^1(\lambda),\dots, z^n(\lambda), p_1(\lambda), \dots, p_n(\lambda)\right)&:=\left(x^1(a),\dots, x^n(a), \lambda\left(\frac{\partial}{\partial x^1}\bigg\rvert_{a}\right),\dots, \lambda\left(\frac{\partial}{\partial x^n}\bigg\rvert_{a}\right)\right). \end{align}
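The tangent-bundle chart above can be sketched numerically. In the snippet below, the polar chart on the punctured plane, the base point `a` and the tangent vector `xi` are all hypothetical choices made for illustration. The function `bundle_coords` returns $q^i(\xi)=x^i(a)$ and $v^i(\xi)=(dx^i)_a(\xi)$, with the latter approximated by a central-difference directional derivative of the chart functions.

```python
import math

def polar_chart(a):
    """Chart x = (r, theta) on the plane minus the origin (hypothetical example)."""
    return (math.hypot(a[0], a[1]), math.atan2(a[1], a[0]))

def bundle_coords(chart, a, xi, eps=1e-6):
    """Coordinates (q, v) of the tangent vector xi based at the point a.

    q^i = x^i(a) only depends on the base point, while
    v^i = (dx^i)_a(xi) is the derivative of x^i at a in the direction xi.
    """
    q = chart(a)
    fwd = chart((a[0] + eps * xi[0], a[1] + eps * xi[1]))
    bwd = chart((a[0] - eps * xi[0], a[1] - eps * xi[1]))
    v = tuple((f - b) / (2 * eps) for f, b in zip(fwd, bwd))
    return q, v

a = (3.0, 4.0)       # base point, given by its Cartesian components
xi = (1.0, 2.0)      # tangent vector at a, given by its Cartesian components
q, v = bundle_coords(polar_chart, a, xi)
print(q, v)
```

For these inputs, $q$ should be $(5, \operatorname{atan2}(4,3))$ and $v$ should be close to $(2.2, 0.08)$, which are the values $(x\xi_x+y\xi_y)/r$ and $(x\xi_y-y\xi_x)/r^2$. Notice that `q` is computed from `a` alone: that is the composition with $\pi_{TQ}$ which usually gets suppressed.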

People often abuse notation and suppress the composition with the respective bundle projections. So, often things are written as

Coordinates $(U,q=(q^1,\dots, q^n))$ on $Q$,

Coordinates $(TU,(q,v)=(q^1,\dots, q^n,v^1,\dots, v^n))$ on $TQ$ (or alternatively, they use the notation $(TU, (q,\dot{q})=(q^1,\dots, q^n,\dot{q}^1,\dots, \dot{q}^n))$… which is also fine as long as you keep in mind that $\dot{q}^i$ here is merely a coordinate function),

Coordinates $(T^*U,(q,p)=(q^1,\dots, q^n,p_1,\dots, p_n))$ on $T^*Q$.

The index placement here is purely traditional! Don’t think of these as covectors/vectors etc. These are merely coordinate functions of certain smooth manifolds.

For the sake of this answer, I shall use the more accurate notation, $(U,x), (TU, (q,v)), (T^*U, (z,p))$, distinguishing the base coordinates as well.

3. Explaining/Summarizing the Answer to Your Question.

Ok, if you’ve read up to here, you should know what $\mathbf{F}L$ and $\mu_L$ are (note that I am intentionally avoiding the use of the term “momentum” because, as mentioned in the beginning, it is extremely overused). Let us now write these objects down in coordinates (you should actually be able to put things together using the various links, but I’ll summarize here).

The fiber-derivative is a map $\mathbf{F}L:TQ\to T^*Q$, and we have that for each $\xi_a\in T_aQ$, the object $(\mathbf{F}L)_{\xi_a}\in T_a^*Q$ is a covector, which in terms of coordinate charts can be written as $(\mathbf{F}L)_{\xi_a}=\frac{\partial L}{\partial v^i}\big\rvert_{\xi_a}\cdot (dx^i)_{a}$. Or, as an equality of maps, \begin{align} (\mathbf{F}L)\bigg|_{TU}&=\frac{\partial L}{\partial v^i}\cdot ((dx^i)\circ \pi_{TQ}). \end{align} Note the order of brackets, and composition. Note also that since $\mathbf{F}L$ takes values in $T^*Q$, we need to use the basis covectors $dx^i$, where $x^i$ are coordinate functions defined on the base manifold (as opposed to $dq^i$).
As mentioned in the first link, we have that $\mu_L=\frac{\partial L}{\partial v^i}\,dq^i$, where now $q^i:TU\to\Bbb{R}$ are the coordinate functions on the tangent bundle (we don’t use $dx^i$ here). Now, $\mu_L$ is correctly a differential 1-form on the tangent bundle $TQ$, i.e $\mu_L\in \Gamma(T^*(TQ))=\Omega^1(TQ)$. As you can see from here, $\mathbf{F}L\neq \mu_L$, because they’re not even the same type of object. Note that $dq^i\neq (dx^i)\circ \pi_{TQ}$: the former is the pullback $dq^i=(\pi_{TQ})^*(dx^i)$, a 1-form on $TU$, whereas the latter is a map $TU\to T^*Q$.
What then is the relationship between the fiber derivative $\mathbf{F}L$ and the “momentum 1-form” $\mu_L$, in an abstract sense? Well, if we let $\theta$ denote the tautological 1-form on the cotangent bundle, then $\mu_L= (\mathbf{F}L)^*\theta$. You can verify this abstractly, or note that $\theta$ (which already has a well-known coordinate-free definition) can be written in terms of cotangent-bundle coordinates as $\theta=p_i\,dz^i$. So, \begin{align} (\mathbf{F}L)^*(\theta)&=(\mathbf{F}L)^*(p_i\,dz^i)\\ &=((\mathbf{F}L)^*p_i)\cdot d((\mathbf{F}L)^*z^i)\\ &=(p_i\circ \mathbf{F}L)\cdot d(z^i\circ \mathbf{F}L)\\ &=\frac{\partial L}{\partial v^i}\,dq^i\\ &=\mu_L. \end{align} The fourth equality uses $p_i\circ \mathbf{F}L=\frac{\partial L}{\partial v^i}$ (the coordinate expression of $\mathbf{F}L$ given above) and $z^i\circ \mathbf{F}L=q^i$ (because $\mathbf{F}L$ maps $T_aQ$ into $T_a^*Q$, so $\pi_{T^*Q}\circ \mathbf{F}L=\pi_{TQ}$). If you want, you can take this equality as a definition for $\mu_L$, or you can start with the more primitive definition (as I did in the link) and prove this equality as a theorem.
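One way to convince yourself that $(\mathbf{F}L)_{\xi_a}$ really is a covector on the base manifold (so that the basis is $(dx^i)_a$, not anything built from $dv^i$) is to check that the pairing $\frac{\partial L}{\partial v^i}w^i$ with a tangent vector $w\in T_aQ$ does not depend on the chart. The sketch below does this for a free particle in the plane, comparing the Cartesian and polar charts. The mass, the point and the vectors are arbitrary illustrative values, and `to_polar_components` is a hypothetical helper.

```python
import math

m = 2.0  # mass, arbitrary value for illustration

def to_polar_components(a, u):
    """Polar components (u_r, u_theta) of a tangent vector u at the point a.

    Both a and u are given by their Cartesian components.
    """
    x, y = a
    r2 = x * x + y * y
    return ((x * u[0] + y * u[1]) / math.sqrt(r2), (x * u[1] - y * u[0]) / r2)

a, xi, w = (3.0, 4.0), (1.0, 2.0), (0.5, -1.0)

# Cartesian chart: L = m/2 (vx^2 + vy^2), so dL/dv^i = m v^i.
p_cart = (m * xi[0], m * xi[1])
pairing_cart = p_cart[0] * w[0] + p_cart[1] * w[1]

# Polar chart: L = m/2 (vr^2 + r^2 vtheta^2), so dL/dv^i = (m vr, m r^2 vtheta).
r = math.hypot(*a)
vr, vtheta = to_polar_components(a, xi)
wr, wtheta = to_polar_components(a, w)
p_polar = (m * vr, m * r * r * vtheta)
pairing_polar = p_polar[0] * wr + p_polar[1] * wtheta

print(p_cart, p_polar)              # the components differ between charts
print(pairing_cart, pairing_polar)  # the pairing with w agrees
```

The components `p_cart` and `p_polar` are different pairs of numbers, yet both pairings come out to $-3$ (up to rounding). That is the statement that the numbers $\frac{\partial L}{\partial v^i}$ are the components of one and the same element of $T_a^*Q$ with respect to the respective coordinate covectors $(dx^i)_a$.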

4. Summary

Frechet derivative: given Banach spaces $V,W$ (think finite-dimensional if you wish, but definition-wise, nothing is simplified if you do so), an open set $\Omega\subset V$, it is a notion of differentiation for maps $f:\Omega\to W$. The key idea is local linear approximation around a given point in $\Omega$.
Fiber derivative: given smooth vector bundles $X,Y$ over a fixed base manifold $Q$ (for Lagrangians, $X=TQ$ and $Y=Q\times\Bbb{R}$ is the trivial line bundle, so that $\text{Hom}(X,Y)=T^*Q$) it is a notion of differentiation along the directions of the fibers, i.e in the vectorial directions of the manifolds, for fiber-preserving maps $f:X\to Y$. This gives a smooth fiber-preserving map $\mathbf{F}f:X\to\text{Hom}(X,Y)$, obtained by taking Frechet derivatives fiber-by-fiber.
We have coordinates $(U,x)$ on $Q$, $(TU,(q,v))$ on $TQ$, and $(T^*U,(z,p))$ on $T^*Q$.
Given $L:TQ\to\Bbb{R}$, we have $\mathbf{F}L=\frac{\partial L}{\partial v^i}\,((dx^i)\circ \pi_{TQ})$, and $\mu_L= (\mathbf{F}L)^*(\theta)= (\mathbf{F}L)^*(p_i\,dz^i)= \frac{\partial L}{\partial v^i}\,dq^i$.

Hopefully this clarifies what the notation stands for, how the various notions are related, when to use which basis covectors, and what is defined where.
