Lagrangian Formalism – Equivalence Between Hamiltonian and Lagrangian Mechanics


I'm reading a proof about Lagrangian => Hamiltonian and one part of it just doesn't make sense to me.

The Lagrangian is written $L(q, \dot q, t)$, and is convex in $\dot q$, and then the Hamiltonian is defined via the Legendre transform
$$H(p,q,t) = \max_{\dot q} [p \cdot \dot q - L(q, \dot q, t)]$$

Under the right conditions there exists a function $\dot Q (p,q,t)$ such that
$$H(p,q,t) = p \cdot \dot Q(p,q,t) - L(q, \dot Q(p,q,t), t)$$
i.e. when some $\dot Q(p,q,t)$ satisfies $$p = \frac{\partial L}{\partial \dot q}\rvert_{(q, \dot Q(p,q,t), t)} = \frac{\partial L}{\partial \dot q}(q, \dot Q(p,q,t), t).$$
(Finding this function is usually called "inverting $p$".)

By taking partials in the $p$ variable and using the relationship, we can obtain the relationship
$$\dot Q = \frac{\partial H}{\partial p}.$$

Because of the notation I chose, I get the strong urge to say $$\dot q = \frac{\partial H}{\partial p} ,$$ and in fact this is what the textbook does. But have we proved this?

In other words, how can we deduce that
$$q'(t) = \frac{\partial H}{\partial p}(p(t), q(t), t)$$
for any differentiable vector-valued function $q$? (Or maybe there are more conditions we need on $q$?) Here
$$p(t) = \frac{\partial L}{\partial \dot q}(q(t), q'(t), t)$$
according to Lagrange's equations.

Best Answer

Ok, let us start from scratch. A function $f: \mathbb R^n \to \mathbb R$ with $f \in C^2(\mathbb R^n)$ is said to be convex if its Hessian matrix (i.e. the one with coefficients $\partial^2 f/\partial x_i \partial x_j$) is everywhere (strictly) positive definite.

Let $\Omega \subset \mathbb R \times \mathbb R^n$ be an open set and focus on a jointly $C^2$ Lagrangian function $\Omega \times \mathbb R^n \ni (t,q,\dot{q}) \mapsto L(t, q, \dot{q}) \in \mathbb R$.

For fixed $(t,q) \in \Omega$, $L$ is assumed to be convex as a function of $\dot{q}$. In other words $\mathbb R^n \ni \dot{q} \mapsto L(t, q, \dot{q}) \in \mathbb R$ is supposed to be convex.

Referring to systems made of either material points or solid bodies, convexity arises from the structure of the kinetic energy part of Lagrangians, which are always of the form $T(t, q, \dot{q}) - V(t, q)$, even considering generalized potentials $V(t,q, \dot{q})$ with linear dependence on $\dot{q}$, as is the case for inertial or electromagnetic forces, also in the presence of holonomic ideal constraints.

The associated Hamiltonian function is defined as the Legendre transformation of $L$ with respect to the variables $\dot{q}$. In other words:

$$H(t,q,p) := \max_{\dot{q} \in \mathbb R^n}\left[p\cdot \dot{q} - L(t, q, \dot{q})\right]\qquad (1)$$

Within our hypotheses on $L$, it follows from the general theory of the Legendre transformation that, for fixed $(t,q) \in \Omega$, a given $p \in \mathbb R^n$ is associated with exactly one $\dot{q}(p)_{t,q} \in \mathbb R^n$ where the maximum of the RHS in (1) is attained (for $n=1$ the proof is quite evident; it is not for $n>1$).

Since $\dot{q}(p)_{t,q}$ trivially belongs to the interior of the domain of the function $\mathbb R^n \ni \dot{q} \mapsto p\cdot \dot{q} - L(t, q, \dot{q})$, the gradient of that function must vanish there:

$$\left.\nabla_{\dot{q}} \right|_{\dot{q}= \dot{q}(p)_{t,q}} \left( p\cdot \dot{q} - L(t, q, \dot{q})\right) =0\:.$$ In other words (always for fixed $t,q$): $$p = \left.\nabla_{\dot{q}} \right|_{\dot{q}(p)_{t,q}} L(t, q, \dot{q})\:, \quad \forall p \in \mathbb R^n\qquad (2)$$
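As a small numerical illustration of (1) and (2), here is a minimal sketch for $n=1$. The Lagrangian $L = \dot{q}^4/4 + \dot{q}^2/2 - q^2/2$, the search bounds and all function names are assumptions chosen only for this example; they do not come from the argument above. The maximum in (1) is located by a ternary search, which is legitimate here because $\dot{q} \mapsto p\dot{q} - L$ is strictly concave.

```python
# Illustrative 1D Lagrangian (an assumption for this sketch):
# L(t, q, v) = v**4/4 + v**2/2 - q**2/2, strictly convex in v.
def L(t, q, v):
    return 0.25 * v**4 + 0.5 * v**2 - 0.5 * q**2

def dL_dv(t, q, v):
    # Derivative of L with respect to the velocity variable v.
    return v**3 + v

def argmax_v(t, q, p, lo=-50.0, hi=50.0, iters=200):
    """Maximise v -> p*v - L(t, q, v) by ternary search.

    The bounds lo/hi are assumed to bracket the maximiser."""
    def f(v):
        return p * v - L(t, q, v)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

t, q, p = 0.0, 0.7, 2.0
v_star = argmax_v(t, q, p)              # point where the max in (1) is attained
H_value = p * v_star - L(t, q, v_star)  # value of H(t, q, p)
residual = p - dL_dv(t, q, v_star)      # should vanish by (2)
print(v_star, H_value, residual)
```

For $p = 2$ the stationarity condition reads $\dot{q}^3 + \dot{q} = 2$, whose only real solution is $\dot{q} = 1$, so `v_star` should come out close to 1 and `residual` close to 0, up to the limited accuracy of a derivative-free search.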

As a consequence (always for fixed $(t,q)\in \Omega$), the map $\mathbb R^n \ni p \mapsto \dot{q}(p)_{t,q} \in \mathbb R^n$ is injective, because it admits a left inverse given by the map $\mathbb R^n \ni \dot{q} \mapsto \nabla_{\dot{q}} L(t, q, \dot{q})$ which, in turn, is surjective. The latter map is also injective, as one easily proves using the convexity condition and the fact that the domain $\mathbb R^n$ is trivially convex too. The fact that the $\dot{q}$-Hessian matrix of $L$ is non-singular also implies that the map (2) is $C^1$ together with its inverse.

Summing up, the map (2) is a $C^1$ diffeomorphism from $\mathbb R^n$ onto $\mathbb R^n$ and, from (1), we have the popular identity describing the interplay of the Hamiltonian and Lagrangian functions as:

$$H(t,q,p) = p\cdot \dot{q} - L(t, q, \dot{q})\qquad (3)$$

which holds true when $p \in \mathbb R^n$ and $\dot{q} \in \mathbb R^n$ are related by means of the $C^1$ diffeomorphism from $\mathbb R^n$ onto $\mathbb R^n$ (for fixed $(t,q)\in \Omega$): $$p = \nabla_{\dot{q}} L(t, q, \dot{q})\:, \quad \forall \dot{q} \in \mathbb R^n\qquad (4)\:.$$

By construction, $H= H(t,q,p)$ is a jointly $C^1$ function defined on $\Gamma := \Omega \times \mathbb R^n$. I stress that $L$ is defined on the same domain $\Gamma$ in $\mathbb R^{2n+1}$. The open set $\Gamma$ is equipped with the diffeomorphism: $$\psi: \Gamma \ni (t,q, \dot{q}) \mapsto (t,q, p) \in \Gamma \qquad (4)'$$ where (4) holds.
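The map $\psi$ and its inverse can be made concrete with a short sketch. Everything specific here is an assumption for illustration only: the one-dimensional Lagrangian $L = \tfrac12 (1+q^2)\dot{q}^2 + \tfrac14 \dot{q}^4 - \tfrac12 q^2$, the use of Newton's method to invert (4), and the function names. The point is that $\psi^{-1}$ exists and is computable because $\partial^2 L/\partial \dot{q}^2 = 1 + q^2 + 3\dot{q}^2 > 0$.

```python
# Illustrative Lagrangian (an assumption for this sketch):
# L(t, q, v) = 0.5*(1 + q**2)*v**2 + 0.25*v**4 - 0.5*q**2
def L(t, q, v):
    return 0.5 * (1.0 + q * q) * v * v + 0.25 * v**4 - 0.5 * q * q

def legendre_p(t, q, v):
    # Equation (4): p = dL/dv at fixed (t, q).
    return (1.0 + q * q) * v + v**3

def legendre_v(t, q, p, tol=1e-13, max_iter=100):
    # Invert (4) with Newton's method; the derivative of v -> dL/dv
    # is 1 + q**2 + 3*v**2 >= 1, so the iteration is well defined.
    v = p / (1.0 + q * q)
    for _ in range(max_iter):
        step = (legendre_p(t, q, v) - p) / (1.0 + q * q + 3.0 * v * v)
        v -= step
        if abs(step) < tol:
            break
    return v

def psi(point):
    t, q, v = point
    return (t, q, legendre_p(t, q, v))

def psi_inv(point):
    t, q, p = point
    return (t, q, legendre_v(t, q, p))

def H(t, q, p):
    # Identity (3), with v tied to p through (4).
    v = legendre_v(t, q, p)
    return p * v - L(t, q, v)

points = [(0.0, 0.0, -2.0), (1.0, 0.5, 0.3), (2.0, -1.5, 4.0)]
round_trip_error = max(abs(psi_inv(psi(pt))[2] - pt[2]) for pt in points)
print(round_trip_error)
```

The round trip $\psi^{-1}\circ\psi$ should reproduce each velocity up to rounding, which is the numerical counterpart of $\psi$ being a bijection at fixed $(t,q)$.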

Let us study the relationship between the various derivatives of $H$ and $L$.

I remark that I will not make use of Euler-Lagrange or Hamilton equations anywhere in the following.

Consider a $C^1$ curve $\gamma: (a,b) \ni t \mapsto (t, q(t), \dot{q}(t)) \in \Gamma$, where $t$ has no particular meaning and $\dot{q}(t)\neq \frac{dq}{dt}$ generally. The diffeomorphism $\psi$ transforms that curve into a similar $C^1$ curve $t \mapsto \psi(\gamma(t)) = \gamma'(t)$, which I will also indicate by $\gamma': (a,b) \ni t \mapsto (t, q(t), p(t)) \in \Gamma$.

We can now evaluate $H$ over $\gamma'$ and $L$ over $\gamma$ and compute the total temporal derivative taking (3) and (4) into account, i.e. we compute:

$$\frac{d}{dt} H(t, q(t),p(t)) = \frac{d}{dt}\left(p(t) \cdot \dot{q}(t) - L(t,q(t),\dot{q}(t)) \right)\:.$$

The computation yields almost immediately the following identity, where both sides are evaluated on the respective curve:

$$\frac{\partial H}{\partial t} + \frac{dq}{dt}\cdot \nabla_q H + \frac{dp}{dt}\cdot \nabla_p H = \frac{dp}{dt}\dot{q} + p \frac{d\dot{q}}{dt} -\frac{\partial L}{\partial t} - \frac{dq}{dt}\cdot \nabla_q L - \frac{d\dot{q}}{dt}\cdot \nabla_{\dot{q}} L \:.$$ In the RHS, the second and the last term cancel each other in view of (4), so that: $$\frac{\partial H}{\partial t} + \frac{dq}{dt}\cdot \nabla_q H + \frac{dp}{dt}\cdot \nabla_p H = \frac{dp}{dt}\dot{q} -\frac{\partial L}{\partial t} - \frac{dq}{dt}\cdot \nabla_q L \:.$$ Rearranging the various terms into a more useful structure: $$\left(\frac{\partial H}{\partial t}|_{\gamma'(t)} + \frac{\partial L}{\partial t}|_{\gamma(t)}\right) + \frac{dq}{dt}\cdot \left( \nabla_q H|_{\gamma'(t)} + \nabla_q L|_{\gamma(t)}\right) + \frac{dp}{dt}\cdot \left(\nabla_p H|_{\gamma'(t)} - \dot{q}|_{\gamma(t)}\right) =0\:.\qquad (5)$$

Now observe that, since $\gamma$ is generic, $\gamma(t)$ and $\gamma'(t)= \psi(\gamma(t))$ are generic points in $\Gamma$ (though connected by the transformation (4)). Moreover, given the point $(t,q, \dot{q}) = \gamma(t) \in \Gamma$, we are free to choose the derivatives $\frac{dq}{dt}$ and (using the diffeomorphism) $\frac{dp}{dt}$ as we want, by fixing $\gamma$ suitably. If we set all these derivatives to zero, (5) proves that, if $(t,q, \dot{q})$ and $(t,q,p)$ are related by means of (4):

$$\left(\frac{\partial H}{\partial t}|_{(t,q,p)} + \frac{\partial L}{\partial t}|_{(t,q, \dot{q})}\right) =0\:.$$

This result does not depend on derivatives $dq/dt$ and $dp/dt$ since they do not appear as arguments of the involved functions. So this result holds everywhere in $\Gamma$ because $(t,q, \dot{q})$ is a generic point therein. We conclude that (5) can be re-written as:

$$\frac{dq}{dt}\cdot \left( \nabla_q H|_{\gamma'(t)} + \nabla_q L|_{\gamma(t)}\right) + \frac{dp}{dt}\cdot \left(\nabla_p H|_{\gamma'(t)} - \dot{q}|_{\gamma(t)}\right) =0\:.\qquad (5)'$$

where, again, we are considering a generic curve $\gamma$ as before. Choosing the curve so that all components of $\frac{dq}{dt}$ and $\frac{dp}{dt}$ vanish except for one of them, for instance $\frac{dq^1}{dt}$, we find:

$$\left(\frac{\partial H}{\partial q^1}|_{(t,q,p)} + \frac{\partial L}{\partial q^1}|_{(t,q, \dot{q})}\right) =0\:,$$

if $(t,q, \dot{q})$ and $(t,q,p)$ are related by means of (4), and so on.

Eventually we end up with the following identities, valid when $(t,q, \dot{q})$ and $(t,q,p)$ are related by means of (4)

$$\frac{\partial H}{\partial t}|_{(t,q,p)} =- \frac{\partial L}{\partial t}|_{(t,q, \dot{q})}\:, \quad \frac{\partial H}{\partial q^k}|_{(t,q,p)} =- \frac{\partial L}{\partial q^k}|_{(t,q, \dot{q})}\:, \quad \frac{\partial H}{\partial p_k}|_{(t,q,p)} = \dot{q}^k\:. \quad (6)$$ The last identity is the one you asked for. As you see, these identities rely upon the Legendre transformation only; they use neither the Euler-Lagrange equations nor Hamilton's equations.
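The three identities in (6) can be checked numerically on a toy model. The sketch below assumes, purely for illustration, the one-dimensional Lagrangian $L = \tfrac12 a(t,q)\dot{q}^2 + \tfrac14\dot{q}^4 - \tfrac12 q^2$ with $a(t,q) = 1+q^2+t^2$, so that the Legendre map genuinely depends on $t$ and $q$. The Newton inversion, the central finite differences and all names are choices made for this example.

```python
def a(t, q):
    # Coefficient of the quadratic kinetic term (an assumption for this sketch).
    return 1.0 + q * q + t * t

def L(t, q, v):
    return 0.5 * a(t, q) * v * v + 0.25 * v**4 - 0.5 * q * q

def dL_dv(t, q, v):
    return a(t, q) * v + v**3

def dL_dq(t, q, v):
    return q * v * v - q

def dL_dt(t, q, v):
    return t * v * v

def v_of_p(t, q, p):
    # Invert p = dL/dv by Newton's method at fixed (t, q).
    v = p / a(t, q)
    for _ in range(100):
        step = (dL_dv(t, q, v) - p) / (a(t, q) + 3.0 * v * v)
        v -= step
        if abs(step) < 1e-13:
            break
    return v

def H(t, q, p):
    # Identity (3) with v tied to p by (4).
    v = v_of_p(t, q, p)
    return p * v - L(t, q, v)

def central(f, x, h=1e-5):
    # Central finite-difference approximation of f'(x).
    return (f(x + h) - f(x - h)) / (2.0 * h)

t0, q0, v0 = 0.3, 0.8, 1.2
p0 = dL_dv(t0, q0, v0)  # the point (t0, q0, p0) corresponds to (t0, q0, v0)

err_t = abs(central(lambda t: H(t, q0, p0), t0) + dL_dt(t0, q0, v0))
err_q = abs(central(lambda q: H(t0, q, p0), q0) + dL_dq(t0, q0, v0))
err_p = abs(central(lambda p: H(t0, q0, p), p0) - v0)
print(err_t, err_q, err_p)
```

Note that the partial derivatives of $H$ are taken at fixed $p$, while those of $L$ are taken at fixed $\dot{q}$; the three errors should be at the level of the finite-difference accuracy.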

However, exploiting these identities, it immediately follows that $\gamma$ satisfies the EL equations: $$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}^k} - \frac{\partial L}{\partial q^k}=0\:,\quad \frac{dq^k}{dt} = \dot{q}^k\quad k=1,\ldots, n$$ if and only if the transformed curve $\gamma'(t) := \psi(\gamma(t))$ satisfies Hamilton's equations: $$\frac{d p_k}{dt} = -\frac{\partial H}{\partial q^k} \:, \quad \frac{dq^k}{dt} = \frac{\partial H}{\partial p_k}\quad k=1,\ldots, n\:.$$

Indeed, starting from a curve $\gamma(t) = (t, q(t), \dot{q}(t))$, the first EL equation, exploiting (4) (which is part of the definition of $\psi$) and the second identity in (6), becomes the first Hamilton equation for the transformed curve $\psi (\gamma(t))$. Moreover, the second EL equation, making use of the last identity in (6), becomes the second Hamilton equation for the transformed curve. This procedure is trivially reversible, so that, starting from Hamilton equations, you can go back to EL equations.
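The equivalence can also be observed numerically. The sketch below assumes the illustrative Lagrangian $L = \tfrac12\dot{q}^2 + \tfrac14\dot{q}^4 - \tfrac12 q^2$ (not taken from the discussion above), integrates the EL equations in the variables $(q,\dot{q})$ and Hamilton's equations in the variables $(q,p)$ with a classical Runge-Kutta scheme, and compares the end points through (4). The derivatives of $H$ are deliberately computed by finite differences of $H$ itself, without using (6). Step size, time span and names are arbitrary choices.

```python
def L(q, v):
    # Illustrative Lagrangian (an assumption for this sketch).
    return 0.5 * v * v + 0.25 * v**4 - 0.5 * q * q

def dL_dv(v):
    return v + v**3

def v_of_p(p):
    # Invert p = v + v**3 by Newton's method.
    v = p
    for _ in range(100):
        step = (dL_dv(v) - p) / (1.0 + 3.0 * v * v)
        v -= step
        if abs(step) < 1e-13:
            break
    return v

def H(q, p):
    v = v_of_p(p)
    return p * v - L(q, v)

def el_rhs(state):
    # EL equations in first-order form:
    # dq/dt = v and d/dt(dL/dv) = dL/dq, i.e. (1 + 3 v^2) dv/dt = -q.
    q, v = state
    return (v, -q / (1.0 + 3.0 * v * v))

def ham_rhs(state, h=1e-5):
    # Hamilton's equations, with the partial derivatives of H
    # approximated by central finite differences.
    q, p = state
    dH_dq = (H(q + h, p) - H(q - h, p)) / (2.0 * h)
    dH_dp = (H(q, p + h) - H(q, p - h)) / (2.0 * h)
    return (dH_dp, -dH_dq)

def rk4(rhs, state, dt, steps):
    # Classical fourth-order Runge-Kutta integrator.
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(
            s + dt * (c1 + 2.0 * c2 + 2.0 * c3 + c4) / 6.0
            for s, c1, c2, c3, c4 in zip(state, k1, k2, k3, k4)
        )
    return state

q0, v0 = 1.0, 0.5
q_el, v_el = rk4(el_rhs, (q0, v0), 0.005, 400)
q_h, p_h = rk4(ham_rhs, (q0, dL_dv(v0)), 0.005, 400)

gap_q = abs(q_el - q_h)          # same configuration at the final time
gap_p = abs(dL_dv(v_el) - p_h)   # momenta agree after applying (4)
print(gap_q, gap_p)
```

Both gaps should be small compared with the size of the solution, limited only by the integrator and the finite differences; this is a consistency check on one example, not a substitute for the argument.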

The first identity in (6) is not used here. However, it implies that the system is or is not invariant under time translations simultaneously in the Lagrangian and Hamiltonian formulations (in both cases, that invariance property implies the existence of a constant of motion which is nothing but $H$ represented with the corresponding variables, either Lagrangian or Hamiltonian).

As a final comment notice that (3) and the last identity in (6) (which is nothing but the inverse function of (2) at fixed $(t,q)$) imply $$L(t, q, \dot{q}) = \nabla_p H(t,q,p) \cdot p - H(t,q,p)\:,$$ where (2) is assumed to connect Lagrangian and Hamiltonian variables.
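This inverse relation is easy to test on a case where the Legendre transform is known in closed form. As an assumption made only for this sketch, take $n=1$ and $L = \cosh\dot{q} - \tfrac12 q^2$, which is strictly convex in $\dot{q}$; then $p = \sinh\dot{q}$, $\dot{q} = \operatorname{arsinh} p$ and $H = p\,\operatorname{arsinh} p - \sqrt{1+p^2} + \tfrac12 q^2$. The code recovers $L$ from $H$ using a finite-difference derivative; the function names are hypothetical.

```python
import math

def L(t, q, v):
    # Illustrative Lagrangian (an assumption for this sketch).
    return math.cosh(v) - 0.5 * q * q

def H(t, q, p):
    # Closed-form Legendre transform of the L above, using v(p) = asinh(p)
    # and cosh(asinh(p)) = sqrt(1 + p**2).
    return p * math.asinh(p) - math.sqrt(1.0 + p * p) + 0.5 * q * q

def dH_dp(t, q, p, h=1e-6):
    # Central finite-difference approximation of the p-derivative of H.
    return (H(t, q, p + h) - H(t, q, p - h)) / (2.0 * h)

def L_from_H(t, q, p):
    # Inverse Legendre transform: L = p * dH/dp - H.
    return p * dH_dp(t, q, p) - H(t, q, p)

max_err = 0.0
for v in (-2.0, -0.3, 0.0, 0.9, 2.5):
    p = math.sinh(v)  # equation (4) for this L
    max_err = max(max_err, abs(L_from_H(0.0, 0.4, p) - L(0.0, 0.4, v)))
print(max_err)
```

The printed error should be tiny for every sampled velocity, reflecting that the Legendre transformation of a strictly convex function is an involution.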