[Math] Pontryagin’s maximum principle

calculus-of-variations · dynamic-programming · optimal-control

So I've been doing some optimal control theory lately. It's really interesting, but I've spent the whole day trying to wrap my head around Pontryagin's maximum principle. There are a lot of mathematical derivations out there, but I just can't seem to find an intuitive explanation of why it's a necessary condition, what the adjoint variable is, etc. Please explain like I'm five (or like I'm an economist).

Thanks in advance

Best Answer

I will try a very simplistic answer (I hope the more mathematically rigorous don't take offense). The way I have always seen it is that the essential information in the maximum (or minimum) principle is this: if you are at an extremum, then any deviation from it causes variations in the cost of second order, not first order.
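To make "deviations cause only second-order variations" concrete, here is a minimal numerical sketch on a toy problem I am choosing purely for illustration (it is not from the answer above): minimize $\frac12\int_0^1 u^2\,dt$ subject to $\dot x = u$, $x(0)=0$, $x(1)=1$. The principle gives a constant co-state, hence the constant control $u^*\equiv 1$; any zero-mean perturbation of $u^*$ keeps the endpoint feasible but raises the cost only quadratically in the perturbation size.

```python
import numpy as np

# Toy problem (illustrative, not from the answer above):
#   minimize J(u) = 1/2 * integral_0^1 u(t)^2 dt   s.t.  x' = u, x(0)=0, x(1)=1.
# Hamiltonian H = u^2/2 + p*u;  p' = -dH/dx = 0  =>  p is constant  =>  u = -p is constant.
# The only feasible constant control is u* = 1.

t = np.linspace(0.0, 1.0, 10_001)
dt = t[1] - t[0]

def cost(u):
    """Trapezoid-rule approximation of 1/2 * integral of u^2 over [0, 1]."""
    return 0.5 * np.sum((u[:-1] ** 2 + u[1:] ** 2) / 2) * dt

u_star = np.ones_like(t)            # the PMP-optimal control
delta = np.sin(2 * np.pi * t)       # zero-mean perturbation: keeps x(1) = 1

for eps in (0.2, 0.1, 0.05):
    increase = cost(u_star + eps * delta) - cost(u_star)
    # The first-order term vanishes; the increase is eps^2/2 * integral(delta^2) = eps^2/4.
    print(f"eps={eps}: cost increase = {increase:.6f} (predicted eps^2/4 = {eps**2 / 4:.6f})")
```

Halving the perturbation size cuts the cost increase by a factor of four — the quadratic signature of a stationary point.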

That is a very simplistic answer that glosses over a lot of mathematical detail, but that is the intuition, and hence its relation to the calculus of variations, Lagrangians, and Hamiltonians. Now, when constraints are involved, just as in the calculus of variations, we apply a version of the method of Lagrange multipliers (this can also be viewed as a Legendre transform of the Lagrangian), and $\lambda$ is then defined to be the co-state (the adjoint variable).
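In symbols (a standard sketch, and note that sign conventions differ between references), the constrained problem

$$ \min_u \int_{t_1}^{t_2} L(x,u,t)\,dt \quad \text{subject to} \quad \dot x = f(x,u,t) $$

is augmented exactly as in the method of Lagrange multipliers:

$$ J_{\text{aug}} = \int_{t_1}^{t_2} \Big[ L(x,u,t) + \lambda^{\top} \big( f(x,u,t) - \dot x \big) \Big] \, dt, $$

and demanding that the first-order variations in $x$, $u$, and $\lambda$ all vanish yields, respectively, the co-state equation, the optimality condition on the control, and the system dynamics.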

Mathematically (and crudely) speaking, if your state $x$ evolves on a manifold $M$, then your system $\dot x = f(x)$ is a vector field on $M$ (a section of the tangent bundle $TM$), while the optimal control problem, via the co-state, defines a flow on the cotangent bundle $T^*M$.

You can find an excellent Feynman Lecture (the chapter on the Principle of Least Action) to develop intuition on the topic.

EDIT: There are deeper connections here which, if you are like me, might not be obvious to you (I was taught four years of Mechanical Engineering without ever coming across Lagrangian or Hamiltonian mechanics). In classical mechanics we use Newton's laws, primarily $F = ma$, to solve most problems. But this tends to be a coordinate-dependent method, and Lagrange, in his pursuit of a coordinate-free treatment of mechanics, invented Lagrangian mechanics, in which, in generalized coordinates $q$, Newton's second law becomes the condition that the Euler-Lagrange equations be satisfied.

$$ \frac{d}{{dt}}\left( {\frac{{\partial L}}{{\partial {{\dot q}_j}}}} \right) = \frac{{\partial L}}{{\partial {q_j}}} $$

Note that the Euler-Lagrange equations can also be derived from a calculus of variations perspective. Here the Lagrangian $L=T-V$ is defined as the kinetic energy less the potential energy (in the optimal control setting it tends to be the running cost function).
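For example, a single particle in a potential has $L = \tfrac{1}{2} m \dot q^2 - V(q)$, and the Euler-Lagrange equation recovers Newton's second law:

$$ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q} \right) = m \ddot q, \qquad \frac{\partial L}{\partial q} = -\frac{dV}{dq} \quad \Longrightarrow \quad m \ddot q = -\frac{dV}{dq} = F. $$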

Hamilton realized that there was a variational principle at work here and defined the Hamiltonian $H$ to be the Legendre transform of the Lagrangian. This transform, by its very definition, introduces the co-state (adjoint) variable, and the classical laws of motion become the condition that Hamilton's equations be satisfied.

$$ \frac{{\partial H}}{{\partial {q_j}}} = - {\dot p_j},\quad \frac{{\partial H}}{{\partial {p_j}}} = {\dot q_j},\quad \frac{{\partial H}}{{\partial t}} = - \frac{{\partial L}}{{\partial t}} $$
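Written out, the Legendre transform defines the conjugate momentum (the mechanical counterpart of the co-state) and the Hamiltonian as

$$ p_j = \frac{\partial L}{\partial \dot q_j}, \qquad H(q,p,t) = \sum_j p_j \dot q_j - L(q, \dot q, t). $$

For the particle Lagrangian $L = \tfrac{1}{2} m \dot q^2 - V(q)$, this gives $p = m \dot q$ and $H = \frac{p^2}{2m} + V(q)$: the total energy.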

The variational principle is that the true dynamics make the action integral $S = \int_{t_1}^{t_2} L \, dt$ stationary (the "Principle of Least Action") and hence satisfy the above equations. So in the optimal control setting, when we form the Hamiltonian and set up the co-state equation, we are in essence following this same principle: the Lagrangian is now our cost function, and the co-state $\lambda$ acts as a Lagrange multiplier enforcing the condition that the state adhere to the system dynamics.
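In the optimal control version of this construction (the standard Pontryagin setup, with $L$ the running cost and $\dot x = f(x,u,t)$ the dynamics; sign conventions vary between the "maximum" and "minimum" formulations), one forms

$$ H(x,u,\lambda,t) = L(x,u,t) + \lambda^{\top} f(x,u,t), $$

and the necessary conditions for optimality are

$$ \dot x = \frac{\partial H}{\partial \lambda}, \qquad \dot \lambda = -\frac{\partial H}{\partial x}, \qquad u^*(t) = \arg\min_u H\big(x^*(t), u, \lambda(t), t\big), $$

the direct analogue of Hamilton's equations above, with the pointwise minimization over $u$ replacing the stationarity condition $\partial H / \partial u = 0$ when the control is constrained.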
