Why are the costate equations solved backwards in time

control theory, dynamic programming, Euler's method, optimal control, optimization

I'm trying to find an optimal control for a simple nonlinear SIR model. I am trying to understand Pontryagin's minimum principle, but I don't understand why the costate equations must be solved backwards in time.

A related question is why, when using Euler's method to find the optimal control
$u^*$, we implement $u^*$ forward in time while the costate (adjoint) variables are implemented backwards in time, even though $u^*$ is expressed as a function of them.

For example, suppose $$u^*(t) = \left( p_1^*(t) - p_2^*(t) \right) x^*(t)$$ is the optimal control for an arbitrary system, where $p_1^*$ and $p_2^*$ are the optimal costate variables and $x^*(t)$ is an optimal state variable. The implementation would look like:

u(i+1) = (p1(N-i) - p2(N-i))*x_star(i+1);
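For context, the mirrored index (reading `p(N-i)` against `x(i+1)`) is just the bookkeeping of a forward-backward sweep: the costate array is filled from its last entry toward its first. This can be reproduced on a toy problem. The sketch below (in Python rather than MATLAB) uses a made-up linear-quadratic problem, step size, and damping factor purely for illustration; it is not the SIR model from the question.

```python
import numpy as np

# Toy linear-quadratic problem (an illustrative assumption, not the SIR model):
#   minimize  integral_0^T (x^2 + u^2) dt   subject to  x' = u,  x(0) = 1.
# Hamiltonian: H = x^2 + u^2 + p*u, so the minimum condition gives u* = -p/2,
# and the costate equation is p' = -dH/dx = -2x with p(T) = 0.

T, N = 1.0, 1000
h = T / N
x = np.zeros(N + 1); x[0] = 1.0
p = np.zeros(N + 1)                      # p[N] = 0: the transversality condition
u = np.zeros(N + 1)

for sweep in range(100):
    # State: forward Euler, using the current control guess.
    for i in range(N):
        x[i + 1] = x[i] + h * u[i]
    # Costate: Euler stepped *backwards* from the terminal condition p(T) = 0.
    # Note the mirrored index: p[i-1] is filled from p[i], which is why
    # MATLAB-style code ends up reading p(N-i) against x(i+1).
    for i in range(N, 0, -1):
        p[i - 1] = p[i] + h * 2 * x[i]   # p' = -2x, integrated in reverse
    # Control update from u* = -p/2, damped for stability.
    u_new = -p / 2
    if np.max(np.abs(u_new - u)) < 1e-9:
        u = u_new
        break
    u = 0.5 * u + 0.5 * u_new
```

For this toy problem the exact optimal state is $x(t) = \cosh(T - t)/\cosh T$, so $x(T) \approx 0.648$ for $T = 1$; the sweep reproduces this up to the Euler discretization error.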

Best Answer

Concerning the theoretical part, I wouldn't say that the costate equations must be solved backwards. To take a step back: in the usual setting for the Lagrange problem, $x(0)$ is fixed and $x(T)$ is free. In this setting, the costate satisfies the opposite pattern: $p(0)$ is free and $p(T) = 0$. In slightly different settings, where some components of the initial state $x(0)$ and of the final state $x(T)$ are constrained (known) and others are free (unknown), you get complementary (transversality) conditions on $p(0)$ and $p(T)$. This comes from the proof of the maximum principle and the "dual" nature of $p$.
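For concreteness, here is the standard set of conditions for the Lagrange problem in that usual setting ($x(0)$ fixed, $x(T)$ free), written to match the notation above:

```latex
\begin{aligned}
&\text{minimize } J(u) = \int_0^T L\bigl(x(t),u(t)\bigr)\,dt
\quad\text{subject to}\quad \dot x = f(x,u),\ \ x(0) = x_0 \text{ fixed},\ x(T) \text{ free},\\[2pt]
&H(x,p,u) = L(x,u) + p^\top f(x,u),\\[2pt]
&\dot p = -\partial_x H\bigl(x^*(t),p(t),u^*(t)\bigr), \qquad
p(T) = 0 \quad \text{(transversality: $x(T)$ free $\Rightarrow$ $p(T)$ fixed)}.
\end{aligned}
```

When a component $x_i(T)$ is instead constrained, the roles swap: $p_i(T)$ becomes free, which is the complementary condition mentioned above.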

At the heuristic level, you can understand it as follows. In Pontryagin's maximum principle, the costate variable is there precisely to help you transform a local-in-time optimization criterion into something that turns out to be optimal over the whole time horizon. In some sense, it thus embeds information about the future: $p(t)$ should contain information about what will happen on $[t,T]$.

Now for the numerical simulation,

  1. I would need a few more details on your scheme. As I understand it, you probably have an iterative method where you solve for $u$ (and the state) forward, then $p$ backwards, then $u$ forward again, and so on until you converge?
  2. But here again, it is not necessary to solve for $p$ backwards. In fact, a very classical method is the shooting method, where you guess an initial condition $p_0$ for $p(0)$, then solve for $u$, $x$ and $p$ forward, and see where you end up. If your choice of $p_0$ does not produce the desired condition at the final time (i.e. $p(T) = 0$, or $x(T) = 0$, or whatever you wish to reach), you change $p_0$ and start again. For example, if you want to ensure $p(T) = 0$, you can think of $p(T)$ as a function $F(p_0)$, so you are looking for $p_0$ such that $F(p_0) = 0$; you can then apply any method you wish to find such a zero of $F$, for example a Newton scheme.
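The shooting idea in point 2 can be sketched as follows, again on a made-up linear-quadratic toy problem (an assumption for illustration, not the asker's SIR model), using a secant iteration as the Newton-type root finder:

```python
# Single shooting on a toy problem:
#   minimize integral_0^T (x^2 + u^2) dt,  x' = u,  x(0) = 1,
# so u* = -p/2 and p' = -2x.  Guess p(0) = p0, integrate x and p *forward*,
# and read off F(p0) := p(T).  The terminal condition p(T) = 0 then becomes
# the root-finding problem F(p0) = 0.

T, N = 1.0, 1000
h = T / N

def shoot(p0):
    """Integrate state and costate forward from (x(0), p0); return p(T)."""
    x, p = 1.0, p0
    for _ in range(N):
        u = -p / 2                       # minimum condition at each step
        x, p = x + h * u, p + h * (-2 * x)
    return p

# Find the zero of F with the secant method (any Newton-type scheme works;
# here F happens to be affine in p0, so it converges essentially in one step).
a, b = 0.0, 5.0
fa, fb = shoot(a), shoot(b)
for _ in range(50):
    if fb == fa:
        break
    c = b - fb * (b - a) / (fb - fa)
    a, fa = b, fb
    b, fb = c, shoot(c)
    if abs(fb) < 1e-9:
        break

p0_star = b   # for this toy problem the exact answer is p(0) = 2*tanh(T)
```

Note that everything here is integrated forward in time; the "backwards" information is carried entirely by the unknown initial costate $p_0$, which the root finder adjusts until the terminal condition holds.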