[Math] How to derive the transversality condition of a free-endpoint optimal control problem

calculus-of-variations, optimal-control, transversality

I've been studying the calculus of variations and optimal control theory, but so far only for fixed-endpoint problems.

Liberzon gives a somewhat involved derivation of the transversality conditions that I don't really understand.

Can someone give a quick explanation of how to derive the transversality condition for a standard optimal control problem of maximizing a functional with a free endpoint (and why that derivation works)?

$$J(u)=\int_a^{x_f}L(x,y(x),u(x))dx,$$ $$\dot y(x)=f(x,y(x),u(x))$$

ps. I will practice this myself on specific problems once I understand the general approach.

Best Answer

I learned optimal control theory in an economics class, and I hope my explanation helps.

Short version of the transversality conditions:

If the end time $t_1$ is fixed but the terminal value is free, then the co-state variable satisfies $\lambda (t_1)=0$. Otherwise the shadow price of $y(t_1)$ is nonzero, and we could improve the objective by moving $y(t_1)$ up or down, in the direction indicated by the sign of $\lambda (t_1)$.

If both are free, then the optimal end time $t_1^*$ must make the (optimized) Hamiltonian vanish there: $\mathcal H(t_1^*, y^*(t_1^*),u^*(t_1^*),\lambda (t_1^*))=0$. Otherwise the integrand is not zero at that time, and we could improve the objective by pushing the end time forward or back, in the direction indicated by the sign of $\mathcal H(t_1^*, y^*(t_1^*),u^*(t_1^*),\lambda (t_1^*))$.
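To make the first condition concrete, here is a numerical sanity check on a toy problem of my own (not from the question): maximize $\int_0^1 (y-u^2)\,dt$ with $\dot y=u$, $y(0)=0$, and $y(1)$ free. The maximum principle with transversality $\lambda(1)=0$ gives $\lambda(t)=1-t$ and the candidate control $u^*(t)=(1-t)/2$; the sketch simulates a few controls and confirms the candidate beats simple alternatives.

```python
import numpy as np

# Toy free-terminal-value problem (my own example, for illustration):
#   maximize  J(u) = ∫_0^1 (y - u^2) dt,  subject to  ẏ = u,  y(0) = 0,  y(1) free.
# Hamiltonian: H = y - u^2 + λu.  Costate: λ̇ = -H_y = -1 with λ(1) = 0,
# so λ(t) = 1 - t, and H_u = -2u + λ = 0 gives the candidate u*(t) = (1 - t)/2.

def J(u_func, n=20_000):
    """Approximate J(u) by forward-Euler simulation of ẏ = u on [0, 1]."""
    t = np.linspace(0.0, 1.0, n + 1)
    dt = t[1] - t[0]
    u = u_func(t)
    y = np.concatenate(([0.0], np.cumsum(u[:-1]) * dt))  # Euler integration of ẏ = u
    return float(np.sum((y[:-1] - u[:-1] ** 2) * dt))    # Riemann sum of the integrand

u_star = lambda t: (1.0 - t) / 2.0        # maximum-principle candidate
J_star = J(u_star)                        # analytic value is 1/12 ≈ 0.0833

# The candidate should beat simple competitors (constant controls here):
competitors = [J(lambda t, c=c: np.full_like(t, c)) for c in (0.0, 0.25, 0.5, 1.0)]
assert all(J_star > Jc for Jc in competitors)

# Transversality: λ(t) = 1 - t indeed vanishes at the free end time t = 1.
assert abs(1.0 - 1.0) < 1e-12
```

The best constant control ($c=1/4$, giving $J=1/16$) still loses to the maximum-principle candidate ($J=1/12$), as expected.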


Long version:

First, let's change notation: we have an optimization problem over time $t$ in which we choose a control variable $u(t)$ to influence the state $y(t)$:

$$ \max \int ^{t_{1}} _{t_{0}} L(t,y(t),u(t))\,dt $$ subject to $$ \dot y(t)= f(t, y(t),u(t)) $$ with the initial condition $y(t_0)=y_0$ given. There are regularity conditions that $L$, $f$, and the control $u$ must satisfy for the maximum principle to apply, but I simply omit them here.

To understand the transversality condition for the variable-endpoint problem, it's better to start from the fixed-endpoint one, with terminal constraint $y(t_1)= y_1$.

Then write the Hamiltonian as $$ \mathcal H(t,y,u,\lambda)=L(t,y,u)+\lambda f(t,y,u) $$

and the celebrated maximum principle states that

if $(y^*(t), u^*(t))$ is an optimal pair, then there exists an associated function $\lambda (t)$ such that, for all $t\in [t_0, t_1]$:

1) for all admissible $u$: $$ \mathcal H(t,y^*(t),u,\lambda(t))\leq \mathcal H(t,y^*(t),u^*(t),\lambda(t)) $$ 2) and $\lambda$ satisfies

$$\dot \lambda (t)=-\mathcal H'_y(t,y^*(t),u^*(t),\lambda(t))$$

--p. 315, Sydsæter, K., Hammond, P., & Seierstad, A. (2008). Further Mathematics for Economic Analysis. Pearson Education.
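Condition 1) can be checked pointwise. Here is a small numerical check on a toy problem of my own (maximize $\int_0^1 (y-u^2)\,dt$ with $\dot y = u$): along the candidate path, $\mathcal H(t, y^*(t), u, \lambda(t))$ is maximized over $u$ exactly at $u^*(t)=\lambda(t)/2$.

```python
import numpy as np

# Toy problem (my own, for illustration):  L = y - u^2,  f = u,  so
#   H(t, y, u, λ) = y - u^2 + λu.
# Costate: λ̇ = -H_y = -1 with λ(1) = 0 gives λ(t) = 1 - t,
# and stationarity H_u = -2u + λ = 0 gives u*(t) = λ(t)/2.

def H(y, u, lam):
    return y - u**2 + lam * u

for t in np.linspace(0.0, 1.0, 11):
    lam = 1.0 - t
    y_star = t / 2 - t**2 / 4          # from ẏ = u*, y(0) = 0
    u_star = lam / 2
    u_grid = np.linspace(-2.0, 2.0, 401)
    # Condition 1): H(t, y*, u, λ) ≤ H(t, y*, u*, λ) for every u on the grid.
    assert np.all(H(y_star, u_grid, lam) <= H(y_star, u_star, lam) + 1e-12)
```

Since $\mathcal H$ is concave in $u$ here, the stationarity condition $\mathcal H_u = 0$ is equivalent to the pointwise maximization in condition 1).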

Given the optimal pair $(y^*(t),u^*(t))$, define the optimized value function as: $$ V(y_0,y_1,t_0,t_1)=\int ^{t_{1}} _{t_{0}} L(t,y^*(t),u^*(t))\,dt$$ and the optimized Hamiltonian at time $t$: $$\mathcal H^*(t)=\mathcal H(t, y^*(t), u^*(t),\lambda (t))$$

Analogous to static optimization, we can also derive sensitivity results, provided $V$ is differentiable (the envelope theorem):

$$\dfrac {\partial V}{\partial y_0}=\lambda (t_0), \quad \dfrac {\partial V}{\partial y_1}=-\lambda (t_1), \quad \dfrac {\partial V}{\partial t_0}=-\mathcal H^*(t_0), \quad \dfrac {\partial V}{\partial t_1}=\mathcal H^*(t_1).$$

For a fixed end time with free terminal value, optimality requires the second derivative to be zero, i.e. $\lambda(t_1)=0$; if the end time is also free, the fourth must be zero as well, i.e. $\mathcal H^*(t_1)=0$. These are exactly the transversality conditions from the short version.
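The sensitivity result $\partial V/\partial y_1 = -\lambda(t_1)$ can be verified by finite differences on a fixed-endpoint toy problem of my own: maximize $-\int_0^1 u^2\,dt$ with $\dot y = u$, $y(0)=0$, $y(1)=y_1$. The optimal control is the constant $u^*=y_1$, so $V(y_1)=-y_1^2$ and $\lambda = 2y_1$.

```python
# Toy fixed-endpoint problem (my own):  max -∫_0^1 u^2 dt,  ẏ = u, y(0) = 0, y(1) = y1.
# H = -u^2 + λu, so H_u = 0 gives u = λ/2, and λ̇ = -H_y = 0: λ is constant,
# hence u is constant; to reach y1 by time 1 we need u* = y1, so λ = 2*y1
# and the optimized value is V(y1) = -y1**2.

def V(y1):
    """Optimized value as a function of the terminal state y1."""
    return -y1**2

y1, h = 0.7, 1e-5
lam_t1 = 2 * y1                                   # co-state at the end time t1 = 1
dV_dy1 = (V(y1 + h) - V(y1 - h)) / (2 * h)        # central finite difference
assert abs(dV_dy1 - (-lam_t1)) < 1e-8             # checks ∂V/∂y1 = -λ(t1)
```

The sign makes economic sense: raising the required terminal stock $y_1$ forces costlier control effort, so the marginal value of the constraint is $-\lambda(t_1) < 0$ when $y_1 > 0$.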
