Numerical solution of Hamilton-Jacobi-Bellman (HJB) equations in practice

control-theory, hamilton-jacobi-equation, optimal-control

I am struggling to understand how HJB equations are actually used in practice to solve optimal control problems. I found the following approach in the book "Stochastic Controls" by Jiongmin Yong and Xun Yu Zhou on p.163:

1.) Solve the HJB-equation to find the value function,

2.) Minimize/Maximize the Hamiltonian pointwise to find the optimal control,

3.) Solve the ODE, since the optimal control is known.
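To make the three steps concrete, here is a minimal numerical sketch on a toy problem of my own choosing (dynamics $\dot x = u$, running cost $x^2 + u^2$, discount rate $\rho$; none of this is from the book). Step 1 solves the stationary HJB equation with a semi-Lagrangian value-iteration scheme on a grid, step 2 minimizes the (discretized) Hamiltonian pointwise to get a feedback law, and step 3 integrates the closed-loop ODE:

```python
import numpy as np

# Toy problem (an illustrative assumption, not from Yong & Zhou):
#   dynamics x' = u, running cost x^2 + u^2, discount rate rho.
# Stationary HJB:  rho*V(x) = min_u [ x^2 + u^2 + V'(x)*u ].

rho, dt = 0.5, 0.05
xs = np.linspace(-2, 2, 201)       # state grid
us = np.linspace(-2, 2, 41)        # control grid used inside the HJB solve
V = np.zeros_like(xs)

# Step 1: solve the HJB by semi-Lagrangian value iteration.
for _ in range(2000):
    # candidate next states for every (x, u) pair, kept inside the grid
    xn = np.clip(xs[:, None] + dt * us[None, :], xs[0], xs[-1])
    Q = dt * (xs[:, None]**2 + us[None, :]**2) \
        + np.exp(-rho * dt) * np.interp(xn, xs, V)
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

# Step 2: minimize the Hamiltonian pointwise to get a feedback law u*(x).
def u_star(x, us_fine=np.linspace(-2, 2, 401)):
    xn = np.clip(x + dt * us_fine, xs[0], xs[-1])
    Q = dt * (x**2 + us_fine**2) \
        + np.exp(-rho * dt) * np.interp(xn, xs, V)
    return us_fine[np.argmin(Q)]

# Step 3: integrate the closed-loop ODE x' = u*(x) from a given x0.
x, traj = 1.0, [1.0]
for _ in range(200):
    x += dt * u_star(x)
    traj.append(x)

print(round(traj[-1], 3))   # the state is driven toward 0
```

Note that once $V$ is computed, the feedback law `u_star` is available for *any* initial state, which is exactly the point made in the answer below.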

While I do understand the above, I don't see why, apart from certain specific cases where the minimized Hamiltonian is known explicitly, solving the HJB equation should be more efficient than simply minimizing the cost functional directly via dynamic programming or some type of gradient descent.

As far as I understand it, in order to solve the HJB equation numerically, I have to discretize everything in time and space and then solve a minimization problem at each grid point. Solving this minimization problem again in step 2 seems completely redundant.

Am I missing something? Why are HJB equations useful in practice?

Best Answer

When you directly minimize the cost functional (for instance using a discretization in time so that the integral becomes a summation), you are solving the optimal control problem for a single initial point $x_0$. Moreover, you will find an open-loop control, i.e. a function $t \mapsto u^*(t)$, which is not robust to perturbations in the dynamics (see Open-loop vs closed-loop).
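As a sketch of this direct approach (same toy problem as above, $\dot x = u$ with running cost $x^2 + u^2$ on a finite horizon; all names and parameters here are my own illustrative assumptions): discretize time, treat the control sequence as the decision variable, and run gradient descent with the gradient obtained from an adjoint (backward) pass. The result is a single open-loop sequence $u^*_0, \dots, u^*_{N-1}$, valid only for this one $x_0$:

```python
import numpy as np

# Direct transcription for ONE initial point x0 (toy assumption:
# x' = u, cost sum_k dt*(x_k^2 + u_k^2), Euler discretization).

dt, N, x0 = 0.05, 100, 1.0
u = np.zeros(N)                      # decision variables u_0..u_{N-1}

def rollout(u):
    x = np.empty(N + 1); x[0] = x0
    for k in range(N):
        x[k+1] = x[k] + dt * u[k]    # Euler step of x' = u
    return x

for _ in range(500):
    x = rollout(u)
    # adjoint (backward) pass: lam[k] = dJ/dx_k, giving the exact gradient
    lam = np.zeros(N + 1)
    for k in range(N - 1, -1, -1):
        lam[k] = 2 * dt * x[k] + lam[k+1]
    grad = 2 * dt * u + dt * lam[1:]
    u -= 0.5 * grad                  # gradient-descent step

x = rollout(u)
cost = dt * np.sum(x[:N]**2 + u**2)
print(round(cost, 3))
```

The output of this optimization is the open-loop trajectory and its cost; if the dynamics are perturbed mid-flight, the precomputed `u` is no longer optimal, whereas a feedback law from the HJB solution simply reacts to the perturbed state.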

When you solve the HJB equation, instead, you are simultaneously solving the problem for all values of $x_0$, and you can even provide an optimal control in feedback form.

Regarding the question about the double minimization, you are right: there is a double minimization problem, but

  1. The dimension of the minimization problem is the dimension of the control space, which is typically very small, so it is not very expensive.
  2. Simulations have shown that the first minimization, the one performed inside the HJB solve in step 1, does not need to be carried out accurately: one can discretize the control space with only a few values and minimize over that finite set. The second minimization problem, however, must be solved more accurately.
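Point 2 can be checked numerically on the same toy problem used above ($\dot x = u$, cost $x^2 + u^2$, discount $\rho$; an illustrative assumption, not a claim from the answer): solve the HJB once with a very coarse control grid and once with a fine one, and compare the resulting value functions.

```python
import numpy as np

# Effect of control-space discretization on the HJB solution
# (toy assumption: x' = u, running cost x^2 + u^2, discount rho).

rho, dt = 0.5, 0.05
xs = np.linspace(-2, 2, 201)

def solve_hjb(n_controls):
    """Semi-Lagrangian value iteration with n_controls control values."""
    us = np.linspace(-2, 2, n_controls)
    V = np.zeros_like(xs)
    for _ in range(2000):
        xn = np.clip(xs[:, None] + dt * us[None, :], xs[0], xs[-1])
        Q = dt * (xs[:, None]**2 + us[None, :]**2) \
            + np.exp(-rho * dt) * np.interp(xn, xs, V)
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < 1e-10:
            return V_new
        V = V_new
    return V

V_coarse = solve_hjb(5)      # only 5 admissible control values
V_fine = solve_hjb(201)      # much finer control grid
gap = np.max(np.abs(V_coarse - V_fine))
print(round(gap, 3))         # gap between the two value functions
```

Since the coarse control set is a subset of the fine one, `V_coarse` dominates `V_fine` pointwise, and the gap stays moderate even with only 5 control values; a fine control discretization is then reserved for the pointwise Hamiltonian minimization in step 2.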

I recommend this book for more details.
