[Math] Deriving the HJB equation for exponential utility

I would like to derive the HJB equation for the following stochastic optimal control problem:

$ \Phi(t,x)=\sup_{h} E \left[\exp \left\{\gamma \int_t^T g(X_s,h(s);\gamma)\ ds \right\} \,\middle|\, X_t = x \right]$

where the state variable $X(t)$ is given by

$dX(t)=\mu(X(t),h(t),t)\ dt + \sigma(X(t),h(t),t)\ d W(t). $

Here $g$ is a known function and $\gamma$ is a nonzero constant.

Honestly, I have never studied this topic formally. From some research on the internet, I gathered that the objective functional is usually written as the expectation of an integral over time. Splitting the integral in two, Bellman's principle can be readily applied, and this yields the HJB equation.

In this case, however, we have the expectation of an exponential of an integral, and quite frankly I am not sure how to proceed.

Could you please help me? Thanks a lot 🙂

P.S. I will try to be more concrete: Let us consider the following problem

$ J(t_0,x_0)=\max_{u} E \left[ \int_{t_0}^T f(t,x,u)\ dt \right]$

subject to

$d x(t)=g(t,x(t),u(t))\ dt + \sigma (t,x(t),u(t))\ d B_t$.

Exploiting the additivity of the integral over the time domain, we can write

$ J(t_0,x_0)=\max_{u} E \left[ \int_{t_0}^{t_0+\Delta t} f(t,x,u)\ dt + \int_{t_0+\Delta t}^T f(t,x,u)\ dt \right]$.

Using Bellman's principle and the law of iterated expectations, we can write

$J(t_0,x_0)=\max_{u} E \left[f(t_0,x_0,u)\, \Delta t + J(t_0+\Delta t,x_0+\Delta x) \right] + o(\Delta t)$,

where the maximization is over the control on $[t_0, t_0+\Delta t]$.

Applying Itô's lemma and taking the limit $\Delta t \to 0$, we derive the following HJB equation:

$-\frac{\partial J (t,x)}{\partial t} = \max_{u} \left(f(t,x,u)+g(t,x,u) \frac{\partial J(t,x)}{\partial x} + \frac{1}{2} \sigma(t,x,u)^2 \frac{\partial^2 J(t,x)}{\partial x^2}\right)$.
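
To spell out that last step (a standard computation, assuming $J$ is smooth enough for Itô's lemma to apply): expanding

$J(t_0+\Delta t, x_0+\Delta x) = J(t_0,x_0) + \left(\frac{\partial J}{\partial t} + g\, \frac{\partial J}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 J}{\partial x^2}\right)\Delta t + \sigma\, \frac{\partial J}{\partial x}\, \Delta B + o(\Delta t)$,

then taking expectations (the $\Delta B$ term has mean zero), cancelling $J(t_0,x_0)$ on both sides, dividing by $\Delta t$, and letting $\Delta t \to 0$ gives exactly the equation above.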

My question is the following: In order to derive the HJB equation for the exponential case, could we apply a similar reasoning? Since the exponential is not additive, it is not clear to me how to do it.

I found a paper where the author solves an exponential control problem claiming (but not proving) that the HJB equation is (up to a plus or minus sign)

$ \frac{\partial J}{\partial t} + \sup_h \left(\frac{1}{2} \sigma^2 \nabla^2 J + \mu \nabla J -\frac{\gamma}{2} \sigma^2 (\nabla J)^2 - g\right)=0$.

Can someone explain to me where this equation comes from? As you can see, it has some terms in common with the "standard" one, but there is also a "new" term quadratic in $\nabla J$.

Best Answer

This class of problems is called Risk-Sensitive Optimal Control. The case $\gamma > 0$ is called the risk-averse case, and the case $\gamma < 0$ is called the risk-seeking case, for intuitive reasons. The earliest instance of this problem was solved by Jacobson in 1973, where the integrand was a quadratic function of the state $X(t)$ and the control $h(t)$, and the dynamics were given by a linear stochastic differential equation, i.e., $\mu(X(t),h(t),t) = A X(t) + B h(t)$, with $\sigma$ constant and $W(t)$ a standard Wiener process. The resulting formulation was called the Linear Exponential Quadratic Gaussian (LEQG) optimal control problem, as it generalizes the well-known LQG problem (which it recovers in the limit $\gamma \rightarrow 0$).
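
Schematically, the LEQG problem reads (the weight matrices $Q \succeq 0$, $R \succ 0$ and the horizon $[0,T]$ are illustrative notation here, not Jacobson's exact one):

$ \min_{h} E \left[\exp \left\{\frac{\gamma}{2} \int_0^T \left( X(t)^\top Q\, X(t) + h(t)^\top R\, h(t) \right) dt \right\} \right]$

subject to

$dX(t) = \left(A X(t) + B h(t)\right) dt + \sigma\, d W(t)$.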

An intuitive way to think of the cost functional is to mentally expand the exponential in a Taylor series and consider how the sign of $\gamma$ affects the cost-to-go. You can find a lot of control-theoretic literature online on this well-studied problem, mostly for cases where something is known about the structure of $g$, $\mu$, and $\sigma$.
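
To make that intuition concrete: for $Z = \int_t^T g\, ds$ and small $\gamma$, one has the well-known expansion

$\frac{1}{\gamma} \log E \left[ e^{\gamma Z} \right] \approx E[Z] + \frac{\gamma}{2}\, \mathrm{Var}(Z)$,

so the sign of $\gamma$ determines whether the variance of the accumulated reward is penalized or favored on top of its mean.

As for where the quoted HJB equation comes from, here is a sketch (one-dimensional state, $\Phi > 0$, and enough regularity assumed throughout). Since the exponential factorizes over subintervals, Bellman's principle takes a multiplicative form:

$\Phi(t,x) = \sup_{h} E \left[ \exp \left\{ \gamma \int_t^{t+\Delta t} g\, ds \right\} \Phi(t+\Delta t, X_{t+\Delta t}) \,\middle|\, X_t = x \right]$.

Applying Itô's lemma to the product and letting $\Delta t \to 0$, exactly as in the standard case, $\Phi$ satisfies

$\frac{\partial \Phi}{\partial t} + \sup_{h} \left( \mu \frac{\partial \Phi}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 \Phi}{\partial x^2} + \gamma g\, \Phi \right) = 0$.

Now substitute the logarithmic transformation $\Phi = e^{-\gamma J}$, i.e. $J = -\frac{1}{\gamma} \log \Phi$. Then

$\frac{\partial \Phi}{\partial t} = -\gamma \frac{\partial J}{\partial t} \Phi, \qquad \frac{\partial \Phi}{\partial x} = -\gamma \frac{\partial J}{\partial x} \Phi, \qquad \frac{\partial^2 \Phi}{\partial x^2} = \left( -\gamma \frac{\partial^2 J}{\partial x^2} + \gamma^2 \left( \frac{\partial J}{\partial x} \right)^2 \right) \Phi$,

and dividing through by $-\gamma \Phi$ (which, for $\gamma > 0$, turns the sup into an inf) gives

$\frac{\partial J}{\partial t} + \inf_{h} \left( \mu \frac{\partial J}{\partial x} + \frac{1}{2} \sigma^2 \frac{\partial^2 J}{\partial x^2} - \frac{\gamma}{2} \sigma^2 \left( \frac{\partial J}{\partial x} \right)^2 - g \right) = 0$,

which is the equation quoted in the question, "up to a plus or minus sign": the quadratic gradient term is precisely the footprint of the exponential.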
