Stochastic Processes – Does Entropy of SDE with Nondegenerate Noise Always Increase?

Tags: entropy, stochastic-calculus, stochastic-differential-equations, stochastic-processes

Let $W$ be a standard Brownian motion, and let $X$ be the solution to the one dimensional SDE

$$dX_t = \sigma(t, X_t) \, dW_t$$

with initial condition $X_0 = x_0$ a.s. for some $x_0 \in \mathbb R$. We assume $\sigma$ is Lipschitz continuous and uniformly bounded away from $0$.

Suppose that $X_t$ admits a density $f_t$ for all $t > 0$.

Question: Is it true that we have the entropy inequality

$$\mathbb E[-\log f_t (X_t)] > \mathbb E[-\log f_s (X_s)]$$

for all $t > s$ in $\mathbb R_+$?

Best Answer

$\newcommand{\si}{\sigma}\newcommand{\R}{\mathbb R}\newcommand{\pa}{\partial}$The answer is no. The idea is to get a diffusion version of my two-state Markov chain example.

Indeed, for $t\in(0,\infty)$ and real $x$, let
\begin{equation*} b(x,t):=e^{(t+1)^2 x^2/(2 t)}+\frac{1-t}{2 (t+1)^3}\ge1+\frac{1-t}{2 (t+1)^3}\ge\frac{53}{54}>\frac12, \tag{-1}\label{-1} \end{equation*}
so that
\begin{equation*} \si(x,t):=\sqrt{2b(x,t)}\ge1. \end{equation*}
Moreover, letting
\begin{equation*} f_t(x):=f(x,t):=g_{0,t/(t+1)^2}(x), \tag{0}\label{0} \end{equation*}
where $g_{a,s^2}$ is the density of the normal distribution with mean $a$ and variance $s^2$, we see that $f$ is a solution of the Fokker–Planck equation
\begin{equation*} \pa_t f(x,t)=\pa_x^2(b(x,t)f(x,t)). \tag{1}\label{1} \end{equation*}
So, $f_t$ is the density of $X_t$ for the SDE
\begin{equation*} dX_t=\si(X_t,t)\,dW_t \end{equation*}
with the initial condition $X_0=0$ (since $EX_t^2=t/(t+1)^2\to0$ as $t\downarrow0$).
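As a sanity check on \eqref{1}, the following minimal sketch compares $\pa_t f$ with $\pa_x^2(bf)$ by central finite differences at a few sample points. The function names, the step sizes `hx` and `ht`, and the sample points are choices made for this illustration only; a small residual is numerical evidence, not a proof.

```python
import math

def variance(t):
    # Variance of X_t prescribed in (0): t / (t + 1)^2.
    return t / (t + 1) ** 2

def f(x, t):
    # Centered normal density with variance t / (t + 1)^2.
    v = variance(t)
    return math.exp(-x * x / (2 * v)) / math.sqrt(2 * math.pi * v)

def b(x, t):
    # Diffusion coefficient from (-1).
    return math.exp((t + 1) ** 2 * x * x / (2 * t)) + (1 - t) / (2 * (t + 1) ** 3)

def fokker_planck_residual(x, t, hx=1e-3, ht=1e-5):
    # Central differences; the step sizes are ad hoc choices for this sketch.
    dt_f = (f(x, t + ht) - f(x, t - ht)) / (2 * ht)

    def bf(y):
        return b(y, t) * f(y, t)

    dxx_bf = (bf(x + hx) - 2 * bf(x) + bf(x - hx)) / hx ** 2
    return dt_f - dxx_bf

# Largest absolute residual over a small grid of (x, t) sample points.
worst = max(
    abs(fokker_planck_residual(x, t))
    for t in (0.5, 1.0, 2.0, 3.0)
    for x in (-0.7, 0.0, 0.3)
)
print("largest residual:", worst)

# The lower bound 53/54 in (-1) is attained at x = 0, t = 2.
print("b(0, 2) =", b(0.0, 2.0), "53/54 =", 53 / 54)
```

The residual should be at the level of the discretization error, which shrinks as the step sizes are refined (up to rounding).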

However, the entropy \begin{equation*} \int_\R f_t\ln\frac1{f_t}=\frac{1}{2} (\ln (2 \pi t)-2 \ln (t+1)+1) \end{equation*} is decreasing in $t\ge1$, since its derivative in $t$ is $\frac{1-t}{2t(t+1)}<0$ for $t>1$. $\quad\Box$
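The closed-form entropy can be cross-checked numerically. The sketch below evaluates the formula above and compares it with a trapezoidal approximation of $\int f_t\ln(1/f_t)$; the function names, the integration window of ten standard deviations, and the grid size are assumptions of this illustration.

```python
import math

def entropy_closed_form(t):
    # Differential entropy of N(0, t/(t+1)^2), as stated in the answer.
    return 0.5 * (math.log(2 * math.pi * t) - 2 * math.log(t + 1) + 1)

def entropy_numeric(t, half_width=10.0, n=20000):
    # Trapezoidal approximation of the integral of f_t * ln(1/f_t) over
    # [-half_width * s, half_width * s], where s is the standard deviation.
    v = t / (t + 1) ** 2
    s = math.sqrt(v)
    h = 2 * half_width * s / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width * s + i * h
        log_f = -x * x / (2 * v) - 0.5 * math.log(2 * math.pi * v)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * (-math.exp(log_f) * log_f)
    return total * h

times = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
for t in times:
    print(t, entropy_closed_form(t), entropy_numeric(t))
```

The printed values should rise up to $t=1$ and fall afterwards, in line with the sign of the derivative of the closed form.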


Discussion: The example above may seem counterintuitive. Indeed, if $\si(x,t)$ does not depend on $x$, then $X_t$ will be normally distributed for each $t$ with variance increasing with $t$, and hence with the entropy increasing with $t$.
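In formulas (a sketch; the notation $q(t)$ is introduced here and is not part of the original answer): if $\sigma(x,t)=\sigma(t)$ does not depend on $x$ and $X_0=x_0$, then
\begin{equation*} X_t=x_0+\int_0^t\sigma(s)\,dW_s\sim N\big(x_0,q(t)\big),\quad q(t):=\int_0^t\sigma(s)^2\,ds, \end{equation*}
and the entropy of $X_t$ is $\frac12\ln\big(2\pi e\,q(t)\big)$, which is strictly increasing in $t$ because $\sigma$ is bounded away from $0$.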

In our example, $X_t$ is still normally distributed for each $t$, but the variance $t/(t+1)^2$ of $X_t$ is decreasing in $t\ge1$. As noted in the previous paragraph, this can only happen if the diffusion coefficient $b(x,t)=\si(x,t)^2/2$ depends on $x$.

We wanted the variance of $X_t$ to be decreasing in (say) $t\ge1$. It may then seem counterintuitive that the diffusion coefficient $b(x,t)$ in our example increases very fast in $|x|$, especially for large $t$. The Fokker–Planck equation \eqref{1} may help shed some light here. Indeed, suppose first that we are looking for a solution $f$ of \eqref{1} stationary in $t$. Then \eqref{1} implies that $b(x,t)f(x,t)$ is affine in $x$. If $b(x,t)$ and $f(x,t)$ are also even in $x$, then $b(x,t)f(x,t)$ must be constant in $x$. So then, if $f_t$ is a normal density and hence $f(x,t)$ is decreasing fast in $|x|$, then $b(x,t)$ must be increasing fast in $|x|$. If now the variance of $X_t$ is decreasing somewhat slowly for large $t$, then we may expect that $b(x,t)$ must still be increasing fast in $|x|$, as is the case in our example.

Actually, the way $b(x,t)$ was found in our example is as follows. We want \eqref{0} to hold. With $f$ so prescribed, for each $t$ equation \eqref{1} becomes a simple ODE (with respect to $x$) for the function $b_t:=b(\cdot,t)$. Thus we get the expression for $b(x,t)$ in \eqref{-1}, with a certain choice of the integration constants.
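One way to carry out this computation in detail is sketched here; the symbols $v$, $c_0$, $c_1$ are introduced for this sketch and do not appear above. Write $v(t):=t/(t+1)^2$, so that $v'(t)=(1-t)/(t+1)^3$. Since the normal density satisfies $\partial_v g_{0,v}=\frac12\partial_x^2 g_{0,v}$, the prescription \eqref{0} gives
\begin{equation*} \partial_t f(x,t)=\frac{v'(t)}2\,\partial_x^2 f(x,t), \end{equation*}
and \eqref{1} becomes
\begin{equation*} \partial_x^2\Big(\Big(b(x,t)-\frac{v'(t)}2\Big)f(x,t)\Big)=0,\quad\text{that is,}\quad b(x,t)=\frac{v'(t)}2+\frac{c_0(t)+c_1(t)\,x}{f(x,t)}. \end{equation*}
The choice $c_1(t)=0$ and $c_0(t)=f(0,t)=1/\sqrt{2\pi v(t)}$ then yields
\begin{equation*} b(x,t)=e^{x^2/(2v(t))}+\frac{1-t}{2(t+1)^3}=e^{(t+1)^2x^2/(2t)}+\frac{1-t}{2(t+1)^3}, \end{equation*}
which is the expression in \eqref{-1}.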