[Math] Do convex and decreasing functions preserve the semimartingale property

pr.probability, real-analysis, stochastic-calculus

Some time ago I spent a lot of effort trying to show that the semimartingale property is preserved by certain functions. Specifically, that a convex function of a semimartingale and decreasing function of time is itself a semimartingale. This was needed for a result which I was trying to prove (more details below), and I eventually managed to work around the issue, but it was not easy. For twice continuously differentiable functions this is an immediate consequence of Ito's lemma, but Ito's lemma cannot be applied in the general case. After failing at this task, I also spent a considerable amount of time trying to construct a counterexample, again without success. So, my question is as follows.

1) Let $f\colon\mathbb{R}^+\times\mathbb{R}\to\mathbb{R}$ be such that $f(t,x)$ is convex in $x$ and continuous and decreasing in $t$. Then, for any semimartingale $X$, is $f(t,X_t)$ necessarily a semimartingale?

Actually, it can be assumed here that $X$ is both continuous and a martingale which, with some work, would imply the general case.
As it turns out, this can be phrased purely as a real-analysis question.

2) Let $f\colon\mathbb{R}^+\times\mathbb{R}\to\mathbb{R}$ be such that $f(t,x)$ is convex in $x$ and decreasing in $t$. Can we write $f=g-h$ where $g(t,x)$ and $h(t,x)$ are both convex in $x$ and increasing in $t$?
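
As a trivial illustration (my own example, just to fix ideas): $f(t,x)=e^{-t}x^2$ is convex in $x$ and decreasing in $t$, and decomposes as
$$
f=g-h,\qquad g(t,x)=x^2,\quad h(t,x)=\left(1-e^{-t}\right)x^2,
$$
with $g$ and $h$ both convex in $x$ and increasing in $t$. The question is whether such a decomposition always exists.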

Stated like this, maybe someone with a good knowledge of convex functions would be able to answer the question one way or the other.

If $f(t,x)$ is convex in $x$ and increasing in $t$, then approximating by smooth functions and applying Ito's lemma allows us to express $f(t,X_t)$ as the sum of a stochastic integral and an increasing process
$$
f(t,X_t)=\int_0^t\frac{\partial f}{\partial x}(s,X_s)\,dX_s+V_t,\qquad{\rm(*)}
$$
so it is a semimartingale. If, instead, $f$ is decreasing in $t$, then an affirmative answer to question 2 would reduce it to the easier case where it is increasing in $t$, also giving a positive answer to the first question.
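
To spell out the increasing case: for smooth $f$ and continuous $X$, (*) is simply Ito's formula with the finite variation terms grouped together,
$$
f(t,X_t)=\int_0^t\frac{\partial f}{\partial x}(s,X_s)\,dX_s+\underbrace{f(0,X_0)+\int_0^t\frac{\partial f}{\partial t}(s,X_s)\,ds+\frac12\int_0^t\frac{\partial^2 f}{\partial x^2}(s,X_s)\,d[X]_s}_{V_t}.
$$
Both integrands in $V$ are nonnegative ($\partial f/\partial t\ge0$ since $f$ is increasing in $t$, and $\partial^2f/\partial x^2\ge0$ by convexity) and $[X]$ is increasing, so $V$ is increasing; smoothing $f$ then handles the general case.
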
Explaining why question 1 implies question 2 is a bit trickier. If 2 were false, then it would be possible to construct martingales $X$ such that the decomposition (*) holds with the variation of $V$ exploding at some positive time.

This problem arose while I was trying to prove the following: is a continuous martingale uniquely determined by its one-dimensional marginal distributions? For arbitrary continuous martingales this is false, but it is known to be true for diffusions $dX_t=\sigma(t,X_t)\,dW$, with $W$ a Brownian motion and $\sigma$ a smooth parameter. The idea is to back out $\sigma$ from the Kolmogorov forward equation; this is well known in finance as the local volatility model. However, I was trying to show rather more than this: that all continuous, strong Markov martingales are uniquely determined by their one-dimensional marginals. I was able to prove this, and the relation between the marginals and the joint distributions of the martingale has many nice properties (I wrote a paper on this, submitted to the arXiv, but not published as I am still working on changes asked for by the referees). The method was to reformulate the Kolmogorov backward equation in terms of the marginals. This uses Ito's lemma, requiring twice differentiability, but that can be circumvented with a bit of integration by parts as long as $f(t,X_t)$ is a semimartingale for the kinds of functions mentioned above. The question above arose from trying, and failing, to prove this. Without an answer to this question, the problem becomes much harder, as many of the techniques from stochastic calculus can no longer be applied (and approximating by semimartingales didn't seem to help either). The workaround was very involved: part of it I published as a standalone paper here, and the rest forms most of a paper I submitted to the arXiv here. Adding those papers together, it comes to maybe 50 pages of maths and a lot of effort to work around the question above.


Update: I posted more details on the various equivalent forms of this question here. I also posted a possible counterexample, without proof but with some numerical evidence suggesting that it may indeed be a counterexample.


Best Answer

I still have no idea what the answer to this question is. However, it is possible to attack the problem in several different ways, and there are various different (but logically equivalent) ways of stating it. I'll post some of these as an answer now, as it seems rather long to fit in the original statement of the question. Maybe one of the reformulations below will help lead to a resolution of the problem.

This answer is already very long, and I've been trying to shorten it as much as I can. I can't see any way of giving proper proofs of all the statements below without making it a lot longer, so I'll give only a few details of the proofs here. I will, however, list each of the equivalent formulations (H1)-(H6) below in logical order, so that each statement can be proven equivalent to the preceding one without too much work. Let's start from statement 2 of the original question, which I will refer to as Hypothesis (H1) for the purposes of this answer.

Hypothesis (H1): Let $f\colon\mathbb{R}^+\times\mathbb{R}\to\mathbb{R}$ be such that $f(t,x)$ is convex in $x$ and decreasing in $t$. Then, $f=g-h$ where $g(t,x)$ and $h(t,x)$ are both convex in $x$ and increasing in $t$.

The decomposition in (H1) exists if and only if it exists locally. Letting $I=[0,1]$ denote the unit interval, we get the following equivalent statement.

Hypothesis (H2): Let $f\colon I^2\to\mathbb{R}$ be such that $f(t,x)$ is convex and Lipschitz continuous in $x$ and decreasing in $t$. Then, $f=g-h$ where $g(t,x)$ and $h(t,x)$ are convex in $x$ and increasing in $t$.

The truth of this statement remains unchanged if it is restricted to functions $f$ which are zero on the three edges $I\times\{0\}$, $I\times\{1\}$, $\{0\}\times I$ of the unit square. I'll use $D$ to denote the set of such functions satisfying the conditions of (H2). Then, whenever the decomposition in (H2) exists, it is always possible to choose $g$, $h$ to be zero on $I\times\{0\}$, $I\times\{1\}$ and $\{1\}\times I$. From now on, whenever the decomposition $f=g-h$ in (H2) is referred to, it will be assumed that $g$, $h$ are chosen to satisfy these conditions.

We can strengthen (H2) by also placing a uniform bound on the terms $g$, $h$ in the decomposition. Here, $\lVert g\rVert$ denotes the supremum norm and $f_x$ denotes the partial derivative.

Hypothesis (H3): There is a constant $K\gt0$ such that, for all $f\in D$, the decomposition $f=g-h$ as in (H2) exists and can be chosen such that $\lVert g\rVert,\lVert h\rVert\le K\lVert f_x\rVert$.

Statement (H3) is particularly convenient because of the following: to prove that (H3) holds, it is enough to look at a dense subset of functions in $D$. Taking limits of the decompositions then extends the result to all of $D$. So, it is enough to concentrate on, say, smooth functions or piecewise linear functions.

Next, it is useful to choose the decomposition in (H2) to minimize $\lVert g\rVert$ and $\lVert h\rVert$.

Lemma 1: Suppose that $f\in D$ and that the decomposition $f=g-h$ as in (H2) exists. Then, there is a unique maximal choice for $g$, $h$. That is, if $f=g_1-h_1$ is any other such decomposition then $g\ge g_1$ and $h\ge h_1$.

I'll refer to the decomposition in Lemma 1 as the optimal decomposition. As it is not clear that any such decomposition should exist, I'll briefly sketch the argument now. The idea is to discretize time, using a partition of the unit interval $0=t_0\lt t_1\lt\cdots\lt t_r=1$. Denote the convex hull of a function $u\colon I\to\mathbb{R}$ by $v=H(u)$; this is the maximal convex function $v\colon I\to\mathbb{R}$ bounding $u$ from below,
$$
\begin{align}
v(x) &= \sup\left\{w(x)\colon w\le u{\rm\ is\ convex}\right\}\\
&=\inf\left\{\left((b-x)u(a)+(x-a)u(b)\right)/(b-a)\colon a\le x\le b\right\}.
\end{align}
$$
The optimal decomposition in discrete time can be constructed as functions $h_k\colon I\to\mathbb{R}$, starting at the final time $k=r$ and working backwards to $k=0$,
$$
h_r(x) = 0,\qquad h_{k-1}={\rm H}\left(h_k+f(t_k,\cdot)-f(t_{k-1},\cdot)\right).
$$
Interpolate this to be piecewise constant in time, setting $h(t,x)=h_k(x)$ for $t_{k-1}\lt t\le t_k$. Then $h(t,x)$ and $g\equiv f+h$ are convex in $x$ and, restricted to times in the partition, increasing in $t$.
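
The recursion above is easy to experiment with numerically. Here is a minimal sketch (my own illustration, not from the original post; the function names and grid discretization are arbitrary): it computes the greatest convex minorant by a monotone-chain scan over the grid points, then runs the backward recursion.

```python
import numpy as np

def convex_hull(x, u):
    """Greatest convex minorant of the samples (x[i], u[i]).

    x must be a strictly increasing numpy array. Monotone-chain scan:
    keep only points where slopes of consecutive chords increase,
    then interpolate the hull back onto the full grid.
    """
    keep = [0]
    for i in range(1, len(x)):
        keep.append(i)
        # Pop the middle point while the last three make a concave turn.
        while len(keep) >= 3:
            i0, i1, i2 = keep[-3], keep[-2], keep[-1]
            s1 = (u[i1] - u[i0]) / (x[i1] - x[i0])
            s2 = (u[i2] - u[i1]) / (x[i2] - x[i1])
            if s1 <= s2:
                break
            del keep[-2]
    return np.interp(x, x[keep], u[keep])

def optimal_h(f, ts, xs):
    """Backward recursion h_{k-1} = H(h_k + f(t_k,.) - f(t_{k-1},.)).

    f  : callable f(t, x), vectorized in x; convex in x, decreasing in t.
    ts : increasing time partition, ts[0] = 0 and ts[-1] = 1.
    xs : strictly increasing spatial grid on [0, 1].
    Returns h[k, i] ~ h(ts[k], xs[i]), with terminal condition h_r = 0.
    """
    h = np.zeros((len(ts), len(xs)))
    for k in range(len(ts) - 1, 0, -1):
        incr = f(ts[k], xs) - f(ts[k - 1], xs)  # <= 0: f decreasing in t
        h[k - 1] = convex_hull(xs, h[k] + incr)
    return h
```

As a toy check, $f(t,x)=(1+t)(x^2-x)$ is convex in $x$ and decreasing in $t$, and since the argument of the hull stays convex at every step, the recursion returns $h(t,x)=(1-t)(x^2-x)$ exactly; watching whether $h^n(0,\cdot)$ instead drifts to $-\infty$ as the partition is refined is precisely the dichotomy of Lemma 2 below.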

Lemma 2: Suppose that $f\in D$ and let $0=t_{n,0}\lt t_{n,1}\lt\cdots\lt t_{n,r_n}=1$ be a sequence of partitions of the unit interval. For each $n$, let $h^n(t,x)$ be the function constructed above from the $n$th partition, piecewise constant in $t$. Suppose also that the partitions have mesh going to zero and eventually include all times at which $f$ is discontinuous. Then one of the following holds.

  • $f$ decomposes as in (H2) and $h^n(t,x)\to h(t,x)$ pointwise on $I^2$, where $f=g-h$ is the optimal decomposition.
  • $f$ does not have a decomposition as in (H2) and $h^n(0,x)\to-\infty$ for all $0\lt x\lt1$.

The idea is that, if $h^n$ has any limit point $h$, then $h(t,x)$ and $g=f+h$ will be convex in $x$ and increasing in $t$, giving the decomposition required by (H2). Furthermore, by construction, if $f=g^\prime-h^\prime$ is any other such decomposition, then $h^n\ge h^\prime$ at times in the partition, so $h\ge h^\prime$. This shows that the decomposition is optimal and, as the optimal decomposition is unique, all limit points of $h^n$ are the same, so $h^n\to h$. The only alternative is that $h^n$ has no limit points, in which case the second statement of the Lemma holds. Using this construction of the optimal decomposition, (H3) can be shown to be equivalent to the following.

Hypothesis (H4): There is a constant $K\gt0$ such that, for all smooth functions $f,g\colon I^2\to\mathbb{R}$ with $\lVert f\rVert$, $\lVert g\rVert$, $\lVert f_x\rVert$, $\lVert g_x\rVert$ bounded by 1 and with $f(t,x)$, $g(t,x)$ convex in $x$ and, respectively, decreasing and increasing in $t$, $$\int_0^1\int_0^1 f_{xx}g_t\,dx\,dt \le K.$$

As this statement is quite different from the preceding ones, I'd better give some explanation now. The idea is to use integration by parts, $$ \begin{align} \int_0^1\int_0^1(f_{xx}g_t+g_{xx}f_t)\,dxdt &= \left[\int_0^1(f_xg_t+g_xf_t)\,dt\right]_{x=0}^1-\left[\int_0^1f_xg_x\,dx\right]_{t=0}^1\\ &\le 6(\Vert f_x\Vert \Vert g\Vert + \Vert g_x\Vert \Vert f\Vert). \end{align} $$ If $f$ and $g$ are increasing in time then the terms on the left hand side are both positive, so we get bounds for the integrals of $f_{xx}g_t$ and $g_{xx}f_t$ individually. Hypothesis (H3) extends this to the case where $f$ is decreasing in time, implying (H4).

Conversely, suppose that (H4) holds. Letting $f=g-h$ be the decomposition computed along a partition as described above, we can use the fact that the convex hull $v=H(u)$ of a function $u$ satisfies $v_{xx}(u-v)=0$ (the hull is affine wherever it lies strictly below $u$) to get the equality $(h_{k-1})_{xx}\left(h_k-h_{k-1}+f_k-f_{k-1}\right)=0$, where $f_k$ denotes $f(t_k,\cdot)$. This leads to the following inequalities,

$$ \begin{align} \frac12\Vert h\Vert^2&\le\frac12\int_0^1 h_x(0,x)^2\,dx\le\sum_{k=1}^r \int_0^1(h_{k-1})_{xx}(h_k-h_{k-1})\,dx\\ &=-\sum_{k=1}^r\int_0^1 (h_{k-1})_{xx}(f_k-f_{k-1})\,dx. \end{align} $$ Hypothesis (H4) can be used to bound the final term, showing that $h$ cannot diverge as the mesh of the partition goes to zero, so we get convergence to an optimal decomposition satisfying a bound as in (H3).
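
To flesh out the first two steps of this chain: $h$ is increasing in $t$ with $h(1,\cdot)=0$, so $h\le0$ and $\lVert h\rVert$ is attained at $t=0$; since $h(0,0)=0$, the Cauchy-Schwarz inequality gives $h(0,x)^2=\left(\int_0^xh_x(0,y)\,dy\right)^2\le\int_0^1h_x(0,y)^2\,dy$, which is the first inequality. For the second, integrate by parts in $x$ (the boundary terms vanish since each $h_k$ is zero at $x=0,1$) and use $ab\le\frac12(a^2+b^2)$,
$$
\int_0^1(h_{k-1})_{xx}(h_k-h_{k-1})\,dx=\int_0^1\left((h_{k-1})_x^2-(h_{k-1})_x(h_k)_x\right)dx\ge\frac12\int_0^1\left((h_{k-1})_x^2-(h_k)_x^2\right)dx,
$$
which telescopes over $k$ to $\frac12\int_0^1h_x(0,x)^2\,dx$ since $h_r=0$.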

The hypothesis can now be formulated as a statement about martingales.

Hypothesis (H5): There is a constant $K\gt0$ such that, for all $f\in D$ and martingales $0\le X_t\le1$, $f(t,X_t)$ decomposes as $M+V$ where $M$ is a martingale and $V$ has variation $$ \mathbb{E}\left[\int_0^1\,\vert dV\vert\right]\le K\Vert f_x\Vert. $$

The idea is that $g(t, x)\equiv\mathbb{E}[(X_t-x)_+]$ is convex in $x$ and increasing in $t$. In the case where $f$, $g$ are smooth and $X$ is a continuous martingale, Ito's formula can be used to split $f(t,X_t)$ into a martingale term plus the sum of the increasing process $\frac12\int f_{xx}(t,X_t)\,d[X]_t$ and the decreasing process $\int f_t(t,X_t)\,dt$, which have expectations $\iint f_{xx}g_t\,dxdt$ and $\iint f_tg_{xx}\,dxdt$ respectively.
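
To see where these two expectations come from, here is a formal computation (my sketch, assuming $X_t$ has a density $p(t,\cdot)$ and enough regularity to interchange expectations and integrals). Differentiating $g(t,x)=\mathbb{E}[(X_t-x)_+]$ twice in $x$ gives $g_{xx}(t,x)=p(t,x)$, so
$$
\mathbb{E}\left[\int_0^1 f_t(t,X_t)\,dt\right]=\int_0^1\!\!\int_0^1 f_t(t,x)p(t,x)\,dx\,dt=\iint f_t\,g_{xx}\,dx\,dt.
$$
For the other term, Tanaka's formula gives $(X_t-x)_+=(X_0-x)_++\int_0^t1_{\{X_s>x\}}\,dX_s+\frac12L^x_t$, with $L^x$ the local time of $X$ at $x$; taking expectations, $g_t(t,x)\,dt=\frac12\mathbb{E}[dL^x_t]$, and the occupation time formula then yields
$$
\mathbb{E}\left[\frac12\int_0^1 f_{xx}(t,X_t)\,d[X]_t\right]=\frac12\iint f_{xx}(t,x)\,\mathbb{E}\left[dL^x_t\right]dx=\iint f_{xx}\,g_t\,dx\,dt.
$$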

Finally, by adding a randomly occurring jump term to any semimartingale, and with a bit more work, it is possible to reduce this to the martingale case. This gives the statement asked in the original question.

Hypothesis (H6): Let $f\colon\mathbb{R}^+\times\mathbb{R}\to\mathbb{R}$ be such that $f(t,x)$ is convex in $x$ and continuous and decreasing in $t$. Then, for any semimartingale $X$, $f(t,X_t)$ is a semimartingale.
