Ito’s lemma 2nd order term notation.


I have a notation question here.

In the simplest form of Ito's lemma, we have this

$ df(Y_t) = f'(Y_t) dY_t + \frac{1}{2} f''(Y_t) d\langle Y \rangle_t$

I know how to calculate the $ d\langle Y \rangle_t $ term, but I have always wanted to ask

  • what is the name of the term, and what exactly does it mean?
  • why is it written in such a special way rather than using $ Cov() $ or $ Var() $?

Conceptually, to me that's the variance of the process, but I just don't understand the notation. Why is the subscript $ t $ put outside the $ \langle \cdot \rangle $?

Can I write it like any of these below?

$ \langle dY_t \rangle $

$ d \langle Y_t \rangle $

If there are two processes involved, following the pattern I guess it should be written like this $ d\langle X, Y \rangle_t $, but can I write it like these below?

$ \langle dX_t, dY_t \rangle $

$ d\langle X_t, Y_t \rangle $

Also can I write it in integral form? Where should I put the $ t $ if I am writing it in integral form?

Thanks a lot

Best Answer

Long-hand / Short-hand notation:

I personally have always found the short-hand notation confusing and to this day try to avoid it whenever possible. Below, I will try to demonstrate why it is confusing and leads to commonly made mistakes.

In the "long-hand" notation, an Ito process $X_t$ is defined as follows:

$$X_t:=X_0+\int_{h=0}^{h=t}a(X_h,h) dh + \int_{h=0}^{h=t}b(X_h,h) dW_h $$

Above, $a(X_t,t)$ and $b(X_t,t)$ are some square-integrable processes.

It is worth noting that the Quadratic variation of $X_t$ would then be:

$$\left<X\right>_t=\int_{h=0}^{h=t}b(X_h,h)^2dh $$

(this follows from the definition of Quadratic variation for Stochastic Processes, see edit at the end of this post)
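To make the formula above concrete, here is a minimal numerical sketch (not a proof). It assumes the simplest Ito process with constant coefficients, $a(X_h,h)=a$ and $b(X_h,h)=b$, so that the formula predicts $\left<X\right>_t=b^2t$ regardless of the drift $a$. The function name and the parameter values are my own illustrative choices.

```python
import math
import random

def realized_qv_with_drift(a, b, t, n_steps, seed=0):
    """Sum of squared increments of X_t = X_0 + a*t + b*W_t on a uniform grid."""
    rng = random.Random(seed)
    dt = t / n_steps
    total = 0.0
    for _ in range(n_steps):
        # One increment of X: drift part a*dt plus diffusion part b*dW.
        dx = a * dt + b * rng.gauss(0.0, math.sqrt(dt))
        total += dx * dx
    return total

# Illustrative (assumed) parameters: the drift changes, the prediction b^2 * t does not.
t, b = 1.0, 0.5
for a in (0.0, 3.0):
    print("drift", a, "realized QV", realized_qv_with_drift(a, b, t, 100_000),
          "predicted", b * b * t)
```

On a fine grid both runs should land close to $b^2t$: the drift contributes terms of order $dt^2$ per step, which vanish in the limit, and that is why only $b$ appears in $\left<X\right>_t$.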

Now, in short-hand notation, we can write the equation for $X_t$ above as:

$$dX_t=a(X_t,t) dt + b(X_t,t) dW_t$$

Firstly, what does the short-hand notation really mean? We could define the increment $\delta X_t$ over a small time interval $\delta t$ as follows:

$$\delta X_t:=X_{t+\delta t}-X_t=\int_{h=t}^{h=t+\delta t}a(X_h,h) dh + \int_{h=t}^{h=t+\delta t}b(X_h,h) dW_h$$

And then $dX_t$ could be (intuitively, not rigorously) understood along the lines of:

$$\lim_{\delta t \to 0} \delta X_t = dX_t$$

But I think it's best to just understand the short-hand notation for what it really is: i.e. a short-hand for the stochastic integrals.

Ito's Lemma:

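Now Ito's Lemma states that for any such Ito process $X_t$, any function $F()$ of $X_t$ and $t$ that is twice continuously differentiable in its first argument and once continuously differentiable in $t$ would obey the following equation (with $t_0=0$ and all partial derivatives evaluated at $(X_h,h)$):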

$$F(X_t,t)=F(X_0,t_0)+\int_{h=0}^{h=t} \left( \frac{\partial F}{\partial t}+\frac{\partial F}{\partial X}*a(X_h,h) + 0.5\frac{\partial^2 F}{\partial X^2}*b(X_h,h)^2 \right)dh+\int_{h=0}^{h=t}\left(\frac{\partial F}{\partial X}b(X_h,h)\right)dW_h$$

Above, you can spot the "quadratic variation" term:

$$\int_{h=0}^{h=t}0.5\frac{\partial^2 F}{\partial X^2}b(X_h,h)^2 dh$$

(which, in "short-hand" notation, could be written as $0.5F''(X_t)d\left<X\right>_t$, i.e. exactly the same as your $0.5f''(Y_t) d\langle Y \rangle_t$; I just use $F$ instead of $f$ and $X_t$ instead of $Y_t$. Again, I find the short-hand much less intuitive than the long-hand notation, even after years of playing around with Ito processes.)
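As a sanity check of where the quadratic variation term comes from, here is a small sketch under assumptions of my own: I take $X_t=W_t$ (so $a=0$, $b=1$, $X_0=0$) and $F(X_t,t)=X_t^2$, for which the long-hand formula reduces to $W_t^2=\int_{h=0}^{h=t}dh+\int_{h=0}^{h=t}2W_hdW_h$. On a discrete grid, $W_t^2$ splits exactly into a left-point sum (approximating the $dW_h$ integral) plus the sum of squared increments (approximating $\left<W\right>_t=t$). The function name and grid size are illustrative.

```python
import math
import random

def ito_check_for_square(t, n_steps, seed=0):
    """Return (W_t^2, left-point sum of 2*W*dW, sum of dW^2) for one simulated path."""
    rng = random.Random(seed)
    dt = t / n_steps
    w = 0.0
    ito_integral = 0.0  # approximates int_0^t 2 W_h dW_h (integrand taken at the left point)
    realized_qv = 0.0   # approximates <W>_t, which should be close to t
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        ito_integral += 2.0 * w * dw
        realized_qv += dw * dw
        w += dw
    return w * w, ito_integral, realized_qv

t = 1.0  # assumed horizon, purely for illustration
lhs, ito_integral, realized_qv = ito_check_for_square(t, 100_000)
print("W_t^2                     :", lhs)
print("Ito sum + realized QV     :", ito_integral + realized_qv)  # exact discrete identity
print("Ito sum + t (Ito's lemma) :", ito_integral + t)            # agrees up to QV error
```

The second line matches the first up to floating-point rounding, since $(w+\delta w)^2-w^2=2w\,\delta w+\delta w^2$ telescopes. The third line replaces the sum of squared increments by $t$, which is what the $0.5\frac{\partial^2 F}{\partial X^2}b^2\,dh$ term does in the continuous formula.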

Why not to use Short-hand notation

Now I would like to show an example of why I think the short-hand notation can be super-confusing: Let's go with the Ornstein-Uhlenbeck process (below, $\mu$, $\theta$ and $\sigma$ are constant parameters):

$$X_t:=X_0+\int_{h=0}^{h=t}\theta(\mu- X_h)dh + \int_{h=0}^{h=t}\sigma dW_h $$

We have $a(X_t,t)=\theta(\mu- X_t)$ and $b(X_t,t) = \sigma$.

The trick to solving the above is to apply Ito's lemma to $F(X_t,t):=X_t e^{\theta t}$, which gives:

$$X_te^{\theta t}=F(X_0,t_0)_{=X_0}+\int_{h=0}^{h=t} \left( \frac{\partial F}{\partial t}_{=\theta X_h e^{\theta h}}+\frac{\partial F}{\partial X}_{=e^{\theta h}}*a(X_h,h) + 0.5\frac{\partial^2 F}{\partial X^2}_{=0}*b(X_h,h)^2 \right)dh+\int_{h=0}^{h=t}\left(\frac{\partial F}{\partial X}_{=e^{\theta h}}b(X_h,h)\right)dW_h=\\=X_0+\int_{h=0}^{h=t}\left(\theta X_h e^{\theta h}+e^{\theta h}\theta(\mu- X_h)\right)dh+\int_{h=0}^{h=t}\left(e^{\theta h} \sigma\right)dW_h=\\=X_0+\int_{h=0}^{h=t}\left(e^{\theta h}\theta\mu\right)dh+\int_{h=0}^{h=t}\left(e^{\theta h} \sigma\right)dW_h$$

Now, to get the solution for $X_t$, the final step is simply to divide both sides by $e^{\theta t}$, to isolate the $X_t$ term on the LHS, which gives:

$$X_t=X_0e^{-\theta t}+\int_{h=0}^{h=t}\left(e^{\theta(h-t)}\theta\mu\right)dh+\int_{h=0}^{h=t}\sigma e^{\theta(h-t)} dW_h$$
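The solution above can be compared numerically against a direct discretisation of the original equation. The sketch below is only an illustration under my own assumptions: it uses a simple Euler scheme for the SDE, approximates the $dW_h$ integral by a left-point sum over the same simulated increments, and uses the elementary result $\int_{h=0}^{h=t}e^{\theta(h-t)}\theta\mu\,dh=\mu(1-e^{-\theta t})$ for the deterministic integral. The function name and parameter values are hypothetical.

```python
import math
import random

def ou_euler_vs_closed_form(x0, theta, mu, sigma, t, n_steps, seed=0):
    """Return (Euler approximation of X_t, closed-form X_t) driven by the same increments."""
    rng = random.Random(seed)
    dt = t / n_steps
    x_euler = x0
    stochastic_integral = 0.0
    for i in range(n_steps):
        h = i * dt  # the dummy variable h runs over the grid; t stays fixed
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Euler step for dX = theta*(mu - X) dt + sigma dW
        x_euler += theta * (mu - x_euler) * dt + sigma * dw
        # Left-point sum for int_0^t sigma * exp(theta*(h - t)) dW_h
        stochastic_integral += sigma * math.exp(theta * (h - t)) * dw
    drift_integral = mu * (1.0 - math.exp(-theta * t))
    x_closed = x0 * math.exp(-theta * t) + drift_integral + stochastic_integral
    return x_euler, x_closed

# Assumed illustrative parameters
x_euler, x_closed = ou_euler_vs_closed_form(
    x0=1.0, theta=1.5, mu=0.5, sigma=0.3, t=1.0, n_steps=10_000)
print("Euler      :", x_euler)
print("Closed form:", x_closed)
```

Note that the weight `math.exp(theta * (h - t))` has to be recomputed for every grid point `h`: it lives inside the sum, exactly as $e^{\theta(h-t)}$ lives inside the integral, and cannot be pulled out or cancelled.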

I have seen many people try to solve the Ornstein-Uhlenbeck equation writing everything out in the "short-hand" notation. In the last step, when dividing through by $e^{\theta t}$, they end up "cancelling out" the terms that should be written as $e^{\theta h}$ inside the integrals, because the short-hand notation fails to distinguish between the integration dummy variable (i.e. "$h$") and the upper limit "$t$" of the integral.

In conclusion, I wouldn't recommend using the short hand notation for SDEs, and if you come across it, I would encourage "translating it" into what it really means (i.e. the "long-hand" notation): at least for me, it has made things a lot easier to comprehend.

Edit on Quadratic Variation: Quadratic variation for stochastic processes is defined as a limit in probability as the mesh size gets finer and finer. Specifically, for a Brownian motion and partitions $0=t_0<t_1<\dots<t_n=t$ whose mesh size goes to zero as $n \to \infty$, we could write $\forall \epsilon > 0$:

$$\lim_{n \to \infty} \mathbb{P}\left(\left|\sum_{i=1}^{i=n}\left(W_{t_i}-W_{t_{i-1}}\right)^2-t\right|>\epsilon\right)=0$$

I.e. the probability that the sum of squared increments differs from $t$ by more than $\epsilon$ goes to 0 as the mesh size gets infinitely fine, which is exactly the statement $\left<W\right>_t=t$ (the proof is rather technical, see for example here, where they actually seem to prove convergence almost surely (which implies convergence in probability)).

Notice that we can then simply write:

$$t=\int_{h=0}^{h=t}dh$$

and thereby obtain the well-known formula:

$$ \left< W \right>_t=\int_{h=0}^{h=t}dh=t$$
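The convergence of the sum of squared increments to $t$ is easy to see numerically. Below is a minimal sketch, assuming a uniform grid and Gaussian increments with variance $dt$. The helper names and the values of `t` and `n_steps` are my own choices, and a single simulated path per grid illustrates the idea but proves nothing.

```python
import math
import random

def simulate_brownian_path(t, n_steps, rng):
    """Sample W on a uniform grid with n_steps intervals over [0, t], starting at W_0 = 0."""
    dt = t / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

def realized_quadratic_variation(path):
    """Sum of squared increments of a sampled path."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))

rng = random.Random(0)
t = 2.0  # assumed horizon, purely for illustration
for n_steps in (10, 1_000, 100_000):
    W = simulate_brownian_path(t, n_steps, rng)
    print(n_steps, "steps: realized QV =", realized_quadratic_variation(W), "target t =", t)
```

As the grid gets finer, the realized value should concentrate around $t$: on a uniform grid its variance is $2t^2/n$, which shrinks as $n$ grows.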

