Probability Theory – Markov Process and Transition Semigroup


I read an introductory chapter on Markov processes in my lecture notes. Here is the setting:

Let $(\Omega,\mathcal{F},P)$ be a probability space, $(S,\mathcal{S})$ a measurable space, and $X=(X_t)$ an $S$-valued Markov process on $(\Omega,\mathcal{F},P)$ with transition semigroup $K=(K_t)$. Let $\nu$ be the law on $(S,\mathcal{S})$ of $X_0$ under $P$ (the initial distribution), i.e.

$$\nu(A):=P(X_0^{-1}(A))$$

for all $A\in \mathcal{S}$. By $\mathbb{P}_\nu$ we denote the distribution on $S^{[0,\infty)}$ of $X$ under $P$. Moreover, on the space $S^{[0,\infty)}$ we define the coordinate process $Y=(Y_t)$ by $Y_t:S^{[0,\infty)}\to S$, $\omega\mapsto Y_t(\omega):=\omega(t)$, and the $\sigma$-algebra $\mathcal{S}^{[0,\infty)}:=\sigma(Y_s;s\ge0)$.

There is a sentence in my lecture notes which does not seem obvious to me. They say: the law of $X$ under $P$ is the same as the law of $Y$ on $S^{[0,\infty)}$ under $\mathbb{P}_\nu$, i.e.

$$P(X_{t_1}\in A_1,\dots,X_{t_n}\in A_n)=\mathbb{P}_\nu(Y_{t_1}\in A_1,\dots,Y_{t_n}\in A_n)$$

for all $A_i\in \mathcal{S}$ and $t_1,\dots,t_n\ge 0$.

Unfortunately for me it is not that obvious. Therefore a short clarification would be appreciated.

cheers


Best Answer

As everybody knows, the entirety of mathematics is nothing but a concatenation of rigorous trivialities... and the solution below is no exception.

Call $T=[0,\infty)$, $E=S^T$ and $\mathcal E=\mathcal S^{\otimes T}$, and let $X:\Omega\to E$ denote any function measurable with respect to $\mathcal F$ and $\mathcal E$. Equivalently, $X^{-1}(B)$ is in $\mathcal F$ for every $B$ in $\mathcal E$. Still equivalently, since $\mathcal E$ is a product sigma-algebra, $X_t^{-1}(B_t)$ is in $\mathcal F$ for every $t$ in $T$ and every $B_t$ in $\mathcal S$, where by definition $X=(X_t)_{t\in T}$ hence $X_t:\Omega\to S$ for each $t$ in $T$.

(A word on a possible subtlety here: since $\mathcal E$ is by definition generated by the coordinate maps, measurability of $X$ with respect to $\mathcal F$ and $\mathcal E$ really is equivalent to measurability of every $X_t$, even though $T$ is uncountable. The genuine subtlety of an uncountable index set lies elsewhere: every set in $\mathcal E$ depends on at most countably many coordinates, so path properties such as continuity of $t\mapsto X_t(\omega)$ are not $\mathcal E$-measurable. This is where regularity hypotheses on the process $X=(X_t)_t$, for example that the paths are almost surely càdlàg, become relevant. We now forget these subtleties and proceed.)

For every probability measure $\nu$ on $(S,\mathcal S)$ which is the image of $P$ under $X_0$, one calls $\mathbb P_\nu$ the image of $P$ under $X$, a probability measure on $(E,\mathcal E)$; that is, $\mathbb P_\nu=P\circ X^{-1}$.
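To make the pushforward construction concrete, here is a toy sketch with finite spaces (the names `Omega`, `pushforward`, and the particular map `X` are illustrative choices, not from the notes): it computes $\mathbb P_\nu=P\circ X^{-1}$ pointwise and checks that pushing $\mathbb P_\nu$ forward by the identity leaves it unchanged.

```python
from collections import Counter

# Toy sample space with four equally likely outcomes.
Omega = ["w1", "w2", "w3", "w4"]
P = {w: 0.25 for w in Omega}

# X sends each outcome to a "path"; here a path is just a pair
# (x(0), x(1)) with values in S = {0, 1}, standing in for S^T.
X = {"w1": (0, 0), "w2": (0, 1), "w3": (0, 1), "w4": (1, 1)}

def pushforward(measure, f):
    """Image measure f_*(m), i.e. (f_* m)(B) = m(f^{-1}(B)), computed pointwise."""
    out = Counter()
    for point, mass in measure.items():
        out[f[point]] += mass
    return dict(out)

# P_nu = P o X^{-1}, the law of X under P.
P_nu = pushforward(P, X)

# Y is the identity on the path space, so its pushforward leaves P_nu unchanged.
Y = {x: x for x in P_nu}
assert pushforward(P_nu, Y) == P_nu

# Finite-dimensional check: P(X_0 = 0, X_1 = 1) equals P_nu(Y_0 = 0, Y_1 = 1).
lhs = sum(P[w] for w in Omega if X[w] == (0, 1))
rhs = sum(m for x, m in P_nu.items() if x == (0, 1))
print(lhs, rhs)  # both 0.5
```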

For each $t$ in $T$, $Y_t:E\to S$ is defined by $Y_t(x)=x(t)$ for every $x$ in $E$, that is, for every function $x:T\to S$. (Note that the choice of $\omega$ as the generic element of $E=S^T$, as in the post above, might be seen as unfortunate in such a context, since another set $\Omega$, a priori different from $E$, is in the picture.)

In other words, introducing $Y=(Y_t)_{t\in T}$, one sees that $Y:\color{red}{E}\to \color{blue}{E}$ is simply the identity function $Y:\color{red}{x}\mapsto \color{blue}{x}$. In particular, $Y$ is indeed measurable as a function $Y:(\color{red}{E},\color{red}{\mathcal E})\to(\color{blue}{E},\color{blue}{\mathcal E})$ and, as soon as $(\color{red}{E},\color{red}{\mathcal E})$ is endowed with any probability measure $\color{red}{Q}$, then the distribution of $Y$ becomes $\color{blue}{Q}=\color{red}{Q}\circ Y^{-1}$ on $(\color{blue}{E},\color{blue}{\mathcal E})$.

In particular, under $\mathbb P_\nu$, the distribution of $Y=(Y_t)_{t\in T}$ is $\mathbb P_\nu$. Since $\mathbb P_\nu$ was defined from the start as the distribution of $X$, you are done.
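Spelled out on finite-dimensional cylinder sets, this last step yields exactly the identity from the question:

```latex
\begin{align*}
\mathbb P_\nu(Y_{t_1}\in A_1,\dots,Y_{t_n}\in A_n)
&= \mathbb P_\nu\bigl(\{x\in E : x(t_1)\in A_1,\dots,x(t_n)\in A_n\}\bigr) \\
&= P\bigl(X^{-1}\{x\in E : x(t_1)\in A_1,\dots,x(t_n)\in A_n\}\bigr) \\
&= P(X_{t_1}\in A_1,\dots,X_{t_n}\in A_n).
\end{align*}
```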
