[Math] Confused about Markov property

markov chains, probability theory

The sample space is $\Omega$ with $\omega = (\omega_0, \omega_1, \ldots) \in \Omega$ an infinite sequence of elements of a set $S$. So the measurable space is $(S^{\mathbb{N}}, \mathcal{S}^{\mathbb{N}})$ where $\mathcal{S}$ is the $\sigma$-field on $S$.

The shift operator $\theta_n:\Omega \to \Omega$ acts like $(\theta_n \omega)_m = \omega_{n+m}$ (i.e. applying the shift $\theta_n$ to $\omega$ removes the first $n$ elements of the sequence, so $\theta_n(\omega_0, \omega_1, \ldots) = (\omega_n, \omega_{n+1}, \ldots)$).

Durrett's book says the Markov property is
$$E(Y\circ \theta_n|\mathcal{F}_n) = E_{X_n}Y$$
when $Y \in \mathcal{F}$ and is bounded.
He suggests thinking of $Y = h(\omega_0, \omega_1, \ldots)$ where $h$ is a bounded function, in which case I think the Markov property can be written as
$$E(h(\omega_n, \omega_{n+1}, \ldots)|\mathcal{F}_n) = E_{X_n}(h(\omega_0, \omega_1, \ldots))$$
which I do not really understand.

I know that $\omega = (\omega_0, \omega_1, \ldots)$ is a sequence in the sample space and that it is meant to represent a Markov chain, right? But when you look at the "normal" Markov property that
$$P(X_{n+1} = j | X_0, \ldots, X_n) = P(X_{n+1} = j | X_n)$$
and compare it to the property above, the right-hand side includes all elements in the chain! I know the expectation is taken with respect to $X_n$ only, but the $h(\omega_0, \ldots)$ includes the whole chain.

Or is this $\omega$ different to the $X_n$?

Best Answer

Remember that a random variable is just a measurable function. In this case each $X_n$ is a function from $S^{\mathbb{N}}$ into $S$ (in Durrett's construction, $X_n(\omega) = \omega_n$). But being a function on $S^{\mathbb{N}}$ means being a function of the sequence $\omega_0, \omega_1, \ldots$, and if you shift that sequence by $n$ terms you just get the sequence $\omega_n, \omega_{n+1}, \ldots$. Hence the $h$-notation.

You can obtain the right-hand side by using the definition of $P_{\mu}$ with the Dirac measure at some $x$, which gives the function $x \mapsto E_x Y$, and then composing that function with the random variable $X_n$. Intuitively it corresponds to starting the Markov chain afresh at $X_n$ - and, being the good Markov chain it is, it doesn't know that it hasn't just been started, so it's the same as if you had started from the beginning and skipped the first $n$ $\omega$'s.
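As a rough sanity check of this "restart at $X_n$" picture, here is a minimal Python sketch that enumerates paths of a small chain. Everything in it is an assumption made for illustration: the two-state space, the transition matrix `P`, the test function `h` (which looks at only three coordinates), and the helper names `path_prob`, `lhs` and `rhs`. It compares $E(h(\omega_n, \omega_{n+1}, \omega_{n+2}) \mid X_0, \ldots, X_n)$ with $E_{X_n} h(\omega_0, \omega_1, \omega_2)$ for $n = 2$.

```python
from itertools import product

# Hypothetical two-state chain on S = {0, 1}; this matrix is an illustrative assumption.
P = [[0.9, 0.1],
     [0.4, 0.6]]
S = (0, 1)


def path_prob(start, steps):
    """P_start(X_1 = steps[0], X_2 = steps[1], ...) as a product of transitions."""
    prob, cur = 1.0, start
    for nxt in steps:
        prob *= P[cur][nxt]
        cur = nxt
    return prob


def h(w0, w1, w2):
    """An arbitrary bounded function of three coordinates (illustrative choice)."""
    return w0 + 2 * w1 * w2


def rhs(x):
    """E_x h(omega_0, omega_1, omega_2): the chain started afresh at x."""
    return sum(path_prob(x, (a, b)) * h(x, a, b) for a, b in product(S, repeat=2))


def lhs(history):
    """E(h(omega_n, omega_{n+1}, omega_{n+2}) | X_0..X_n = history), n = len(history) - 1."""
    start, past = history[0], tuple(history[1:])
    joint = sum(path_prob(start, past + (a, b)) * h(history[-1], a, b)
                for a, b in product(S, repeat=2))
    # Divide by the probability of the observed history (positive for this P);
    # the initial distribution cancels because we condition on X_0 = start.
    return joint / path_prob(start, past)


# Every history (X_0, X_1, X_2) gives the same value as restarting at X_2.
for history in product(S, repeat=3):
    assert abs(lhs(history) - rhs(history[-1])) < 1e-12
```

Note that the product form in `path_prob` already builds in the Markov structure, so this only illustrates the statement rather than proving anything: the left-hand side sees the whole history, yet its value depends on the last state alone.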

Every indicator is bounded, which is why this formulation contains the normal definition as a special case. I think the two are actually equivalent, and that this can be obtained fairly easily via the standard proof.
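To spell out the one-step case, here is a sketch under a few assumptions: $S$ is countable (so that $\{j\}$ is measurable), $X_n(\omega) = \omega_n$ are the coordinate maps, $\mathcal{F}_n = \sigma(X_0, \ldots, X_n)$, and $p(x, j) = P_x(X_1 = j)$ denotes the transition probability (the symbol $p$ is introduced here only for illustration). Take $Y(\omega) = 1_{\{\omega_1 = j\}}$, which is bounded. Then $Y \circ \theta_n = 1_{\{\omega_{n+1} = j\}} = 1_{\{X_{n+1} = j\}}$, so

$$P(X_{n+1} = j | \mathcal{F}_n) = E(Y \circ \theta_n | \mathcal{F}_n) = E_{X_n} Y = P_{X_n}(X_1 = j) = p(X_n, j).$$

The right-hand side is a function of $X_n$ alone, so conditioning on $X_n$ only gives the same value:

$$P(X_{n+1} = j | X_n) = E\big(p(X_n, j) \,\big|\, X_n\big) = p(X_n, j) = P(X_{n+1} = j | X_0, \ldots, X_n).$$

So although $h(\omega_0, \omega_1, \ldots)$ on the right involves a whole sequence, that sequence is a fresh chain started at $X_n$, not the original chain from time $0$.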

I guess that's the logic behind it - I asked more or less the same question as you a little while back, you might wanna look at that.
