Remember that a stochastic function is a measurable function. In this case the function is from $S^{\mathbb{N}}$ into $S$. Being a function on $S^{\mathbb{N}}$ means it is a function of the sequence $w_{0},w_{1},\ldots$; if you shift that sequence by $n$ terms, you simply get the sequence $w_{n},w_{n+1},\ldots$. Hence the $\theta$-notation.
You can obtain the right-hand side by applying the definition of $P_{\mu}$ with the Dirac measure at some $x$ (which yields a function of $x$) and then composing that function with the stochastic function $X_{n}$. Intuitively, this corresponds to simply restarting the Markov chain at $X_{n}$; being a Markov chain, it doesn't know that it wasn't just started, so everything is the same as if you had started from the beginning and skipped the first $n$ $w$'s.
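To see the shift in action, here is a minimal sanity check (assuming the canonical coordinate process $X_k(w) = w_k$ on the path space): for $Y = f(X_0)$ with $f$ bounded measurable,
$$(Y \circ \theta_n)(w) = f(X_0(\theta_n w)) = f(w_n) = f(X_n(w)),$$
so the Markov property $E_{\mu}(Y \circ \theta_n \mid \mathcal{F}_n) = E_{X_n}(Y)$ reduces in this case to $E_{\mu}(f(X_n) \mid \mathcal{F}_n) = E_{X_n}(f(X_0))$.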
Every indicator function is bounded, which is why this formulation contains the usual definition; in fact the two are equivalent, and I think the equivalence can be obtained via the standard approximation argument.
I guess that's the logic behind it. I asked more or less the same question a little while back; you might want to look at that.
Yes, they are equivalent. Let's assume that
$$\mathbb{P}(X_n \in A \mid \sigma(X_1,\ldots,X_{n-1})) = \mathbb{P}(X_n \in A \mid \sigma(X_{n-1})) \tag{1}$$
holds for all measurable sets $A$ and all $n \in \mathbb{N}$. By a standard approximation procedure, this implies $$\mathbb{E}(f(X_n) \mid \sigma(X_1,\ldots,X_{n-1}) ) = \mathbb{E}(f(X_n) \mid \sigma(X_{n-1})) \tag{1'}$$ for any bounded Borel-measurable function $f$.
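For completeness, the approximation procedure can be sketched as follows. By linearity of conditional expectation, $(1)$ extends to simple functions $f = \sum_{i=1}^m a_i 1_{A_i}$:
$$\mathbb{E}(f(X_n) \mid \sigma(X_1,\ldots,X_{n-1})) = \sum_{i=1}^m a_i \, \mathbb{P}(X_n \in A_i \mid \sigma(X_1,\ldots,X_{n-1})) = \sum_{i=1}^m a_i \, \mathbb{P}(X_n \in A_i \mid \sigma(X_{n-1})) = \mathbb{E}(f(X_n) \mid \sigma(X_{n-1})).$$
A bounded Borel-measurable $f \geq 0$ is the pointwise limit of an increasing sequence of simple functions, so monotone convergence for conditional expectations passes the identity to the limit; for signed $f$, split $f = f^+ - f^-$.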
For fixed $n \in \mathbb{N}$ we prove $$\mathbb{P}(X_n \in A \mid \sigma(X_1,\ldots,X_{n-k})) = \mathbb{P}(X_n \in A \mid \sigma(X_{n-k})), \qquad A \in \mathcal{A}, \tag{2}$$ by induction over $k=1,\ldots,n$.
Base: For $k=1$ this is nothing but $(1)$.
Inductive step: Assume that $(2)$ holds for some $k=1,\ldots,j$; we have to show that $(2)$ holds for $k=j+1$. By the tower property of conditional expectation, we have
$$\mathbb{P}(X_{n} \in A \mid \sigma(X_1,\ldots,X_{n-j-1})) = \mathbb{E} \bigg[ \mathbb{P}(X_n \in A \mid \sigma(X_1,\ldots,X_{n-j})) \mid \sigma(X_1,\ldots,X_{n-j-1}) \bigg].$$
Using our induction hypothesis, we find
$$\mathbb{P}(X_{n} \in A \mid \sigma(X_1,\ldots,X_{n-j-1})) = \mathbb{E} \bigg[ \mathbb{P}(X_n \in A \mid \sigma(X_{n-j})) \mid \sigma(X_1,\ldots,X_{n-j-1}) \bigg].$$
By the factorization lemma, there exists a measurable function $f$ such that
$$\mathbb{P}(X_n \in A \mid \sigma(X_{n-j})) = f(X_{n-j}),$$
and so
$$\mathbb{P}(X_{n} \in A \mid \sigma(X_1,\ldots,X_{n-j-1})) = \mathbb{E}(f(X_{n-j}) \mid \sigma(X_1,\ldots,X_{n-j-1})).$$
It follows from $(1')$ that
$$\mathbb{P}(X_{n} \in A \mid \sigma(X_1,\ldots,X_{n-j-1})) = \mathbb{E}(f(X_{n-j}) \mid \sigma(X_{n-j-1})). \tag{3}$$
If we take the conditional expectation with respect to $\sigma(X_{n-j-1})$ on both sides of $(3)$ and use the tower property, then we find that
$$\mathbb{P}(X_n \in A \mid \sigma(X_{n-j-1})) = \mathbb{E}(f(X_{n-j}) \mid \sigma(X_{n-j-1})). \tag{4}$$
Combining $(3)$ and $(4)$ we get
\begin{align*} \mathbb{P}(X_{n} \in A \mid \sigma(X_1,\ldots,X_{n-j-1})) &\stackrel{(3)}{=} \mathbb{E}(f(X_{n-j}) \mid \sigma(X_{n-j-1})) \\ &\stackrel{(4)}{=} \mathbb{P}(X_n \in A \mid \sigma(X_{n-j-1})), \end{align*}
i.e. $(2)$ holds for $k=j+1$.
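To see the induction at work in the smallest nontrivial case, take $n=4$ and $j=1$ (so $k=2$):
\begin{align*} \mathbb{P}(X_4 \in A \mid \sigma(X_1,X_2)) &= \mathbb{E}\big[\mathbb{P}(X_4 \in A \mid \sigma(X_1,X_2,X_3)) \mid \sigma(X_1,X_2)\big] && \text{(tower)} \\ &= \mathbb{E}\big[\mathbb{P}(X_4 \in A \mid \sigma(X_3)) \mid \sigma(X_1,X_2)\big] && \text{(case } k=1\text{)} \\ &= \mathbb{E}(f(X_3) \mid \sigma(X_1,X_2)) && \text{(factorization)} \\ &= \mathbb{E}(f(X_3) \mid \sigma(X_2)) && \text{(by } (1')\text{)} \\ &= \mathbb{P}(X_4 \in A \mid \sigma(X_2)), \end{align*}
where the last step is the analogue of $(4)$.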
Taking the conditional expectation with respect to $\sigma(X_m)$ on both sides of the Markov property (using the tower property, since $\sigma(X_m) \subseteq \mathcal{F}_m$)
$$E_{\mu}(Y \circ \theta_m \mid \mathcal{F}_m) = E_{X_m}(Y) \tag{1}$$
we get
$$E_{\mu}(Y \circ \theta_m \mid X_m) = E_{X_m}(Y). \tag{2}$$
Combining $(1)$ and $(2)$ shows
$$E_{\mu}(Y \circ \theta_m \mid \mathcal{F}_m) = E_{\mu}(Y \circ \theta_m \mid X_m) \tag{3}$$
for any bounded measurable random variable $Y$. In particular, this identity holds for
$$Y = \prod_{j=1}^k 1_{C_j}(X_j)$$
where $C_j$ are measurable sets. Then
$$Y \circ \theta_m = \prod_{j=1}^k 1_{C_j}(X_{j+m}),$$
and so $(3)$ shows
$$E_{\mu} \left( \prod_{j=1}^k 1_{C_j}(X_{j+m}) \mid \mathcal{F}_m \right) = E_{\mu} \left( \prod_{j=1}^k 1_{C_j}(X_{j+m}) \mid X_m \right). \tag{4}$$
Since sets of the form $$\bigcap_{j=1}^k \{X_{j+m} \in C_j\}$$ are a $\cap$-stable generator of $\sigma(X_m,X_{m+1},\ldots)$, this implies
$$E_{\mu}(1_B \mid \mathcal{F}_m) = E_{\mu}(1_B \mid X_m)$$
for any $B \in \sigma(X_m,X_{m+1},\ldots)$.
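The last implication is an application of Dynkin's $\pi$-$\lambda$ theorem; a sketch: with $\mathcal{G} := \sigma(X_m, X_{m+1}, \ldots)$, the family
$$\mathcal{D} := \{ B \in \mathcal{G} : E_{\mu}(1_B \mid \mathcal{F}_m) = E_{\mu}(1_B \mid X_m) \text{ a.s.} \}$$
contains $\Omega$, is closed under complements (since $1_{\Omega \setminus B} = 1 - 1_B$ and conditional expectation is linear) and under increasing limits (by monotone convergence for conditional expectations), so it is a $\lambda$-system. By $(4)$ it contains the $\cap$-stable generator, hence $\mathcal{D} = \mathcal{G}$.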