Is the joint chain formed of a Metropolis-Hastings chain and the corresponding proposal sequence a Markov chain

markov chainsmarkov-processmeasure-theoryprobability theory

Let $(X_n)_{n\in\mathbb N_0}$ denote the Markov chain generated by the Metropolis-Hastings algorithm$^1$ with proposal kernel $Q$ and target distribution $\mu$, $(Y_n)_{n\in\mathbb N}$ denote the corresponding proposal sequence and $$Z_n:=(X_{n-1},Y_n)\;\;\;\text{for }n\in\mathbb N.$$

How can we show that $(Z_n)_{n\in\mathbb N}$ is a time-homogeneous Markov chain and how can we determine its transition kernel?

By definition, $$Z_n\sim\mathcal L(X_{n-1})\otimes Q\tag1$$ and $$(X_{n-1},X_n)\sim\mathcal L(X_{n-1})\otimes\kappa\tag2$$ for all $n\in\mathbb N$. Moreover, there is a real-valued process $(U_n)_{n\in\mathbb N}$ independent of $(Z_n)_{n\in\mathbb N}$ with $U_n\sim\mathcal U_{[0,\:1]}$ and $$X_n=\begin{cases}X_{n-1}&\text{on }\{α(X_{n-1},Y_n)<U\}\\Y_n&\text{on }\{α(X_{n-1},Y_n)\ge U\}.\tag1\end{cases}$$

Now let $n\in\mathbb N$ and $A_n,B_{n+1}\in\mathcal E$. Since $U_n$ and $(Z_n,Y_{n+1})$ are independent, $$\operatorname P\left[Y_{n+1}\in B_{n+1}\mid U_n,Z_n\right]=\operatorname P\left[Y_{n+1}\in B_{n+1}\mid Z_n\right]\tag2.$$ Moreover, $$\operatorname P\left[U_n\le\alpha(X_{n-1},Y_n),Z_{n+1}\in A_n\times B_{n+1}\in U_n,Z_n\right]=\delta_{Y_n}(A_n)1_{[0,\:\alpha(X_{n-1},\:Y_n)]}(U_n)\operatorname P\left[Y_{n+1}\in B_{n+1}\mid Z_n\right]\tag3$$ and $$\operatorname P\left[U_n>\alpha(X_{n-1},Y_n),Z_{n+1}\in A_n\times B_{n+1}\mid U_n,Z_n\right]=\delta_{X_{n-1}}(A_n)1_{(\alpha(X_{n-1},\:Y_n),\:1]}(U_n)\operatorname P\left[Y_{n+1}\in B_{n+1}\mid Z_n\right]\tag4.$$ Thus, $$\operatorname P\left[Z_{n+1}\in A_n\times B_{n+1}\mid U_n,Z_n\right]=\left(\delta_{Y_n}(A_n)1_{[0,\:\alpha(X_{n-1},\:Y_n)]}(U_n)+\delta_{X_{n-1}}(A_n)1_{(\alpha(X_{n-1},\:Y_n),\:1]}(U_n)\right)\operatorname P\left[Y_{n+1}\in B_{n+1}\right]\tag5$$ and hence $$\operatorname P\left[Z_{n+1}\in A_n\times B_{n+1}\mid Z_n\right]=\left(\delta_{Y_n}(A_n)\operatorname E\left[1_{[0,\:\alpha(X_{n-1},\:Y_n)]}(U_n)\mid Z_n\right]+\delta_{X_{n-1}}\operatorname E\left[1_{(\alpha(X_{n-1},\:Y_n),\:1]}(U_n)\mid Z_n\right]\right)\operatorname P\left[Y_{n+1}\in B_{n+1}\mid Z_n\right]\tag6.$$

So, all what's left is to determine $\operatorname E\left[1_{[0,\:\alpha(X_{n-1},\:Y_n)]}(U_n)\mid Z_n\right]$, $\operatorname E\left[1_{(\alpha(X_{n-1},\:Y_n),\:1]}(U_n)\mid Z_n\right]$ and $\operatorname P\left[Y_{n+1}\in B_{n+1}\mid Z_n\right]$.

EDIT: It's straightforward to show that $$\operatorname E\left[1_{[0,\:\alpha(X_{n-1},\:Y_n)]}(U_n)\mid Z_n\right]=\alpha(X_{n-1},Y_n)\tag7$$ and $$\operatorname E\left[1_{(\alpha(X_{n-1},\:Y_n),\:1]}(U_n)\mid Z_n\right]=1-\alpha(X_{n-1},Y_n)\tag8.$$ Using the result of this question, we obtain \begin{equation}\begin{split}&\operatorname P\left[Y_{n+1}\in B_{n+1}\mid Z_n\right]\\&\;\;\;\;\;\;\;\;\;\;\;\;=(1-\alpha(X_{n-1},Y_n))Q(X_{n-1},B_{n+1})+\alpha(X_{n-1},Y_n)Q(Y_n,B_{n+1}.\end{split}\tag9\end{equation}

Thus, \begin{equation}\begin{split}&\operatorname P\left[Z_{n+1}\in A_n\times B_{n+1}\mid Z_n\right]\\&=\left(\left(1-\alpha(X_{n-1},Y_n)\right)\delta_{X_{n-1}}(A_n)+\alpha(X_{n-1},Y_n)\delta_{Y_n}(A_n)\right)\left(\left(1-\alpha(X_{n-1},Y_n)\right)Q(X_{n-1},B_{n+1})+\alpha(X_{n-1},Y_n)Q(Y_n,B_{n+1})\right).\end{split}\tag{10}\end{equation}

However, the correct transition kernel of $(Z_n)_{n\in\mathbb N}$ should be $$((x_{n-1},y_n),A_n\times B_{n+1})\mapsto\delta_{y_n}(A_n)Q(y_n,B_{n+1})\alpha(x_{n+1},y_n)+\delta_{x_{n-1}}(A_n)Q(x_{n-1},B_{n+1})(1-\alpha(x_{n+1},y_n))\tag{11}$$ instead. Which of my equations $(1)$ to $(10)$ is wrong? I can't find my mistake. And how can we prove that $(11)$ is indeed the transition kernel?


$^1$ To be precise, let

  • $(E,\mathcal E,\lambda)$ be a measure space;
  • $p:E\to[0,\infty)$ be $\mathcal E$-measurable with $$\int p\:{\rm d}\lambda=1$$ and $\mu:=p\lambda$;
  • $q:E^2\to[0,\infty)$ be ${\mathcal E}^{\otimes2}$-measurable and $$Q(x,\;\cdot\;):=q(x,\;\cdot\;)\lambda\;\;\;\text{for }x\in E;$$
  • $$\alpha(x,y):=\left.\begin{cases}\displaystyle1\wedge\frac{p(y)q(y,x)}{p(x)q(x,y)}&\text{, if }p(x)q(x,y)\ne0\\1&\text{, otherwise}\end{cases}\right\}\;\;\;\text{for }x,y\in E;$$
  • $$r(x):=1-\int Q(x,{\rm d}y)\alpha(x,y)\;\;\;\text{for }x\in E;$$
  • $$\kappa(x,B):=\int_BQ(x,{\rm d}y)\alpha(x,y)+r(x)1_B(x)\;\;\;\text{for }(x,B)\in E\times\mathcal E.$$

Best Answer

My mistake was that I claimed that, by definition of the algorithm, $(U_n)_{n\in\mathbb N}$ is independent of $(Z_n)_{n\in\mathbb N}$ (which is obviously wrong) and never thought about it again. What's correct is that $U_1,\ldots,U_n$ is independent of $Z_1,\ldots,Z_n$. In particular, $Y_{n+1}$ is not independent of $U_n$ (which is really obvious) and hence $(2)$, and everything derived from that, is wrong.

However, by the same argumentation as before - but with more care - we obtain $$\operatorname P\left[X_n\in A_n\mid U_n,Z_n\right]=\delta_{Y_n}(A_n)1_{[0,\:\alpha(X_{n-1},\:Y_n)]}(U_n)+\delta_{X_{n-1}}(A_n)1_{(\alpha(X_{n-1},\:Y_n),\:1]}(U_n)\tag{12}$$ almost surely and $$\operatorname P\left[Y_{n+1}\in B_{n+1}\mid U_n,Z_n\right]=1_{[0,\:\alpha(X_{n-1},\:Y_n)]}(U_n)Q(Y_n,B_{n+1})+1_{(\alpha(X_{n-1},\:Y_n),\:1]}(U_n)Q(X_{n-1},B_{n+1})\tag{13}$$ almost surely. Thus, $$\operatorname P\left[Z_{n+1}\in A_n\times B_{n+1}\mid Z_n\right]=\delta_{Y_n}\alpha(X_{n-1},Y_n)Q(Y_n,B_{n+1})+\delta_{X_{n-1}}(A_n)(1-\alpha(X_{n-1},Y_n))Q(X_{n-1},B_{n+1})\tag{14}.$$

Related Question