Solved – MCMC chain getting stuck

bayesian, gibbs, hierarchical-bayesian, markov-chain-montecarlo, metropolis-hastings

I am trying to use a Metropolis-within-Gibbs type algorithm to sample $\theta$ and $x$ from the following model. Starting with Bayes' theorem I can write:
$$
P(\theta, x | y) = \frac{P(y | x, \theta) P(x | \theta) P(\theta)}{P(y)}
$$
I can easily draw samples from $P(\theta)$ and $P(x | \theta)$, and since $y$ depends on $\theta$ only through $x$, the likelihood simplifies to $P(y | x, \theta) = P(y | x)$.

The posterior then becomes
$$
P(\theta, x | y) = \frac{P(y | x) P(x | \theta) P(\theta)} {P(y)}
$$

  • $P(y | x)$ is multivariate Gaussian.
  • $P(x | \theta)$ is multivariate t.
  • $P(\theta)$ is uniform.

My sampling steps:

  1. Start with some value of $\theta$, say, $\theta_1$.

  2. Gibbs step: Draw a sample $x_1$ from $P(x | \theta_1)$.

  3. Plug my $x_1$ into $P(y | x)$ and evaluate ($y$ is the observed data).

  4. Metropolis step: Accept $\theta_1$, $x_1$ if $P(y | x_1) > P(y | x_0)$, where $x_0$ is the previously accepted draw. If not, accept $\theta_1$, $x_1$ with probability $a = \frac{P(y | x_1)}{P(y | x_0)}$.

  5. Go back to step 2, keeping $\theta_1$ if it was accepted; otherwise, draw a new $\theta$ from the proposal first. (A code sketch of this loop follows the list.)
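For concreteness, here is a minimal Python sketch of the loop as I read the five steps above. The helpers `propose_theta`, `sample_x_given_theta`, and `log_p_y_given_x` are hypothetical stand-ins for the model-specific pieces (proposal draw, multivariate-t draw, Gaussian log-density); log densities are used because $P(y | x)$ is sharply peaked.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_sampler(y, theta_init, n_iter,
                propose_theta, sample_x_given_theta, log_p_y_given_x):
    theta = theta_init                            # step 1
    x = sample_x_given_theta(theta)               # step 2: Gibbs draw
    log_like = log_p_y_given_x(y, x)              # step 3
    samples = []
    for _ in range(n_iter):
        x_new = sample_x_given_theta(theta)       # step 2: fresh x for current theta
        log_like_new = log_p_y_given_x(y, x_new)  # step 3
        # Step 4 on the log scale: accept if better, else with probability a.
        if np.log(rng.uniform()) < log_like_new - log_like:
            x, log_like = x_new, log_like_new     # accepted: keep theta
        else:
            theta = propose_theta(theta)          # step 5: rejected, propose new theta
        samples.append((theta, x))
    return samples
```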

The problem is that $\theta$ gets stuck. Sometimes I get a draw of $x$ from $P(x | \theta)$ that is nearly identical to $y$, making $P(y | x)$ exceptionally high. After that happens, essentially all my steps are rejected: $a = \frac{P(y | x_1)}{P(y | x_0)}$ is a very small number for pretty much every subsequent $\theta$ step.

Here is why I think this happens: the likelihood $P(y | x)$ is very sharply peaked. The distribution $P(x | \theta)$, on the other hand, is very broad in comparison, so when $\theta$ changes by a small amount, $P(x | \theta)$ is essentially unchanged.

Would a different sampling strategy work better in such a situation? Let me know if further info is required.

Best Answer

In step 4, you don't have to reject the proposal $x,\theta$ every time its new likelihood is lower; if you do so, you are doing a sort of optimization instead of sampling from the posterior distribution.

Instead, if the proposal is worse, you still accept it with acceptance probability $a$.

With pure Gibbs sampling, the general strategy to sample this would be:

Gibbs

Iteratively sample:
\begin{align}
p(x | \theta, y) &\propto p(y | x)\, p(x | \theta) \\
p(\theta | x, y) &\propto p(x | \theta)\, p(\theta)
\end{align}
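Structurally, each sweep alternates exact draws from the two full conditionals. A minimal sketch, assuming (hypothetically) that direct samplers for both conditionals existed; the next section covers the case where they do not:

```python
def gibbs_step(x, theta, y, sample_x_conditional, sample_theta_conditional):
    # One Gibbs sweep: exact draws from each full conditional
    # (both samplers are hypothetical placeholders here).
    x = sample_x_conditional(theta, y)   # x ~ p(x | theta, y)
    theta = sample_theta_conditional(x)  # theta ~ p(theta | x, y)
    return x, theta
```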

Gibbs with Metropolis steps for non-conjugate cases:

If some of the conditionals above are not familiar distributions (because you are multiplying non-conjugate densities; this is your case), you can sample them with Metropolis-Hastings, as in the sketch after this list:

  • From the current $x$, generate some proposal, e.g.: $$ x^* \sim \mathcal{N}(x, \sigma) $$

  • Accept $x^*$ with probability [1]: $$ a = \min \left(1, \frac{p(x^*)}{p(x)}\right) = \min \left(1, \frac{p(x^* | \theta, y)}{p(x | \theta, y)}\right) $$

[1] If the proposal distribution isn't symmetric, there is an additional multiplying factor (the Hastings correction).
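Here is a minimal Python sketch of one such Metropolis-within-Gibbs sweep, updating $x$ and then $\theta$ with symmetric random-walk proposals. The log-density functions `log_p_y_given_x`, `log_p_x_given_theta`, and `log_p_theta` are hypothetical placeholders for the Gaussian, multivariate-t, and uniform densities, and the step sizes `sigma_x` and `sigma_theta` are assumed tuning parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def mwg_step(x, theta, y,
             log_p_y_given_x, log_p_x_given_theta, log_p_theta,
             sigma_x=0.1, sigma_theta=0.1):
    # --- Update x | theta, y; target is p(y|x) p(x|theta). ---
    x_prop = x + sigma_x * rng.standard_normal(np.shape(x))
    log_a = (log_p_y_given_x(y, x_prop) + log_p_x_given_theta(x_prop, theta)
             - log_p_y_given_x(y, x) - log_p_x_given_theta(x, theta))
    if np.log(rng.uniform()) < log_a:    # accept with prob min(1, exp(log_a))
        x = x_prop

    # --- Update theta | x, y; target is p(x|theta) p(theta). ---
    theta_prop = theta + sigma_theta * rng.standard_normal(np.shape(theta))
    log_a = (log_p_x_given_theta(x, theta_prop) + log_p_theta(theta_prop)
             - log_p_x_given_theta(x, theta) - log_p_theta(theta))
    if np.log(rng.uniform()) < log_a:
        theta = theta_prop

    return x, theta
```

Working on the log scale keeps the sharply peaked $p(y | x)$ numerically stable, and because $x$ is now updated conditional on $y$ as well as $\theta$, the chain no longer depends on a lucky fresh prior draw of $x$ at every $\theta$ move.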

Appendix: $$ p(\theta | x, y) = \frac{p(y|x)p(x| \theta)p(\theta)} {\int p(y|x)p(x| \theta)p(\theta) \text{d}\theta}= \frac{p(x| \theta)p(\theta)} {\int p(x| \theta)p(\theta) \text{d}\theta} \propto p(x| \theta)p(\theta) $$
