Solved – MCMC with Metropolis-Hastings algorithm: Choosing proposal

markov-chain-montecarlo, metropolis-hastings

I need to run a simulation to evaluate an integral of a three-parameter function, say $f$, which has a very complicated formula. I was asked to use an MCMC method and implement the Metropolis-Hastings algorithm to generate values distributed as $f$, and it was suggested that I use a trivariate normal as the proposal distribution. Reading some examples, I've seen that some of them use a normal with fixed parameters, $N(\mu, \sigma)$, and some use one with a variable mean, $N(X, \sigma)$, where $X$ is the last accepted value, distributed according to $f$. I have some doubts about both approaches:

1) What is the meaning of choosing the last accepted value as the new mean of the proposal distribution? My intuition says it should guarantee that the proposed values are closer to values distributed as $f$, so the chance of acceptance would be greater. But doesn't it concentrate the sample too much? Is it guaranteed that, if I draw more samples, the chain will become stationary?

2) Wouldn't choosing fixed parameters be really difficult (since $f$ is really difficult to analyze) and dependent on the first sample we choose to start the algorithm? In this case, what would be the best way to find out which choice is better?

Is one of these approaches better than the other, or does it depend on the case?

I hope my doubts are clear, and I would be glad if some literature could be suggested (I've read some papers on the topic, but more is better!)

Thanks in advance!

Best Answer

1) You can think of this method as a random walk. When the proposal distribution is $x \mid x^t \sim N(x^t, \sigma^2)$, the sampler is commonly referred to as the (random-walk) Metropolis algorithm. If $\sigma^2$ is too small, you will have a high acceptance rate but will explore the target distribution very slowly. In fact, if $\sigma^2$ is too small and the distribution is multimodal, the sampler may get stuck in one mode and fail to explore the rest of the target distribution. On the other hand, if $\sigma^2$ is too large, the acceptance rate will be too low and the chain will rarely move. Since you have three dimensions, your proposal distribution will have a covariance matrix $\Sigma$, which will likely need different variances for each dimension and nonzero covariances between them. Choosing an appropriate $\Sigma$ may be difficult.
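As a rough illustration of the step-size trade-off described above, here is a minimal random-walk Metropolis sketch in plain Python. The function `log_target` is a hypothetical stand-in for $\log f$ (an unnormalized 3-D Gaussian with a different scale in each dimension), and the proposal uses $\Sigma = \text{step}^2 I$, an assumption made only to keep the example short.

```python
import math
import random


def log_target(x):
    # Hypothetical stand-in for log f(x): an unnormalized 3-D Gaussian
    # with standard deviations 1.0, 2.0 and 0.5 (not the asker's f).
    scales = (1.0, 2.0, 0.5)
    return -0.5 * sum((xi / s) ** 2 for xi, s in zip(x, scales))


def random_walk_metropolis(log_f, x0, step, n_samples, seed=0):
    """Random-walk Metropolis with proposal N(x_t, step^2 * I)."""
    rng = random.Random(seed)
    x = list(x0)
    log_fx = log_f(x)
    samples, accepted = [], 0
    for _ in range(n_samples):
        # Propose a move centered on the current state.
        proposal = [xi + rng.gauss(0.0, step) for xi in x]
        log_fp = log_f(proposal)
        # The proposal is symmetric, so the acceptance probability
        # is min(1, f(proposal) / f(x)).
        if rng.random() < math.exp(min(0.0, log_fp - log_fx)):
            x, log_fx = proposal, log_fp
            accepted += 1
        samples.append(list(x))  # a rejected step repeats the current state
    return samples, accepted / n_samples


# Compare acceptance rates for a tiny, a moderate and a huge step size.
rates = {}
for step in (0.05, 1.0, 20.0):
    _, rates[step] = random_walk_metropolis(
        log_target, (0.0, 0.0, 0.0), step, 5000
    )
    print(f"step={step:>5}: acceptance rate = {rates[step]:.3f}")
```

On this toy target the acceptance rate drops as the step size grows. A high rate with a tiny step is not a sign of good sampling, since the chain barely moves between draws.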

2) If your proposal distribution is always $N(\mu, \sigma^2)$, then this is the independent Metropolis-Hastings algorithm (also known as the independence sampler), since the proposal distribution does not depend on your current sample. This method works best if the proposal distribution is a good approximation of the target distribution you wish to sample from. You are correct that choosing a good normal approximation can be difficult.
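The fixed-proposal case can be sketched the same way. Because the proposal is no longer symmetric around the current state, the acceptance ratio has to include the proposal density: $\min\{1, f(y)\,q(x) / (f(x)\,q(y))\}$. In the sketch below, `log_target` is again a hypothetical stand-in for $\log f$, and the two proposals (one roughly matched to the target, one badly centered) are made-up choices for illustration.

```python
import math
import random


def log_target(x):
    # Hypothetical stand-in for log f(x): an unnormalized 3-D Gaussian
    # with standard deviations 1.0, 2.0 and 0.5 (not the asker's f).
    scales = (1.0, 2.0, 0.5)
    return -0.5 * sum((xi / s) ** 2 for xi, s in zip(x, scales))


def log_proposal(x, mu, sigma):
    # Log density of independent N(mu_i, sigma_i^2), up to an additive
    # constant that cancels in the acceptance ratio.
    return -0.5 * sum(((xi - m) / s) ** 2 for xi, m, s in zip(x, mu, sigma))


def independence_sampler(log_f, mu, sigma, x0, n_samples, seed=0):
    """Metropolis-Hastings with a fixed proposal N(mu, diag(sigma^2))."""
    rng = random.Random(seed)
    x = list(x0)
    # Log "importance weight" f(x) / q(x) of the current state.
    log_w_x = log_f(x) - log_proposal(x, mu, sigma)
    samples, accepted = [], 0
    for _ in range(n_samples):
        # The proposal ignores the current state entirely.
        y = [rng.gauss(m, s) for m, s in zip(mu, sigma)]
        log_w_y = log_f(y) - log_proposal(y, mu, sigma)
        # Accept with probability min(1, w(y) / w(x)).
        if rng.random() < math.exp(min(0.0, log_w_y - log_w_x)):
            x, log_w_x = y, log_w_y
            accepted += 1
        samples.append(list(x))
    return samples, accepted / n_samples


start = (0.0, 0.0, 0.0)
# Proposal that roughly matches the target (slightly wider in each dimension).
_, acc_good = independence_sampler(
    log_target, (0.0, 0.0, 0.0), (1.2, 2.4, 0.6), start, 5000
)
# Proposal centered far from where the target has most of its mass.
_, acc_bad = independence_sampler(
    log_target, (3.0, 3.0, 3.0), (1.0, 1.0, 1.0), start, 5000
)
print(f"well-matched proposal: acceptance rate = {acc_good:.3f}")
print(f"poorly matched proposal: acceptance rate = {acc_bad:.3f}")
```

With the poorly matched proposal almost every draw is rejected, which is the practical difficulty with fixed parameters when $f$ is hard to approximate.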

Neither method's success should depend on the starting value of the sampler. No matter where you start, the Markov chain should eventually converge to the target distribution. To check convergence, you could run several chains from different starting points and perform a convergence diagnostic such as the Gelman-Rubin convergence diagnostic.
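A minimal sketch of the Gelman-Rubin statistic (often written $\hat{R}$) for a single scalar quantity is given below. The chains here are synthetic draws that stand in for sampler output, which is an assumption for illustration; with a 3-D sampler one would apply the function to each coordinate separately.

```python
import math
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for one scalar quantity.

    `chains` is a list of m equal-length lists of draws, one per chain.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the target variance.
    var_hat = (n - 1) / n * within + between / n
    return math.sqrt(var_hat / within)


rng = random.Random(1)
# Four synthetic chains that all sample the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Four synthetic chains stuck around two different locations.
stuck = [
    [rng.gauss(center, 1.0) for _ in range(2000)]
    for center in (0.0, 3.0, 0.0, 3.0)
]

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print(f"R-hat, chains agree:    {r_mixed:.3f}")
print(f"R-hat, chains disagree: {r_stuck:.3f}")
```

Values close to 1 suggest the chains agree with each other; values well above 1 indicate that at least some chains have not yet reached the same distribution. A value near 1 is evidence of convergence, not a proof of it.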
