Solved – Multi-parameter Metropolis-Hastings

Tags: bayesian, markov-chain-montecarlo, metropolis-hastings, multivariate analysis

I need to formulate a multi-parameter Metropolis-Hastings algorithm.

My question concerns how to define the condition for accepting or rejecting the candidate value.

In my problem (a curve fit) I have 5 parameters $\theta=(a_1,a_2,a_3,a_4,a_5)$; some of them have an informative prior distribution, others a non-informative prior distribution.

PRIOR DISTRIBUTIONS:

  • $a_1 \sim$ Log-Normal$(\mu_1,\sigma_1)$
  • $a_2 \sim$ Log-Normal$(\mu_2,\sigma_2)$
  • $a_3 \sim$ Uniform$(\text{lower}_3,\text{upper}_3)$
  • $a_4 \sim$ Uniform$(\text{lower}_4,\text{upper}_4)$
  • $a_5 \sim$ Uniform$(\text{lower}_5,\text{upper}_5)$

Moreover, I can evaluate the likelihood of the parameters; let's call it Like($\theta$).

To decide whether to accept the candidate value at time $t$, I have three options in mind, but I don't know which one is correct, because I could only find algorithms with one parameter:

  • define the ratio R for each parameter:

$R_i=\frac{Like(\theta)_t Prior(a_i)_t}{Like(\theta)_{t-1} Prior(a_i)_{t-1}}$

and accept the vector $\theta$ at time $t$ only if ALL the $R_i$ are greater than 1, or greater than a random number between 0 and 1.

  • define the ratio R for each parameter:

$R_i=\frac{Like(\theta)_t Prior(a_i)_t}{Like(\theta)_{t-1} Prior(a_i)_{t-1}}$

and accept the parameter $a_i$ at time $t$ if its $R_i$ is greater than 1, or greater than a random number between 0 and 1.

  • define the ratio R as:

$R=\frac{Like(\theta)_t Prior(a_1)_t Prior(a_2)_t Prior(a_3)_t Prior(a_4)_t Prior(a_5)_t}{Like(\theta)_{t-1} Prior(a_1)_{t-1} Prior(a_2)_{t-1} Prior(a_3)_{t-1} Prior(a_4)_{t-1} Prior(a_5)_{t-1}}$

and accept the vector $\theta$ at time $t$ only if $R$ is greater than 1, or greater than a random number between 0 and 1.

Can somebody give me an answer, or a reference in which a case like this has been worked out?

Thanks

Best Answer

You actually have a single joint prior, which is a function of the parameter vector $\theta = [a_1, ..., a_d]$. If the parameters are treated independently, the prior factorizes into a product of the 'individual priors' that you mentioned. That is:

$$p(\theta) = \prod_{i = 1}^d p(a_i)$$
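In code this factorization is just a sum of individual log-priors (working in log space avoids numerical underflow). A minimal sketch, where the hyperparameter values are placeholders I made up, not values from the question:

```python
import numpy as np
from scipy import stats

def log_prior(theta):
    """Joint log-prior: sum of the individual log-priors.

    Hyperparameters below are illustrative placeholders:
    Log-Normal(mu=0, sigma=1), Log-Normal(mu=1, sigma=0.5),
    Uniform(0, 10), Uniform(-5, 5), Uniform(0, 1).
    """
    a1, a2, a3, a4, a5 = theta
    return (stats.lognorm.logpdf(a1, s=1.0, scale=np.exp(0.0))
            + stats.lognorm.logpdf(a2, s=0.5, scale=np.exp(1.0))
            + stats.uniform.logpdf(a3, loc=0.0, scale=10.0)
            + stats.uniform.logpdf(a4, loc=-5.0, scale=10.0)
            + stats.uniform.logpdf(a5, loc=0.0, scale=1.0))
```

Note that a candidate outside the support of any one prior (e.g. a negative $a_1$) makes the joint log-prior $-\infty$, so it is rejected automatically.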

Classic Metropolis-Hastings looks the same whether you have a single parameter or multiple parameters; in the multi-parameter case, you just consider the parameter vector as a single object.

Let $\theta_t$ be the current parameter vector (at step $t$), $\theta'$ be a new candidate parameter vector drawn from the proposal distribution, and $D$ be the data. Calculate the ratio:

$$R_t = \frac{p(D \mid \theta') p(\theta')}{p(D \mid \theta_t) p(\theta_t)}$$

Accept $\theta'$ if $R_t \ge 1$, otherwise accept it with probability $R_t$.
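A random-walk Metropolis sketch of this rule (in log space, which makes both cases of the rule a single comparison; the function names and the symmetric Gaussian proposal are my choices, not prescribed by the problem):

```python
import numpy as np

def metropolis_hastings(log_post, theta0, n_steps, step=0.1, seed=0):
    """Random-walk Metropolis for a parameter *vector* theta.

    log_post(theta) must return log[p(D|theta) p(theta)] up to a constant.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    chain = np.empty((n_steps, theta.size))
    n_accept = 0
    for t in range(n_steps):
        # Propose a whole new vector at once (symmetric proposal,
        # so the proposal densities cancel in the ratio).
        prop = theta + step * rng.standard_normal(theta.size)
        lp_prop = log_post(prop)
        # log R_t = log p(D|theta') p(theta') - log p(D|theta_t) p(theta_t).
        # Comparing log(u) < log R_t, u ~ Uniform(0,1), covers both the
        # "R_t >= 1" case and the "accept with probability R_t" case.
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
            n_accept += 1
        chain[t] = theta
    return chain, n_accept / n_steps
```

For example, with a standard-normal target `log_post = lambda th: -0.5 * np.sum(th**2)` the chain samples all coordinates jointly, which is exactly the third option from the question.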

Classic Metropolis-Hastings can be slow to converge in high dimensions. If this is a problem, more advanced techniques like Hamiltonian Monte Carlo can be used.
