Solved – Metropolis-Hastings Algorithm within Gibbs Sampling

Tags: gibbs, markov-chain-montecarlo, metropolis-hastings, monte-carlo, self-study

I have this $f$ function below.

$$
f(x_1,x_2)\propto \left(\dfrac{x_1}{x_2}\right)\left(\dfrac{\alpha}{x_2}\right)^{x_1-1}\exp\left\{-\left(\dfrac{\alpha}{x_2}\right)^{x_1} \right\}\mathbb{I}_{\mathbb{R}^*_+}(x_1,x_2)
$$
where $\alpha > 0$.

I would like to estimate expected values and variances of $x_1$ and $x_2$ by using Gibbs Sampling or Metropolis-Hastings algorithms.

First of all, I tried to run a plain Metropolis-Hastings algorithm; however, finding proper proposal distributions for both random variables turned out to be hard. I tried some known distributions such as the Gamma, Weibull, and Exponential, since they match the support of $x_1$ and $x_2$, but I did not get any good results: the trace plots of the Markov chains I obtained looked very poor.

Another solution I tried was Gibbs sampling. I set out to find the full conditionals, i.e., $f(x_1|x_2)$ and $f(x_2|x_1)$, which are the distributions of $x_1$ given $x_2$ and of $x_2$ given $x_1$, respectively.

I found $f(x_1|x_2)$ as follows:

$$
f(x_1|x_2)\propto x_1 \beta^{x_1}\exp\left\{-\left(\beta \right)^{x_1} \right\}
$$
where $\beta=\dfrac{\alpha}{x_2}$ and $0<x_1<\infty$.
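For completeness, this conditional follows from the joint density by keeping only the factors that involve $x_1$:

$$
f(x_1|x_2)\propto \dfrac{x_1}{x_2}\,\beta^{x_1-1}\exp\left\{-\beta^{x_1}\right\}
=\dfrac{1}{x_2\,\beta}\,x_1\,\beta^{x_1}\exp\left\{-\beta^{x_1}\right\}
\propto x_1\,\beta^{x_1}\exp\left\{-\beta^{x_1}\right\},
$$

since $x_2$ (and hence $\beta$) is held fixed in this conditional.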

I tried to reduce this to a standard distribution via a transformation, so I made the change of variable $\left(\beta \right)^{x_1}=y$. In the end I got the function below.

$$
g(y|x_2)\propto ye^{-y}e^{-e^{-y}}
$$

The last two factors here resemble a Gumbel distribution with location parameter $\mu=0$ and scale parameter $B=1$. However, the Gumbel distribution has support $(-\infty, +\infty)$, which was not the case here, because $y$ has support $(-\infty, 0)$. So I thought about truncating the Gumbel distribution and then using the acceptance-rejection method to generate random $y$'s. That did not work out either, because with that method the target function, which involves $y$ here, must be bounded within $(0,1)$ so it can be compared to a uniform random number. So I left that aside and thought I could instead run MH for $x_1$ within Gibbs sampling. For that, I once again truncated the Gumbel distribution, but this time to the support $(0, \infty)$: since I could derive a roughly truncated-Gumbel form for $f(x_1|x_2)$, I assumed a truncated Gumbel distribution might be a good proposal distribution for $x_1$ overall.

I obtained the probability density function (pdf) of the truncated Gumbel distribution as given below:

$$
f(x)=\frac{\dfrac{1}{B}e^{-{\dfrac{(x-\mu)}{B}+e^{-\dfrac{(x-\mu)}{B}}}}}{1-e^{-e^{\frac{\mu}{B}}}}
$$

and its cumulative distribution function (cdf) is given as follows:

$$
F(x)=\dfrac{e^{-{e^{-\dfrac{(x-\mu)}{B}}}}-e^{-e^{\frac{\mu}{B}}}}{1-e^{-e^{\frac{\mu}{B}}}}
$$

Since I need to generate new random variates for the MH algorithm, I need the inverse of this cdf: I will draw a new $z$ from $\mathrm{Uniform}(0,1)$ and plug it into the inverse cdf to obtain a random variate from the truncated Gumbel distribution. The inverse cdf is given below:

$$
x=-B\,\ln\left\{-\ln\left[z\left( 1-e^{-e^{\frac{\mu}{B}}} \right) + e^{-e^{\frac{\mu}{B}}} \right] \right\}+\mu
$$
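As a sanity check, the inverse cdf above can be implemented and verified against the cdf. This is a sketch in Python rather than MATLAB, and the function names are mine:

```python
import math

MU, B = 0.0, 1.0  # location and scale of the truncated Gumbel (the values used below)
C = math.exp(-math.exp(MU / B))  # Gumbel cdf at 0, the truncation constant

def trunc_gumbel_cdf(x):
    """cdf of the Gumbel(MU, B) distribution truncated to (0, inf)."""
    return (math.exp(-math.exp(-(x - MU) / B)) - C) / (1.0 - C)

def trunc_gumbel_inv_cdf(z):
    """Inverse cdf: maps z in (0,1) to a positive truncated-Gumbel variate."""
    return -B * math.log(-math.log(z * (1.0 - C) + C)) + MU

# Round trip: F(F^{-1}(z)) recovers z, and every variate lands in (0, inf).
for z in (0.1, 0.5, 0.9):
    x = trunc_gumbel_inv_cdf(z)
    assert x > 0
    assert abs(trunc_gumbel_cdf(x) - z) < 1e-12
```

Feeding uniform draws through `trunc_gumbel_inv_cdf` then reproduces the proposal draws used in the MATLAB loop below.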

Now it is the turn of $f(x_2|x_1)$, i.e., $x_2$ given $x_1$.

$$
f(x_2|x_1)\propto\dfrac{1}{{x_2}^{x_1}}\exp\left\{-\dfrac{\gamma}{{x_2}^{x_1}} \right\}
$$
where $\gamma=\alpha^{x_1}$ and $0<x_2<\infty$. I then made the change of variable $\dfrac{1}{{x_2}^{x_1}}=t$. In the end I got the function below.

$$
g(t|x_1)\propto \exp\{-\gamma ~ t \}~t^{-\dfrac{1}{x_1}}
$$
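Writing out the Jacobian for this transformation: with $t={x_2}^{-x_1}$ we have $x_2=t^{-1/x_1}$, so

$$
\left|\dfrac{dx_2}{dt}\right|=\dfrac{1}{x_1}\,t^{-\frac{1}{x_1}-1},
\qquad
g(t|x_1)\propto t\,e^{-\gamma t}\cdot\dfrac{1}{x_1}\,t^{-\frac{1}{x_1}-1}
\propto e^{-\gamma t}\,t^{-\frac{1}{x_1}},
$$

where $t\,e^{-\gamma t}$ is just $f(x_2|x_1)$ evaluated at $x_2=t^{-1/x_1}$.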

At first, I thought this was a Gamma distribution with shape parameter $a=1-\dfrac{1}{x_1}$ and rate parameter $\gamma=\alpha^{x_1}$ (i.e., scale $b=1/\gamma$). It certainly looks like a Gamma density, since $x_1$ is a constant here, being given/known. But a problem appears when this is applied. When I do Metropolis-Hastings within Gibbs, I update $x_1$, plug it into the conditional $g(t|x_1)$ to draw random variates of $t$, and then invert the transformation $\dfrac{1}{{x_2}^{x_1}}=t$ to recover the $x_2$'s. However, the parameters of the Gamma distribution create a big problem.

As you know, the Gamma distribution has two parameters, both of which must be greater than zero. Here the rate $\gamma=\alpha^{x_1}$ is not a big deal: since $0<x_1<\infty$, it is always positive. HOWEVER, to get a positive shape parameter $a>0$ I would need $x_1>1$, which I cannot guarantee, because $x_1$ ranges over all of $(0,\infty)$.
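The failure is easy to see numerically. A Python sketch (the function name is mine; MATLAB's gamrnd instead returns NaN for such an input):

```python
import numpy as np

def gamma_shape(x1):
    """Shape parameter a = 1 - 1/x1 of the candidate Gamma distribution for t."""
    return 1.0 - 1.0 / x1

# The shape is non-positive whenever x1 <= 1, which happens freely on (0, inf).
assert gamma_shape(0.5) == -1.0
assert gamma_shape(1.0) == 0.0
assert gamma_shape(2.0) == 0.5

# NumPy refuses a negative shape outright (MATLAB's gamrnd silently gives NaN).
try:
    np.random.gamma(gamma_shape(0.5), 1.0)
    raised = False
except ValueError:
    raised = True
assert raised
```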

I realized this problem long after I wrote and ran the code, which was giving me "NaN" for $x_2$ because of the problem described above. I did what I said: I used the truncated Gumbel distribution as a proposal distribution for $x_1$ and drew from the Gamma distribution for the full conditional of $x_2$.

My initial values are 1 for both $x_1$ and $x_2$. I calculate an acceptance ratio for $x_1$, since I run Metropolis-Hastings on it, and at each iteration I plug the updated $x_1$ into the parameters of the Gamma distribution for $x_2$.

M=1000;
x=zeros(M+1,2);
t=zeros(M,1); %I create an Mx1 vector for inverting the transformation of x_2
alpha=2; % arbitrarily picked alpha value for my function
mu=0; %the location parameter of the truncated Gumbel distribution
beta=1; %the scale parameter of the truncated Gumbel distribution

fx=@(x1,x2)((x1/x2)*((alpha/x2)^(x1-1))*exp(-1*((alpha/x2)^x1))); %the target function I have
gx=@(x1)(((1/beta)*exp(-(((x1-mu)/beta)+exp(-((x1-mu)/beta)))))/(1-exp(-exp(mu/beta)))); %the pdf of the truncated Gumbel distribution

alpha21=@(x1)((-1/x1)+1); %the shape parameter of Gamma distribution for x2
beta21=@(x1)(alpha^(-x1)); %the scale parameter of the Gamma distribution for x2 (gamrnd takes shape and scale, so scale = 1/rate = alpha^(-x1))

acceptance_ratio=0;

t(1,1)=1;
x(1,1)=1;
x(1,2)=1;

for i=2:(M+1)
        old=fx(x(i-1,1),x(i-1,2));
        oldprob=gx(x(i-1,1)); %finding probabilities of x1's from truncated Gumbel distribution
        ex=x(i-1,1);

        z=unifrnd(0,1);
        cand=(-beta)*log(-log(z*(1-exp(-exp(mu/beta)))+exp(-exp(mu/beta))))+mu; %inverse cdf of the truncated Gumbel distribution (a fresh name, so the t vector is not overwritten)

        x(i,1)=cand;
        new=fx(x(i,1),x(i-1,2));
        newprob=gx(x(i,1));
        acceptance_prob=min(1,(new*oldprob)/(old*newprob)); %independence-sampler ratio: the proposal density of the candidate goes in the denominator
        if unifrnd(0,1)>acceptance_prob
            x(i,1)=ex;

        else
            acceptance_ratio=acceptance_ratio+1;
        end

        t(i,1)=gamrnd(alpha21(x(i,1)),beta21(x(i,1))); %updating the parameters of Gamma distribution in each iteration
        x(i,2)= t(i,1)^(-1/x(i,1));
end
mean(x)
var(x)
AR=acceptance_ratio/M
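For what it is worth, here is a rough Python translation of the scheme above (NumPy in place of MATLAB; the names and the explicit shape check are mine). It stops at the first iteration where the Gamma shape $1-1/x_1$ becomes non-positive, which is exactly where gamrnd starts producing NaN:

```python
import math
import numpy as np

rng = np.random.default_rng(0)

alpha, mu, B = 2.0, 0.0, 1.0    # same constants as the MATLAB code
C = math.exp(-math.exp(mu / B))  # Gumbel cdf at 0, the truncation constant

def f(x1, x2):
    """Unnormalised target density."""
    return (x1 / x2) * (alpha / x2) ** (x1 - 1) * math.exp(-((alpha / x2) ** x1))

def g(x1):
    """pdf of the Gumbel(mu, B) proposal truncated to (0, inf)."""
    u = (x1 - mu) / B
    return (1.0 / B) * math.exp(-(u + math.exp(-u))) / (1.0 - C)

def g_draw():
    """Inverse-cdf draw from the truncated Gumbel proposal."""
    z = rng.uniform()
    return -B * math.log(-math.log(z * (1.0 - C) + C)) + mu

M = 1000
x1, x2 = 1.0, 1.0
first_bad = None
for i in range(1, M + 1):
    # Independence MH step for x1 with the truncated Gumbel proposal.
    cand = g_draw()
    ratio = (f(cand, x2) * g(x1)) / (f(x1, x2) * g(cand))
    if rng.uniform() < min(1.0, ratio):
        x1 = cand
    # Gibbs step for x2 via t = x2 ** (-x1), nominally Gamma(1 - 1/x1, rate=alpha**x1).
    a = 1.0 - 1.0 / x1
    if a <= 0:                 # this is where MATLAB's gamrnd would return NaN
        first_bad = i
        break
    t = rng.gamma(a, alpha ** (-x1))  # NumPy's gamma takes (shape, scale)
    x2 = t ** (-1.0 / x1)

# first_bad is almost surely set within a handful of iterations,
# reproducing the NaN breakdown described above.
```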

I think my overall approach is right, and I know what my problem is, but I am stuck on it and cannot find a solution. Would you care to help me?

Best Answer

Are you sure the joint density $$f(x_1,x_2)=\left(\dfrac{x_1}{x_2}\right)\left(\dfrac{\alpha}{x_2}\right)^{x_1-1}\exp\left\{-\left(\dfrac{\alpha}{x_2}\right)^{x_1} \right\}\mathbb{I}_{\mathbb{R}^*_+}(x_1,x_2)$$ is integrable?


When I consider the conditional$$f(x_2|x_1)\propto\dfrac{1}{{x_2}^{x_1}}\exp\left\{-\dfrac{\gamma}{{x_2}^{x_1}} \right\}$$ it should be a proper probability density if the joint above is a proper density. However, in the case $x_1=1$ it simplifies to $$f(x_2|x_1=1)\propto\dfrac{1}{{x_2}}\exp\left\{-\dfrac{\gamma}{{x_2}} \right\}$$ which does not integrate over $(0,\infty)$: the change of variable $\eta=1/x_2$ leads to the density $$f(\eta|x_1=1)\propto\eta\exp\left\{-\gamma\eta\right\}\eta^{-2}=\eta^{-1}\exp\left\{-\gamma\eta\right\},$$ which behaves like $\eta^{-1}$ as $\eta\to 0$ and therefore has a divergent integral near zero. Your change of variable $t={x_2}^{-x_1}$ shows the same issue occurs whenever $x_1\le 1$.
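The divergence is easy to check numerically: integrating $\eta^{-1}e^{-\gamma\eta}$ from a lower cutoff $\varepsilon$ down to $1$ grows like $\ln(1/\varepsilon)$ as $\varepsilon\to 0$. A Python sketch, with the arbitrary choice $\gamma=1$:

```python
import numpy as np

def tail_integral(eps, gamma=1.0, n=200_000):
    """Numerically integrate eta**(-1) * exp(-gamma * eta) over (eps, 1)."""
    # A log-spaced grid resolves the 1/eta blow-up near the lower cutoff.
    eta = np.logspace(np.log10(eps), 0.0, n)
    y = np.exp(-gamma * eta) / eta
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(eta)))  # trapezoid rule

# Lowering the cutoff from 1e-3 to 1e-6 adds roughly ln(1000) ~ 6.91 to the
# integral: it grows without bound as eps -> 0, so the density is improper.
i3 = tail_integral(1e-3)
i6 = tail_integral(1e-6)
```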

Note: when you make the change of variable $$x_1\longrightarrow\left(\beta \right)^{x_1}=y$$ the new variate $y$ is supported on $(0,1)$ or $(1,\infty)$, depending on whether or not $\beta<1$. But this is a side issue relative to the main problem.