Solved – Sampling from Gamma Distribution using the Rejection Method

accept-rejectgamma distributionrejection sampling

I'm having some issues working through this practice problem. I have worked through the first portion of it, and I have the solution, but I don't understand how/why the solution does two things at the end.

Problem:

Use the rejection method to sample from the gamma density $\Gamma(\lambda, t)$ where $t (\geq 1)$ may not be assumed integral. [Hint: You might want to start with an exponential random variable with parameter $\frac{1}{t}$.]

What I've done

Let $Z\sim \Gamma(1, t)$. This can be done without loss of generality because, for such a Z, $\frac{Z}{\lambda}\sim \Gamma(\lambda, t)$.

Let $V\sim Uniform(0,1)$. Using the hint, for some $y\in (0,1)$,

\begin{align*}
F(x) &= 1-e^{\frac{-x}{t}}\\
y &= 1-e^{\frac{-x}{t}}\\
x &= F^{-1}(y) = -t ln(1-y)\\
X &= -t~ln(1-V)\\
X &= -t~ln(V)
\end{align*}

(Since if $1-V\sim Uniform(0,1)$, then $V\sim Uniform(0,1)$.)

Now, we need to find some $c$ such that

\begin{align*}
f_Z(x) \leq c f_X(x)
\end{align*}

where $f_Z(x) = \frac{1}{\Gamma(t)}x^{t-1}e^{-x}$ and $f_X(x)= \frac{1}{t} e^{\frac{-x}{t}}$.

Thus, we need to find a $c$ that satisfies:

\begin{align*}
\frac{f_Z(x)}{f_X(x)} &\leq c\\
\frac{\frac{1}{\Gamma(t)}x^{t-1}e^{-x}}{\frac{1}{t} e^{\frac{-x}{t}}} &\leq c\\
\frac{t e^{x(1/(t-1))}x^{t-1}}{\Gamma(t)} &\leq c
\end{align*}

By taking the derivative w.r.t. X, I verified that, for $t\geq 1$ and $x>0$, the LHS is maximized by $x=t$ (I'll omit this work, but I found that the derivative had a root at $x=-t^2+2t-1$, so over the ranges of $t$ and $x$, $t>x$, thus the LHS is strictly smaller than its value when $x=t$).

Thus, the smallest value of $c$ such that $LHS\leq RHS$ is:

\begin{align*}
c &= \frac{1}{\Gamma(t)}t^{t} e^{-(t-1)}
\end{align*}

What I don't understand

The solution uses the following two things to continue the problem that I don't understand.

(1)

"Conditional on the event $A$:

\begin{align*}
U &\leq \frac{X^{t-1}e^{-t}}{\Gamma(t)}t~e^{\frac{-X}{t}}
\end{align*}

where $U\sim Uniform(0,1)$, X has the required gamma distribution."

I don't understand where this definition of $A$ comes from.

I can see that $\frac{X^{t-1}e^{-t}}{\Gamma(t)}$ is almost $f_Z(x)$, except that $e^{-x}$ has been replaced by $e^{-t}$ and the rest is almost $f_X(x)$, except $\frac{1}{t}$ has been replaced with $t$.

But, even if that were the case, $A$ would be showing that $U\leq$ the joint density of the two component functions (and, again, I know that this is not the case, but it is the closest explanation I can come up with), not the $\frac{f_Z(x)}{c~f_X(x)}$ that I believe I need to continue the problem. I cannot manipulate $\frac{f_Z(x)}{c~f_X(x)}$ to create this $A$.

I understand that the next step likely involves the fact that $F_Z(z;1, t)$ will take the form $\frac{\gamma(t, z)}{\Gamma(t)}$, and $\gamma(t, z) = \int_0^z x^{t-1}e^{-x} dx$. I just don't know what to do with this information, given that I don't know where this $A$ came from or how to use it.

(2)

"We note that $A = \{log(U)\leq(n-1)(log(X/n)-(X/n)+1) \}$."

Where did this come from and why is it worth noting?

Best Answer

The rejection method can be explained either by an envelope argument, namely that pairs $(X,U)$ that are generated uniformly on the set $$\mathfrak S_{cf_X}=\{(x,u);\ 0\le u\le c f_X(x)\}$$ [called the subgraph of $cf_X$] are such that

$X\sim f_X(x)$ (we call this result the fundamental lemma of simulation in our book);
if $U<f_Z(X)$ then $X\sim f_Z(x)$

This can be seen on the picture below where the hollow circles are $(X,U)$'s that are generated uniformly over a square that contains the graph of the target $f_Z$ (a Gamma truncated to (0,1) in this illustration) and the filled circles are those that fall below $f_Z$. They are indeed uniformly distributed over $\mathfrak S_{f_Z}$.

It can also be explained by a stopping rule reasoning: if one runs a sequence of iid simulations from $f_X$, $X_1,X_2,\ldots$ and an associated auxiliary sequence of uniforms $U_1,U_2,\ldots$ until $$U_i<f_Z(X_i)/cf_X(X_i)$$ happening for the (random) index value $N$, then $$X_N \sim f_Z(x)$$ since the density of $X_N$ is proportional to$$f_X(x)\times\mathbb P(U\le f_Z(X)/cf_X(X)\mid X=x) \propto f_Z(x)$$

Note: I do not understand the part of the question that mentions "The solution" as I cannot tell where this solution is coming from.

Related Solutions

Solved – Accept-reject algorithm for Beta(1,$\beta$)

I will try to explain how rejection sampling works "in general". If I am clear enough, you should then be able to answer your own question. I won't give proofs but intuitive (or I hope so) "facts".

Fact 1 If $X$ is drawn in a distribution of density $g(x)$ and $U$ in a uniform $U(0,1)$, then the point $(X, U\times M\times g(X))$ is drawn uniformly in the subgraph of $M g(x)$. Illustration with the normal density and $M = 5$:

> x <- rnorm(1e4)
> u <- runif(1e4)
> y <- u*5*dnorm(x)
> plot(x, y, pch=".")

plot produced by above R commands

This fact should be intuitive from the definition of the density: the height $g(x)$ of the subgraph above $x$ is the "probability" to fall in $x$.

Fact 2 If you can draw uniformly points $(X,Y)$ in the subgraph of $f(x)$ (a density function), then $X$ is distributed with density $f(x)$.

Fact 3 If $f(x) \le M g(x)$ and you draw points uniformly in the subgraph of $M g(x)$, the points which fall in the subgraph of $f(x)$ are uniformly distributed in it.

Illustration with the tent function $f(x) = \begin{cases} 0 & \text{if } |x| \ge 1 \\ x+1 & \text{if } -1 < x \le 0 \\ 1-x & \text{if } 0 < x < 1 \end{cases}$

 > f <- function(x) ifelse(-1<x & x<=0, x+1, ifelse(0<x & x<1, 1-x, 0))
 > plot(x, y, pch="x", col=ifelse(y < f(x), "black", "red") )

cf R code

All these facts together implies that if you keep only the points $X$ such that $U \times M \times G(X) \le f(X)$, these points are distributed with density $f$.

> x1 <- x[ y < f(x) ]
> length(x1)
[1] 2068
> hist(x1)

histogram

Of course this example is only intended for illustration, the proposed algorithm would be a bad way to sample in the tent distribution.

Note that one nice feature of importance sampling, that you can deduce from the above argumentation, is that if you know $f(x)$ only up to some unknown multiplicative constant, you can use the algorithm as well.

Solved – Convergence from Gamma to Normal Distribution

This problem is a direct application of Slutsky's theorem:

If $X_n$ converges in distribution to a random element $X$ and if $Y_n$ converges in probability to a constant $c$, then $X_n/Y_n$ converges in distribution to $X/c$.

In your setting, the theorem numerator "$X_n$" is replaced with $\sqrt{n}(\bar{X}_n-\mu)$, since, by virtue of the Central Limit Theorem, $$\sqrt{n}(\bar{X}_n-\mu)\stackrel{\text{dist}}{\stackrel{n\to\infty}{\longrightarrow}}\mathcal{N}(0,\text{var}(X_i))$$ and here $\text{var}(X_i)=\mu$, hence $$\sqrt{n}(\bar{X}_n-\mu)\stackrel{\text{dist}}{\stackrel{n\to\infty}{\longrightarrow}}\mathcal{N}(0,\mu)\,.$$ And the theorem denominator "$Y_n$" is replaced with $\bar{X}_n$ which converges in probability to its expectation $\mu$ by the weak law of large numbers. Hence, $$\dfrac{\sqrt{n}(\bar{X}_n-\mu)}{\bar{X}_n}\stackrel{\text{dist}}{\stackrel{n\to\infty}{\longrightarrow}}\mathcal{N}(0/\mu,\mu/\mu^2) $$i.e. $$\dfrac{\sqrt{n}(\bar{X}_n-\mu)}{\bar{X}_n}\stackrel{\text{dist}}{\stackrel{n\to\infty}{\longrightarrow}}\mathcal{N}(0,1/\mu)$$ Q.E.D.