Bayesian – Can the Proposal Distribution Be the Prior in Rejection Sampling?

bayesian, markov-chain-montecarlo, markov-process, rejection sampling, sampling

Suppose I have a target distribution $\pi(\theta|x) \propto P(x|\theta)P_{\theta}(\theta)$ (i.e., the unnormalized posterior). I would like to use rejection sampling to obtain many samples $\{\theta_i\}$ from the posterior. Rejection sampling requires a proposal distribution, $g(\theta)$, such that $c \cdot g(\theta) \geq \pi(\theta|x), \ \forall \ \theta$ (so $c\cdot g(\theta)$ envelops the unnormalized posterior).

My question is: Can I get samples from $\pi(\theta|x)$ by setting $g(\theta) = P_{\theta}(\theta)$? In other words, can my proposal distribution be my prior (times a constant), which I then compare to the likelihood $P(x|\theta)$? My proposed algorithm is as follows:

  1. Sample $\theta_i$ from $P_\theta(\theta)$
  2. Sample $U_i$ from $\text{Unif}(0,1)$
  3. If $U_i \cdot (C \cdot P_\theta(\theta_i)) \leq P(x|\theta_i)$, accept $\theta_i$. Otherwise reject.
  4. Repeat many times

Will the resulting samples follow the posterior $\pi(\theta|x)$? Thank you!

Best Answer

I will use the alternative notation $\pi_0$ for the prior, since this is less confusing.

If you want to obtain samples from the posterior, then the posterior must be your target distribution and the prior your proposal (sampling) density. This requires that there exists a finite constant $C > 0$ such that $\pi(\theta|x) \leqslant C \pi_0(\theta)$ for all $\theta \in \Theta$. Assuming this holds, your acceptance indicator should be:

$$\mathscr{A}_i(\theta) \equiv \mathbb{I} \Bigg( U_i \leqslant \frac{\pi(\theta|x)}{C \cdot \pi_0(\theta)} \Bigg).$$
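
To see what this indicator asks for in practice, it may help to expand the posterior with Bayes' rule. The following is only a sketch: the symbols $L(\theta) = P(x|\theta)$ for the likelihood, $m(x) = \int L(\theta) \pi_0(\theta) \ d\theta$ for the marginal likelihood, and $M = \sup_\theta L(\theta)$ are introduced here for illustration, and the likelihood is assumed to be bounded so that $M < \infty$. For any $\theta$ with $\pi_0(\theta) > 0$:

$$\begin{align} \frac{\pi(\theta|x)}{C \cdot \pi_0(\theta)} &= \frac{L(\theta) \pi_0(\theta) / m(x)}{C \cdot \pi_0(\theta)} \\[6pt] &= \frac{L(\theta)}{C \cdot m(x)} \\[6pt] &= \frac{L(\theta)}{M} \quad \text{when } C = M / m(x). \end{align}$$

The prior cancels, so with the tightest constant the test reduces to $U_i \leqslant L(\theta_i)/M$. Under these assumptions the comparison involves only the likelihood and its maximum, and the normalizing constant $m(x)$ never has to be computed.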

Using this acceptance event we get the conditional distribution:

$$\begin{align} p(\theta_i | \mathscr{A}_i) &= \frac{p(\theta_i, \mathscr{A}_i(\theta_i)=1)}{\mathbb{P}(\mathscr{A}_i(\theta_i)=1)} \\[6pt] &= \frac{\mathbb{P}(U_i \cdot C \cdot \pi_0(\theta_i) \leqslant \pi(\theta_i|x)) \cdot \pi_0(\theta_i)}{\int \mathbb{P}(U_i \cdot C \cdot \pi_0(\theta) \leqslant \pi(\theta|x)) \cdot \pi_0(\theta) \ d\theta } \\[6pt] &= \frac{\pi(\theta_i|x)/ C}{\int \pi(\theta|x)/C \ d\theta } \\[6pt] &= \frac{\pi(\theta_i|x)}{\int \pi(\theta|x) \ d\theta } \\[6pt] &= \pi(\theta_i|x), \\[6pt] \end{align}$$

which is the desired target distribution. So long as the required bound shown above holds, rejection sampling with the prior as the proposal yields posterior samples. However, even in this case, crude rejection sampling is often quite inefficient compared to standard MCMC methods (e.g., Metropolis-Hastings): the overall acceptance probability is $1/C$ (the denominator in the derivation above), which becomes small when the posterior is much more concentrated than the prior.
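
As an illustration of the method, here is a minimal Python sketch on a toy problem that is not part of the original question: a Beta(2, 2) prior with a binomial likelihood ($k$ successes in $n$ trials). The function name `rejection_sample_from_prior` and all parameter values are hypothetical choices. The example is used only because the likelihood is bounded (its maximum is at $\theta = k/n$) and the exact posterior, Beta($a + k$, $b + n - k$), is known, so the output can be checked.

```python
import numpy as np


def rejection_sample_from_prior(n_draws, k, n, a=2.0, b=2.0, seed=0):
    """Rejection sampling with the prior as the proposal (toy example).

    Assumed model (illustrative only): theta ~ Beta(a, b) and
    k | theta ~ Binomial(n, theta).
    """
    rng = np.random.default_rng(seed)

    def likelihood(t):
        # Binomial likelihood up to a constant factor; the constant
        # cancels in the ratio likelihood(theta) / M below.
        return t**k * (1.0 - t) ** (n - k)

    # The likelihood is maximized at the MLE theta = k / n, so M bounds it.
    M = likelihood(k / n)

    theta = rng.beta(a, b, size=n_draws)   # step 1: propose from the prior
    u = rng.uniform(size=n_draws)          # step 2: uniform draws
    keep = u * M <= likelihood(theta)      # step 3: accept if U <= L(theta) / M
    return theta[keep]


samples = rejection_sample_from_prior(200_000, k=7, n=10)
exact_posterior_mean = (2.0 + 7) / (2.0 + 2.0 + 10)  # mean of Beta(9, 5)
print("accepted draws:", samples.size)
print("sample mean:", samples.mean(), "exact mean:", exact_posterior_mean)
```

The fraction of accepted draws estimates the acceptance probability, so running the sketch with a larger $n$ (a more concentrated posterior) gives a rough feel for how quickly the efficiency degrades.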
