My aim relates to the project described here, but I've tried to make this question self-contained.
I'm trying to write the MCMC pseudocode for the following inference problem:
Given two observed independent outcomes $X$ and $Y$, what are the posterior densities of the probabilities $p_{x}$ and $p_{y}$ of obtaining $X$ and $Y$, assuming $X \sim \text{Bin}(N, p_{x})$ and $Y \sim \text{Bin}(N, p_{y})$? We do not know or observe $N$. We have priors for $N$, $p_{x}$, and $p_{y}$. We will assume that $N \sim \text{Poisson}(\lambda)$, and $p_{x}$ and $p_{y}$ will have either $\text{Beta}$ or $\text{Uniform}$ priors, chosen by the user. For the $\text{Beta}$ priors, the mean and variance will be specified. For the $\text{Uniform}$ priors, the upper and lower limits will be specified.
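For concreteness, the data-generating process I have in mind looks something like the Python snippet below; the numbers are made up purely for illustration and are not part of the actual problem.

```python
import random

lam = 50                      # mean of the Poisson prior on N
N_true = 40                   # latent trial count (in the model, N ~ Poisson(lam); unobserved)
px_true, py_true = 0.3, 0.7   # latent success probabilities

def rbinom(n, p):
    # One Binomial(n, p) draw as a sum of n Bernoulli trials.
    return sum(1 for _ in range(n) if random.random() < p)

X = rbinom(N_true, px_true)   # observed outcome, X ~ Bin(N, p_x)
Y = rbinom(N_true, py_true)   # observed outcome, Y ~ Bin(N, p_y)
```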
My background is obviously not statistics, and I'm having a hard time patching gaps in my knowledge from the voluminous MCMC literature. Rather than burden commenters with explaining a mountain of Bayesian inference to me, I'd appreciate it if people could point out the errors in my pseudocode and understanding, including instances in which I might be misusing terms. I'll focus my study on those areas. I welcome suggested reading that is specific (I've seen but not completely read some of the suggestions here).
I will be coding this by hand from scratch, except for simple maths functions.
Ultimately, I'll make sure there's complete and accurate pseudocode associated with this post.
Proposed pseudocode
1. Declare an initial value $\theta_{0} = (N, p_{x}, p_{y})$ and set $\theta_{\text{current}} = \theta_{0}$.
2. Calculate the posterior probability of $\theta_{\text{current}}$. First, calculate the log-likelihood $L_{\text{current}}$ of $\theta_{\text{current}}$. Here $L_{\text{current}} = L_{x} + L_{y}$, where $L_{x} = \log{N \choose X} + X \log p_{x} + (N - X)\log(1 - p_{x})$ and $L_{y}$ is analogous. Next, obtain the posterior by multiplying the likelihood $\exp(L_{\text{current}})$ by the prior, $\text{Prior}(\theta_{\text{current}})$. The prior of $\theta_{\text{current}}$ is the product of the priors of the individual parameters in $\theta$. In this case, if the user has chosen $\text{Beta}$ priors for $p_{x}$ and $p_{y}$, the respective priors are simply the $\text{Beta}$ pdf evaluated at $p_{x}$ or $p_{y}$. The prior for $N$ is the Poisson pmf evaluated at $N$. (A rough code sketch of this step appears after the list.)
3. Draw $\theta_{\text{candidate}}$ from the proposal distribution. What exactly is this distribution for each parameter in $\theta$? I've seen normal distributions used, but here $N$ and $p_{x}$ clearly must be drawn from different distributions.
4. Calculate the posterior of $\theta_{\text{candidate}}$ following step 2.
5. Evaluate whether to accept $\theta_{\text{candidate}}$ by comparing the posteriors, and update $\theta_{\text{current}}$ if justified. I will use the Metropolis algorithm (not shown for brevity).
6. Repeat steps 3-5 until convergence or a reasonable stopping point.
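To make step 2 concrete, here is a rough Python sketch of the posterior calculation as I currently understand it. The data values and Beta hyperparameters are placeholders (the Beta parameters would really be derived from the user-specified mean and variance), and the Beta normalising constants are dropped since they cancel in the Metropolis ratio. Please point out anything wrong with it.

```python
import math

# Placeholder observed data and hyperparameters (the real values come from the user).
X, Y = 12, 31          # observed outcomes
lam = 50.0             # mean of the Poisson prior on N
a_x, b_x = 2.0, 2.0    # Beta prior parameters for p_x
a_y, b_y = 2.0, 2.0    # Beta prior parameters for p_y

def log_binom(n, k):
    # log of the binomial coefficient C(n, k), via log-gamma
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log_likelihood(N, p_x, p_y):
    # L_current = L_x + L_y, each term a Binomial log-pmf at the observed count
    L_x = log_binom(N, X) + X * math.log(p_x) + (N - X) * math.log(1 - p_x)
    L_y = log_binom(N, Y) + Y * math.log(p_y) + (N - Y) * math.log(1 - p_y)
    return L_x + L_y

def log_prior(N, p_x, p_y):
    # Independent priors: Poisson(lam) pmf for N, Beta pdfs for p_x and p_y,
    # all on the log scale (Beta normalising constants omitted).
    lp_N = N * math.log(lam) - lam - math.lgamma(N + 1)
    lp_px = (a_x - 1) * math.log(p_x) + (b_x - 1) * math.log(1 - p_x)
    lp_py = (a_y - 1) * math.log(p_y) + (b_y - 1) * math.log(1 - p_y)
    return lp_N + lp_px + lp_py

def log_posterior(theta):
    # Adding on the log scale is the same as multiplying exp(L_current) by the prior.
    N, p_x, p_y = theta
    return log_likelihood(N, p_x, p_y) + log_prior(N, p_x, p_y)
```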
(tl;dr) Major questions
- In step 2, am I missing shortcuts related to the conjugate priors? I still don't really understand what they're good for or if I'm calculating the prior correctly.
- How are jump distributions determined?
Thanks in advance.
Best Answer
If you're using Metropolis-Hastings, then you don't have to worry too much about conjugacy of priors or finely tuned proposal jumps.
Conjugate priors are useful for avoiding or short-circuiting numerical computation. They shouldn't be necessary for running Metropolis-Hastings on a simple model.
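To see what conjugacy would buy you: if $N$ were known and you put a $\text{Beta}(a, b)$ prior on $p_{x}$, the posterior would be available in closed form,
$$p_{x} \mid X, N \sim \text{Beta}(a + X,\ b + N - X),$$
so no sampling would be needed for that parameter. Since $N$ is unknown here, you still need MCMC for the joint posterior, and plain Metropolis handles everything numerically.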
For the proposal distribution, just try a random walk. If a proposal is outside the support of $\theta$, then reject it. (Think of it as having a prior value of zero.)
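For instance, using the `log_posterior` (and the placeholder `X`, `Y`) sketched in the question, a random-walk Metropolis loop could look like the sketch below; the step sizes and iteration count are arbitrary tuning choices, not recommendations.

```python
import math
import random

def propose(theta):
    N, p_x, p_y = theta
    # Symmetric random walk: a small integer step for N, Gaussian steps for p_x and p_y.
    return (N + random.randint(-2, 2),
            p_x + random.gauss(0.0, 0.05),
            p_y + random.gauss(0.0, 0.05))

def in_support(theta):
    N, p_x, p_y = theta
    # N must be at least max(X, Y); the probabilities must lie in (0, 1).
    return N >= max(X, Y) and 0 < p_x < 1 and 0 < p_y < 1

theta = (50, 0.5, 0.5)            # initial value, theta_0
lp = log_posterior(theta)
samples = []
for _ in range(20000):
    cand = propose(theta)
    # Outside the support: treat the prior (hence the posterior) as zero, so never accept.
    if in_support(cand):
        lp_cand = log_posterior(cand)
        # Metropolis: accept with probability min(1, posterior(cand) / posterior(current)).
        if random.random() < math.exp(min(0.0, lp_cand - lp)):
            theta, lp = cand, lp_cand
    samples.append(theta)
```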
Also, I think there's a minor typo: you add the log prior to the log-likelihood, or equivalently multiply the prior by the likelihood.