Solved – Practical implementation of posterior predictive distribution

bayesian, markov-chain-montecarlo, posterior, sampling

I'm trying to compute the posterior predictive distribution for a problem from the samples of a Metropolis-Hastings run.
As I understand it, the formula is
\begin{equation} p(y^{new} | y) = \int_u p_{lik}(y^{new} | u) p_{post}(u | y) du, \end{equation}
where $p_{post}$ is the posterior p.d.f. and $p_{lik}$ the likelihood.

My first question is: do I really need the actual p.d.f. of the posterior here? In my problem it is not known analytically, i.e. I only have the proportionality relation
$ p_{post}(u | y) \propto p_{lik}(y | u) \, p_{prior}(u)$. That is, I only have the unnormalized posterior. Can I just plug this into the above formula for the predictive?

Also, I'm thinking that, given the MCMC samples, let's say $(u_i),\ i=1, \ldots, n$, the integral should somehow be solved by Monte Carlo integration,
i.e.
\begin{align} p(y^{new} | y) &= \int_u p_{lik}(y^{new} | u) \, p_{post}(u | y) \, du \\
&\approx \frac{1}{n} \sum_{i=1}^n p_{lik}(y^{new} | u_i) \, p_{post}(u_i | y). \end{align}
Is that right? It seems somehow redundant that I already have samples from the posterior, $u_i$, and then evaluate the posterior p.d.f. again…

And, finally, I'm wondering what exactly $y^{new}$ is. My understanding is that I choose a set of values $y^{new}_i$ within some reasonable interval where I would expect new data points to lie, and then for each $i$ compute $p(y_i^{new} | y)$, which yields a number telling me how likely $y_i^{new}$ is. I'm not sure whether I've understood the concept of 'sampling from the predictive distribution' correctly, though…

Best Answer

You are actually only one concept away from understanding it.

The samples from the M-H run are, ideally, random draws from the posterior distribution. Consequently, they can be used in place of (as an approximation to) the actual posterior distribution for purposes such as integration, in which case the technique is known as Monte Carlo integration.

Given that they are random draws from the posterior, you don't need $p_{post}(u_i|y)$ in your calculation: the probability of each draw $u_i$ is now $1/n$, where $n$ is the number of M-H samples. Hence $p(y^{new}|y)$ is approximated by
$$\frac{1}{n} \sum_{i=1}^n p_{lik}(y^{new} | u_i).$$
The posterior $p_{post}(u_i|y)$ was implicitly taken into account when the M-H samples were drawn, so it does not need to appear explicitly in the calculation of $p(y^{new}|y)$.
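To make this concrete, here is a minimal sketch of the Monte Carlo estimate. It assumes, purely for illustration, a Gaussian likelihood $y \sim N(u, \sigma^2)$ with known $\sigma$, and uses synthetic draws in place of your real M-H output (`u_samples`, `sigma`, and `predictive_density` are all names introduced here, not from the original post); substitute your own likelihood and samples:

```python
import numpy as np

def predictive_density(y_new, u_samples, sigma=1.0):
    """Monte Carlo estimate of p(y_new | y) ~= (1/n) * sum_i p_lik(y_new | u_i).

    Assumes a Gaussian likelihood y ~ N(u, sigma^2) for illustration only;
    swap in your model's likelihood. u_samples are the M-H posterior draws.
    """
    u = np.asarray(u_samples)
    # Likelihood of y_new evaluated under each posterior draw u_i
    lik = np.exp(-0.5 * ((y_new - u) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    # Each draw carries weight 1/n, so the estimate is a plain average
    return lik.mean()

# Illustrative stand-in for real Metropolis-Hastings output
rng = np.random.default_rng(0)
u_samples = rng.normal(loc=2.0, scale=0.5, size=5000)

# Evaluate the predictive density on a grid of candidate y_new values
grid = np.linspace(-1.0, 5.0, 200)
pred = np.array([predictive_density(y, u_samples) for y in grid])
```

Note that the posterior density never appears: the M-H draws already encode it.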

$y^{new}$ is just whatever value or values you want to calculate a posterior predictive probability (or density) for, nothing more than that. As for 'sampling from the predictive distribution': rather than evaluating the density on a grid, you draw one $y^{new}_i$ from the likelihood $p_{lik}(y | u_i)$ for each posterior draw $u_i$; the resulting collection is a sample from $p(y^{new} | y)$.
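A sketch of that sampling step, under the same illustrative Gaussian-likelihood assumption as above (again, `u_samples` and `sigma` are placeholder names, not from the original post):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0  # assumed known likelihood scale, for illustration

# Stand-in for the actual Metropolis-Hastings posterior draws
u_samples = rng.normal(loc=2.0, scale=0.5, size=5000)

# One predictive draw per posterior draw: y_i^new ~ p_lik(y | u_i).
# Collectively, y_new_samples is a sample from p(y^new | y).
y_new_samples = rng.normal(loc=u_samples, scale=sigma)
```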
