Solved – Posterior Predictive Distribution as Expectation of Likelihood

bayesian · conditional-probability · posterior · probability

Say we have a posterior predictive density:

$$p(\tilde{y}|\mathbf{y}) = \int p(\tilde{y}|\theta)p(\theta|\mathbf{y})d\theta$$

In Hoff's *A First Course in Bayesian Statistical Methods*, he suggests approximating $p(\tilde{y}|\mathbf{y})$ by sampling $\theta^{(1)},\dots,\theta^{(S)}$ from the posterior distribution and computing $\frac{1}{S}\sum_{s=1}^S p(\tilde{y}|\theta^{(s)})$.

He justifies this by stating that $p(\tilde{y} | \mathbf{y})$ is the posterior expectation of $p(\tilde{y}|\theta)$, but I can't see the equivalence. How does one get from $p(\tilde{y}|\theta)$ to $p(\tilde{y} | \mathbf{y})$?
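(For concreteness, here is a minimal sketch of the averaging step in Python. It assumes a Normal likelihood with known variance and posterior draws already in hand; the model, names, and numbers are illustrative, not from Hoff.)

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical posterior draws theta^(1), ..., theta^(S);
# in practice these would come from MCMC or an exact posterior.
theta_samples = rng.normal(loc=1.0, scale=0.3, size=5000)

def predictive_density(y_tilde, theta_samples, sigma=1.0):
    """Monte Carlo estimate of p(y_tilde | y) = (1/S) * sum_s p(y_tilde | theta_s),
    here with a Normal(theta, sigma^2) likelihood."""
    return norm.pdf(y_tilde, loc=theta_samples, scale=sigma).mean()

print(predictive_density(0.5, theta_samples))
```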

Best Answer

$\newcommand{\y}{\mathbf y}$We have $$ E_{\theta|\y}\left[f(\theta)\right] = \int f(\theta) p(\theta | \y)\,\text d\theta $$ just by the definition of expectation (and you could cite the LOTUS as well), and since $p(\theta|\y)$ is the posterior density, this is the posterior expectation of $f(\theta)$. Now choose $$ f(\theta) = p(\tilde y | \theta) $$ and then $$ E_{\theta|\y}\left[p(\tilde y | \theta)\right] = \int p(\tilde y | \theta) p(\theta | \y)\,\text d\theta. $$

I'm not sure whether you're also wondering about the justification of this integral in the first place, but typically the data are assumed independent given the generating parameters, so for a new point $\tilde y$ you have $\tilde y \perp \y \mid \theta$, which means $$ \begin{aligned} p(\tilde y | \y) &= \int p(\tilde y , \theta | \y)\,\text d\theta \\ &= \int p(\tilde y | \theta , \y) p(\theta | \y)\,\text d\theta \\ &= \int p(\tilde y | \theta) p(\theta | \y)\,\text d\theta \\ &= E_{\theta|\y}\left[p(\tilde y | \theta)\right] \end{aligned} $$

so you can use the law of large numbers with posterior samples to produce an estimator of this density.
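To see the law of large numbers at work, here is a hedged sketch in a conjugate Beta–Bernoulli model, where the exact posterior predictive is available in closed form to compare against (the model and numbers are illustrative, not from the question):

```python
import numpy as np

rng = np.random.default_rng(1)

# Beta(a, b) prior on theta; observe k successes in n Bernoulli trials.
a, b, n, k = 2.0, 2.0, 10, 7
S = 100_000

# Posterior is Beta(a + k, b + n - k) by conjugacy.
theta = rng.beta(a + k, b + n - k, size=S)

# Monte Carlo estimate of p(y_tilde = 1 | y) = E[theta | y],
# since (1/S) * sum_s p(y_tilde = 1 | theta_s) = (1/S) * sum_s theta_s.
mc_estimate = theta.mean()

# Exact posterior predictive: (a + k) / (a + b + n).
exact = (a + k) / (a + b + n)

print(mc_estimate, exact)  # the two agree up to Monte Carlo error
```

As $S$ grows, the Monte Carlo average converges to the exact predictive probability, which is precisely the LLN argument above.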
