Deriving the expression for the posterior predictive distribution.

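We have $Y \mid \Theta=\theta \sim \operatorname{Po}(\theta)$ and $\Theta \sim \operatorname{Gamma}\left(\alpha_0, \lambda_0\right)$, so that the posterior is $\Theta \mid \boldsymbol{X}=\boldsymbol{x} \sim \operatorname{Gamma}\left(\alpha_n, \lambda_n\right)$. The expression for the posterior predictive distribution is derived as follows: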

$\begin{aligned} f_{Y \mid \boldsymbol{X}}(y \mid \boldsymbol{x}) & =\int_0^{+\infty} f_{Y \mid \Theta}(y \mid \theta) f_{\Theta \mid \boldsymbol{X}}(\theta \mid \boldsymbol{x}) d \theta \\ & =\int_0^{+\infty} \frac{\theta^y}{y !} e^{-\theta} \cdot \frac{\lambda_n^{\alpha_n}}{\Gamma\left(\alpha_n\right)} \theta^{\alpha_n-1} e^{-\lambda_n \theta} d \theta \\ & =\frac{\lambda_n^{\alpha_n}}{\Gamma\left(\alpha_n\right)} \frac{1}{y !} \int_0^{+\infty} \theta^{\alpha_n+y-1} e^{-\left(\lambda_n+1\right) \theta} d \theta \\ & \propto \frac{1}{y !} \int_0^{+\infty} \theta^{\alpha_n+y-1} e^{-\left(\lambda_n+1\right) \theta} d \theta \\ & =\frac{1}{y !} \frac{\Gamma\left(\alpha_n+y\right)}{\left(\lambda_n+1\right)^{\alpha_n+y}} \\ & \propto \frac{\Gamma\left(\alpha_n+y\right)}{y !}\left(\frac{1}{\lambda_n+1}\right)^y\end{aligned}$

$\begin{aligned} f_{Y \mid \boldsymbol{X}}(y \mid \boldsymbol{x}) & \propto \frac{\left(\alpha_n+y-1\right) !}{y !}\left(\frac{1}{\lambda_n+1}\right)^{\alpha_n+y} \\ & \propto \frac{\left(\alpha_n+y-1\right) !}{\left(\alpha_n-1\right) ! y !}\left(\frac{\lambda_n}{\lambda_n+1}\right)^{\alpha_n}\left(\frac{1}{\lambda_n+1}\right)^y .\end{aligned}$

I understand the first couple of steps, where we use the fact that the posterior density of $\Theta \mid \boldsymbol{X}=\boldsymbol{x} \sim \operatorname{Gamma}\left(\alpha_n, \lambda_n\right)$ integrates to $1$, so we only have to determine the integral up to a constant. The confusing step is where $\frac{1}{y !} \frac{\Gamma\left(\alpha_n+y\right)}{\left(\lambda_n+1\right)^{\alpha_n+y}}$ becomes $\frac{\left(\alpha_n+y-1\right) !}{\left(\alpha_n-1\right) ! y !}\left(\frac{\lambda_n}{\lambda_n+1}\right)^{\alpha_n}\left(\frac{1}{\lambda_n+1}\right)^y$. Isn't it sufficient to have $\frac{1}{y !} \frac{\Gamma\left(\alpha_n+y\right)}{\left(\lambda_n+1\right)^{\alpha_n+y}}$? Why is the latter form preferred over the former?

Best Answer

The computation can be done directly, keeping every constant. Writing $\alpha=\alpha_n$ and $\lambda=\lambda_n$:

\begin{align} \int_0^\infty \frac{\theta^y}{y!}e^{-\theta} \frac{\lambda^\alpha}{\Gamma(\alpha)}\theta^{\alpha-1}e^{-\lambda\theta}\ \mathsf d\theta &= \frac{\lambda^\alpha}{\Gamma(\alpha)y!}\int_0^\infty \theta^{\alpha+y-1}e^{-(1+\lambda)\theta}\ \mathsf d\theta\\ &= \frac{\lambda^\alpha \Gamma(y+\alpha)}{(1+\lambda)^{y+\alpha}\Gamma(\alpha)y!}\\ &= \frac{\Gamma(y+\alpha)}{y!\Gamma(\alpha)}\left(\frac{\lambda}{1+\lambda}\right)^\alpha\left(\frac{1}{1+\lambda}\right)^y, \end{align} and hence the posterior predictive distribution is negative binomial with parameters $r=\alpha$ and $p=\frac\lambda{1+\lambda}$.
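As a sanity check on the closed form, the Poisson-Gamma mixture integral can be approximated numerically and compared with the negative binomial pmf with $r=\alpha$ and $p=\frac{\lambda}{1+\lambda}$. The following is only a minimal sketch using the Python standard library; the function names, the example values $\alpha=3.5$, $\lambda=2$, and the integration cutoff and step count are illustrative assumptions, not part of the original derivation.

```python
import math


def neg_binomial_pmf(y, alpha, lam):
    """Closed-form predictive pmf: negative binomial with r = alpha, p = lam / (1 + lam)."""
    log_pmf = (
        math.lgamma(y + alpha) - math.lgamma(alpha) - math.lgamma(y + 1)
        + alpha * math.log(lam / (1.0 + lam))
        - y * math.log(1.0 + lam)
    )
    return math.exp(log_pmf)


def mixture_pmf(y, alpha, lam, upper=60.0, steps=20000):
    """Approximate the integral of Po(y | theta) * Gamma(theta | alpha, lam) over theta.

    Uses a plain midpoint rule on (0, upper); `upper` and `steps` are
    assumed values chosen to be adequate for the example parameters below.
    """
    h = upper / steps
    # Constants pulled out of the integral: lam^alpha / (Gamma(alpha) * y!)
    log_const = alpha * math.log(lam) - math.lgamma(alpha) - math.lgamma(y + 1)
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        # Integrand kernel: theta^(alpha + y - 1) * exp(-(1 + lam) * theta)
        total += math.exp(
            log_const + (alpha + y - 1.0) * math.log(theta) - (1.0 + lam) * theta
        )
    return total * h


if __name__ == "__main__":
    alpha, lam = 3.5, 2.0  # hypothetical posterior parameters alpha_n, lambda_n
    for y in range(6):
        print(y, neg_binomial_pmf(y, alpha, lam), mixture_pmf(y, alpha, lam))
    # The closed form is a proper pmf: it sums to 1 over y = 0, 1, 2, ...
    print(sum(neg_binomial_pmf(y, alpha, lam) for y in range(200)))
```

The two columns should agree up to the integration error. The final sum illustrates why the fully normalized form is convenient: once the factors $\frac{1}{\Gamma(\alpha)}\left(\frac{\lambda}{1+\lambda}\right)^{\alpha}$ are restored, the expression is recognizable as a standard pmf, whereas the proportional form alone does not sum to $1$.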
