Poisson-gamma mixture when likelihood terms are raised to powers

bayesian, gamma-distribution, poisson-distribution, probability, statistics

Consider a Poisson model with gamma prior:

$$
x \sim \text{Poisson}(\theta), \quad \theta \sim \text{gamma}(\alpha, \beta).
$$

I am interested in deriving the posterior predictive for this model when each term in the likelihood is raised to a power $\zeta_i$:

$$
\mathcal{L}(\theta) \propto p(\theta) \prod_{i=1}^t p_{\theta}(x_i)^{\zeta_i}
$$

Without these powers, it's straightforward to show that the posterior predictive is negative binomially distributed.

I'm fairly certain based on prior work (e.g. see equation 2.4 here: https://jwmi.github.io/publications/C-posterior.pdf) that this power-based likelihood will still be tractable. However, I can't push the powers through the derivation cleanly.

Here's what I have. The posterior is tractable:

$$
\begin{aligned}
p(\theta \mid X_{1:t})
&\propto \pi(\theta) \prod_{i=1}^t p_{\theta}(x_i)^{\zeta_i}
\\
&= \frac{\beta^{\alpha}}{\Gamma(\alpha)} \theta^{\alpha-1} e^{-\beta\theta} \prod_{i=1}^t \left(\frac{\theta^{x_i} e^{-\theta}}{x_i!} \right)^{\zeta_i}
\\
&\propto \theta^{\alpha-1} e^{-\beta\theta} \prod_{i=1}^t \left( \theta^{x_i} e^{-\theta} \right)^{\zeta_i}
\\
&\propto \theta^{\sum_i \zeta_i x_i + \alpha-1} e^{-(\sum_i \zeta_i + \beta) \theta}
\end{aligned}
$$

It's just gamma distributed:

$$
\pi(\theta \mid X_{1:t}) = \text{gamma}\!\left(\sum_{i=1}^t \zeta_i x_i + \alpha, \sum_{i=1}^t \zeta_i + \beta \right).
$$
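As a numerical sanity check on this conjugacy step, the sketch below compares the unnormalized log power posterior (prior kernel plus $\zeta_i$-weighted Poisson log-likelihood terms) against the gamma density with the updated hyperparameters; the two should differ only by a $\theta$-independent constant. The hyperparameters, data, and powers here are illustrative choices, not values from the question:

```python
import math

# Illustrative prior hyperparameters, data, and powers (not from the post).
alpha, beta = 2.0, 1.5
x    = [3, 0, 4, 2]
zeta = [0.7, 1.0, 0.5, 0.9]

# Posterior hyperparameters from the derivation above.
a_post = sum(z * xi for z, xi in zip(zeta, x)) + alpha
b_post = sum(zeta) + beta

def log_gamma_pdf(theta, a, b):
    """Log of the gamma(a, b) density (shape-rate parameterization)."""
    return a * math.log(b) - math.lgamma(a) + (a - 1) * math.log(theta) - b * theta

def log_power_posterior_unnorm(theta):
    """Log of prior kernel times powered Poisson likelihood terms, up to a constant."""
    lp = (alpha - 1) * math.log(theta) - beta * theta      # gamma prior kernel
    for z, xi in zip(zeta, x):
        lp += z * (xi * math.log(theta) - theta)           # zeta_i * log Poisson kernel
    return lp

# The two log expressions should differ by a theta-independent constant.
thetas = [0.5, 1.0, 2.0, 3.5]
diffs = [log_power_posterior_unnorm(t) - log_gamma_pdf(t, a_post, b_post) for t in thetas]
print(max(diffs) - min(diffs))  # ~0 up to floating point
```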

The posterior predictive is therefore

$$
\begin{aligned}
p(x_{t+1} \mid X_{1:t})
&= \int_{\Theta} p(x_{t+1} \mid \theta)^{\zeta_{t+1}} \, p(\theta \mid X_{1:t}) \, \text{d}\theta
\\
&= \int_{\Theta} \left[\frac{\theta^{x_{t+1}} e^{-\theta}}{x_{t+1}!}\right]^{\zeta_{t+1}} \left[ \frac{(\sum_i \zeta_i + \beta)^{\sum_i \zeta_i x_i + \alpha}}{\Gamma(\sum_i \zeta_i x_i + \alpha)} \theta^{\sum_i \zeta_i x_i + \alpha - 1} e^{-(\sum_i \zeta_i + \beta) \theta} \right]\text{d}\theta
\\
&= \frac{(\sum_i \zeta_i + \beta)^{\sum_i \zeta_i x_i + \alpha}}{\Gamma(\sum_i \zeta_i x_i + \alpha)(x_{t+1}!)^{\zeta_{t+1}}} \int_{\Theta} \theta^{\zeta_{t+1} x_{t+1} + \sum_i \zeta_i x_i + \alpha - 1} e^{-(\zeta_{t+1} + \sum_i \zeta_i + \beta) \theta} \text{d}\theta
\\
&= \frac{(\sum_i \zeta_i + \beta)^{\sum_i \zeta_i x_i + \alpha}}{\Gamma\left(\sum_i \zeta_i x_i + \alpha\right) (x_{t+1}!)^{\zeta_{t+1}}} \frac{\Gamma(\zeta_{t+1} x_{t+1} + \sum_i \zeta_i x_i + \alpha)}{\left(\zeta_{t+1} + \sum_i \zeta_i + \beta\right)^{\zeta_{t+1} x_{t+1} + \sum_i \zeta_i x_i + \alpha}}
\\
&= \frac{\Gamma(\zeta_{t+1} x_{t+1} + \sum_i \zeta_i x_i + \alpha)}{\Gamma\left(\sum_i \zeta_i x_i + \alpha\right) (x_{t+1}!)^{\zeta_{t+1}}} \left(\frac{\sum_i \zeta_i + \beta}{\zeta_{t+1} + \sum_i \zeta_i + \beta} \right)^{\sum_i \zeta_i x_i + \alpha} \left(\frac{1}{\zeta_{t+1} + \sum_i \zeta_i + \beta} \right)^{\zeta_{t+1} x_{t+1}}
\end{aligned}
$$

This is almost negative binomially distributed. However, I don't know what to do with $(x_{t+1}!)^{\zeta_{t+1}}$. It seems like with a nice trick, maybe a property of the gamma distribution raised to a power, this could be resolved into a negative binomial distribution whose parameters are functions of $x_{t+1}$, $\zeta_{t+1}$, and the posterior parameters.
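One concrete way to see how close this is: when $\zeta_{t+1} = 1$ the final expression above reduces exactly to a negative binomial pmf with $r = \sum_i \zeta_i x_i + \alpha$ and $p = 1/(\sum_i \zeta_i + \beta + 1)$. The sketch below checks this numerically, using made-up posterior hyperparameters (written `a_post`, `b_post`; these are illustrative, not from the post):

```python
import math

# Illustrative posterior hyperparameters (alpha', beta' in the notation below).
a_post, b_post = 5.0, 2.0

def log_pred(y, xi):
    """Log of the final posterior predictive expression above, with zeta_{t+1} = xi."""
    return (math.lgamma(xi * y + a_post) - math.lgamma(a_post)
            - xi * math.lgamma(y + 1)
            + a_post * math.log(b_post / (xi + b_post))
            - xi * y * math.log(xi + b_post))

def log_negbin_pmf(y, r, p):
    """Log negative binomial pmf: Gamma(y+r)/(Gamma(r) y!) * (1-p)^r * p^y."""
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(1 - p) + y * math.log(p))

# With xi = 1 the predictive is NB(r = a_post, p = 1 / (1 + b_post)).
p = 1.0 / (1.0 + b_post)
err = max(abs(log_pred(y, 1.0) - log_negbin_pmf(y, a_post, p)) for y in range(50))
total = sum(math.exp(log_pred(y, 1.0)) for y in range(200))
print(err, total)  # err ~ 0, total ~ 1
```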

Best Answer

I want to clean up the notation. Let the posterior gamma hyperparameters be

$$
\alpha' = \sum_{i=1}^t \zeta_i x_i + \alpha, \quad \beta' = \sum_{i=1}^t \zeta_i + \beta,
$$

and for convenience write $y = x_{t+1}$ and $\xi = \zeta_{t+1}$. Then your question boils down to evaluating

$$
\begin{aligned}
p(y \mid x_{1:t}, \xi) &\propto \int_{\theta=0}^\infty \left(\frac{e^{-\theta} \theta^y}{y!}\right)^\xi \frac{(\beta')^{\alpha'} \theta^{\alpha'-1} e^{-\beta' \theta}}{\Gamma(\alpha')} \, d\theta
\\
&= \frac{(\beta')^{\alpha'}}{(y!)^\xi \Gamma(\alpha')} \int_{\theta=0}^\infty \theta^{y\xi + \alpha' - 1} e^{-(\xi + \beta')\theta} \, d\theta
\\
&= \frac{(\beta')^{\alpha'}}{(y!)^\xi \Gamma(\alpha')} \cdot \frac{\Gamma(y\xi + \alpha')}{(\xi + \beta')^{y \xi + \alpha'}}.
\end{aligned}
$$

This is equivalent to what you obtained, just written more compactly. Note that this posterior predictive likelihood is not in general a density unless $\xi = 1$. Hence we are motivated to consider the kernel of the posterior predictive likelihood:

$$
\kappa(y \mid x_{1:t}, \xi) = \frac{\Gamma(y\xi + \alpha')}{\Gamma(y+1)^\xi (\xi + \beta')^{\xi y}}.
$$

At this point it becomes plainly obvious that there is no way for the posterior predictive to be negative binomial, whose kernel is

$$
\kappa(y) = \frac{\Gamma(y+r)}{\Gamma(y+1)} p^y,
$$

unless $\xi = 1$. A generalization could be considered for positive integer $\xi$, such that the ratio

$$
\frac{\Gamma(y\xi + \alpha')}{\Gamma(y+1)^\xi} \propto \binom{y\xi + \alpha' - 1}{y, y, \ldots, y}
$$

is proportional to a $(\xi+1)$-nomial coefficient, but to my knowledge there is no name for such a distribution. And if $\xi$ is not an integer, I should think there is no hope for a simple model; the integrating factor needed to make it a density is unlikely to have a closed form.
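The "not a density unless $\xi = 1$" point can be checked numerically: summing the posterior predictive likelihood over $y$ gives a total well away from 1 when $\xi \neq 1$. A minimal sketch with made-up values of $\alpha'$, $\beta'$ (the sum converges because the terms eventually decay geometrically, roughly like $(\xi^\xi/(\xi+\beta')^\xi)^y$):

```python
import math

# Illustrative posterior hyperparameters alpha', beta' and a power xi != 1.
a_post, b_post, xi = 5.0, 2.0, 2.0

def log_pred_unnorm(y):
    """Log of the posterior predictive 'likelihood' above (not a density for xi != 1)."""
    return (a_post * math.log(b_post) - math.lgamma(a_post)
            - xi * math.lgamma(y + 1)
            + math.lgamma(xi * y + a_post)
            - (xi * y + a_post) * math.log(xi + b_post))

total = sum(math.exp(log_pred_unnorm(y)) for y in range(300))
print(total)  # noticeably != 1, so a numerical normalizer 1/total would be needed
```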