Solved – Marginalizing a Poisson-distributed count parameter in a Binomial Distribution

distributions, marginal-distribution, probability, stan

I'm trying to implement the following model in Stan:
$$\begin{align} y \mid n, p & \sim \text{Binomial}(n,p)\\
n \mid \lambda & \sim \text{Poisson}(\lambda)
\end{align}$$

In this model, $y$ and $n$ are non-negative integers, $0<p<1$, $\lambda > 0$, and $y \le n$. Because Stan does not allow discrete parameters, I need to marginalize over $n$ (i.e., find the probability mass function of $y$ given only $p$ and $\lambda$). The PMFs of $y$ and $n$ are as follows:

$$\begin{align} \text{Pr}(y|n,p) = & \frac{n!}{y!(n-y)!}p^{y}(1-p)^{n-y}\\
\text{Pr}(n|\lambda) =& \frac{\lambda^{n}e^{-\lambda}}{n!}\end{align}$$

My attempt to marginalize over $n$ is as follows:

$$\begin{align} \text{Pr}(y|p,\lambda) & = \sum_{n=y}^{\infty}\text{Pr}(y|n,p)\,\text{Pr}(n|\lambda)\\
& = \sum_{n=y}^{\infty}\frac{\lambda^{n}e^{-\lambda}}{y!(n-y)!}p^{y}(1-p)^{n-y}\\
& = \frac{p^{y}\lambda^{y}e^{-\lambda}}{y!}\sum_{m=0}^{\infty}\frac{[\lambda(1-p)]^{m}}{m!} \quad (m = n - y)\\
& = \frac{p^{y}\lambda^{y}e^{-\lambda}e^{\lambda(1-p)}}{y!}\\
& = \frac{e^{-\lambda p}p^{y}\lambda^{y}}{\Gamma(y+1)} \end{align}$$

The final step was done by Wolfram Alpha, as my own mathematical skills are a bit rusty. Does this seem correct?
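A quick numerical sanity check is possible outside Stan; the following Python/SciPy sketch (with arbitrarily chosen example values, not from the model itself) compares a truncated version of the sum against the closed form:

```python
import math
from scipy.stats import binom, poisson

y, p, lam = 3, 0.4, 5.0  # arbitrary example values

# Truncated marginal sum: sum_{n=y}^{N} Binomial(y | n, p) * Poisson(n | lam);
# the Poisson tail beyond N = 200 is negligible for lam = 5.
marginal = sum(binom.pmf(y, n, p) * poisson.pmf(n, lam) for n in range(y, 200))

# Closed form from the last line of the derivation
closed = math.exp(-lam * p) * (p * lam) ** y / math.factorial(y)

assert abs(marginal - closed) < 1e-12
```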

To implement this, I first took the log PMF:

$$\log\text{Pr}(y|p,\lambda) = -\lambda p + y(\log p + \log\lambda) - \log\Gamma(y+1)$$

The Stan code for the function is:

  real binomial_poisson_lpmf(int[] y, vector p, vector lambda) {
    // Vectorized; assumes y, p, and lambda all have the same length.
    vector[size(y)] out;
    for (i in 1:size(y))
      out[i] = -lambda[i] * p[i] + y[i] * (log(p[i]) + log(lambda[i])) - lgamma(y[i] + 1);
    return sum(out);
  }
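To unit-test the expression outside Stan, the loop body can be mirrored in Python (a sketch with made-up check values) and compared against the marginal sum computed directly:

```python
import math
from scipy.stats import binom, poisson

def lpmf(y, p, lam):
    # Same expression as the loop body in the Stan function
    return -lam * p + y * (math.log(p) + math.log(lam)) - math.lgamma(y + 1)

# Made-up check values, one (y, p, lambda) triple per "observation"
for y, p, lam in [(0, 0.2, 1.0), (2, 0.5, 3.0), (5, 0.8, 7.0)]:
    # Log of the marginal sum over n, truncated far into the Poisson tail
    log_marg = math.log(sum(binom.pmf(y, n, p) * poisson.pmf(n, lam)
                            for n in range(y, 300)))
    assert abs(lpmf(y, p, lam) - log_marg) < 1e-9
```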

Anyway, I would appreciate any feedback on whether or not this seems valid; I would also like to know if anyone knows of any citations on the subject.

Thank you for your time.

Best Answer

$y \mid p,\lambda$ is Poisson!

Your marginalization, or at least the end result, is correct. The form you have obtained for the distribution is the probability mass function of a Poisson distribution -- just write $p^y\lambda^y$ as $(\lambda\,p)^y$ and behold. That is,

\begin{equation} y \mid p, \lambda \sim \mathrm{Poisson}(\lambda\,p). \end{equation}

You have essentially rediscovered the fact that a randomly thinned Poisson process (one in which every point is kept with probability $p$, independently of the others) is again Poisson. This is a well-known result; a quick Google search turned up [1], but I suppose it appears in many textbooks. Your situation is analogous, since the Poisson-distributed random variable $n$ can be interpreted as the number of arrivals of a Poisson process with intensity $\lambda$ in a unit interval. Conditional on $n$, the thinning operation keeps each of the $n$ points independently with probability $p$, so the number of retained points is binomial.

So, the marginalization could have been 'derived' without any algebraic manipulations by knowing the thinning-of-a-Poisson-process result and realizing how it applies here.
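The thinning story is also easy to check by simulation; this NumPy sketch (example values are my own) draws from the two-stage model and compares the moments of $y$ with those of a $\mathrm{Poisson}(\lambda p)$ variable, whose mean and variance both equal $\lambda p$:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, p, N = 5.0, 0.4, 200_000  # arbitrary example values

# Simulate the hierarchy: n ~ Poisson(lam), then y | n ~ Binomial(n, p)
n = rng.poisson(lam, size=N)
y = rng.binomial(n, p)

# A Poisson(lam * p) variable has mean and variance both equal to lam * p = 2.0
assert abs(y.mean() - lam * p) < 0.05
assert abs(y.var() - lam * p) < 0.05
```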

Note about Stan implementation

Note that this also means you do not have to write your own function in Stan, since the likelihood is simply

  y ~ poisson(lambda .* p);

Negative-binomial case

This extends to the negative-binomial case (mentioned in the comments), too, since a negative binomial can be represented as a mixture of Poissons whose rate parameter has a gamma distribution. Conditional on the gamma-distributed parameter, $y$ is again Poisson; and if $\lambda$ is gamma-distributed, then so is $p\,\lambda$ (for fixed $p$), so marginalizing over the gamma-distributed parameter yields a negative binomial for $y$.

The negative binomial has multiple parametrizations -- depending on the parametrization, one has to work out how multiplying the Poisson rate by $p$ translates into the parameters of the negative binomial. Left as an exercise for the reader.
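For one common choice (an illustration; the answer does not spell out a parametrization): if $\lambda \sim \mathrm{Gamma}(\alpha, \beta)$ with rate $\beta$, then $p\,\lambda \sim \mathrm{Gamma}(\alpha, \beta/p)$, and the Poisson-gamma mixture gives $y \sim \mathrm{NegBin}$ with size $\alpha$ and success probability $\beta/(\beta+p)$. A SciPy check by direct numerical integration (parameter values chosen arbitrarily):

```python
import numpy as np
from scipy import integrate
from scipy.stats import gamma, poisson, nbinom

alpha, beta, p = 2.5, 1.5, 0.4  # arbitrary example values

for y in range(10):
    # Marginal: integrate Poisson(y | p * lam) against the Gamma(alpha, rate=beta) density
    mix, _ = integrate.quad(
        lambda lam: poisson.pmf(y, p * lam) * gamma.pdf(lam, alpha, scale=1.0 / beta),
        0, np.inf)
    # Negative binomial with size alpha and success probability beta / (beta + p)
    assert abs(mix - nbinom.pmf(y, alpha, beta / (beta + p))) < 1e-7
```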

Reference

[1] Random (formerly Virtual Laboratories in Probability and Statistics), Section 13.5: http://www.math.uah.edu/stat/poisson/Splitting.html
