Solved – Zero-inflated Poisson and Gibbs sampling, proofs and sampling

bernoulli-distributiongamma distributiongibbspoisson distributionr

I am trying to figure out zip-inflated Poisson (ZIP) model. In this model, random data $X_1, .., X_n$ are of the form $X_i=R_iY_i$, where the $Y_i$'s have Poisson distribution ($\lambda$) and the $R_i$'s have Bernoulli distribution ($p$), and all independent from each other. If given an outcome $x = (x_1, .., x_n)$, the objective is to estimate both $\lambda$ and $p$.

We can use a hierarchical Bayes model:

$p$ ~ Uniform(0,1) (prior for $p$),
$(\lambda|p)$ ~ Gamma(a,b) (prior for $\lambda$),
$(r_i|p, \lambda )$ ~ Bernoulli($p$) independently (from the model above),
$(x_i|r, \lambda, p )$ ~ Poisson($\lambda r_i$) independently (from the model above)

Since $a$ and $b$ are known parameters, and $r = (r_1,…,r_n)$, it follows that

$ f(x,r, \lambda, p) = \frac{b^\alpha \lambda^{\alpha-1} e^{-b \lambda}}{\Gamma(\alpha)} \prod_{i=1}^n\frac{e^{-\lambda r_i} (\lambda r_i)^{x_i}}{x_i!} p^{r_i}(1-p)^{1-r_i}$

Eventually, I want to use Gibbs sampling to sample from the posterior pdf $f(\lambda,p,r|x)$.

My first question is: How could I generate a random sample (say n=100) for the ZIP model using $p=0.3$ and $\lambda = 2$. I would like to use R statistical language. I need help with at least the pseudocode on how to accomplish this.

My second question is:

How can we really know (and show):

1) $\lambda|p,r,x\sim Gamma(a+ \sum_{i}x_i, b+ \sum_{i}r_i)$
2) $p|\lambda,r,x\sim Beta(1+ \sum_{i}r_i, n+1 – \sum_{i}r_i)$
3) $r_i|\lambda,p,x \sim Bernoulli(\frac{pe^{- \lambda}}{pe^{- \lambda}+(1-p)I{\{x_i=0}\}})$

Best Answer

For the first question, given that you have a hierarchical model, you simply have to start from the higher level and proceed downwards:

  1. $(r_i|p, \lambda )\sim\mathcal{B}(p)$ for $i=1,\ldots,n$
  2. $(x_i|r, \lambda, p)\sim\mathcal{P}(\lambda r_i)$ for $i=1,\ldots,n$

Note that only the $x_i$'s for which $r_i=1$ need to be simulated. As an R code, this can be written as

n=10^2
p=0.3
lam=2
r=as.integer(runif(n)<p)
x=rep(0,n)
x[r==1]=rpois(sum(r==1),lam)

For the second question, the full conditional distributions can be extracted from the joint density$$f(x,r, \lambda, p) = \frac{b^\alpha \lambda^{\alpha-1} e^{-b \lambda}}{\Gamma(\alpha)} \prod_{i=1}^n\frac{e^{-\lambda r_i} (\lambda r_i)^{x_i}}{x_i!} p^{r_i}(1-p)^{1-r_i}$$since the full conditional densities all are proportional to this joint density as functions of $\lambda$, $p$, or the $r_i$'s.

For instance, $$\pi(\lambda|p,\mathbf{r},\mathbf{x})\propto\lambda^{\alpha-1} e^{-b \lambda}\prod_{i=1}^n e^{-\lambda r_i} (\lambda)^{x_i}\propto\lambda^{\alpha-1} e^{-b \lambda} e^{-\lambda \sum_i r_i} \lambda^{\sum_i x_i}$$ where I only kept the terms that depend on $\lambda$. Hence $$\pi(\lambda|p,\mathbf{r},\mathbf{x})\propto\lambda^{\alpha-1+\sum_i x_i} e^{-\left[b+\sum_i r_i\right] \lambda}$$ which is proportional to the $$Gamma\left(a+ \sum_{i}x_i, b+ \sum_{i}r_i\right)$$density on $\lambda$.

Similarly$$\pi(p|\lambda,\mathbf{r},\mathbf{x}) \propto \prod_{i=1}^n p^{r_i}(1-p)^{1-r_i}=p^{\sum_i r_i}(1-p)^{n-\sum_i r_i}$$ leading to $$Beta\left(1+ \sum_{i}r_i, n+1 - \sum_{i}r_i\right)$$as the full conditional posterior on $p$.

Related Question