[Math] Calculating variance of marginal distribution

probability, probability-distributions

Suppose you have a random variable $Y \sim Po(\mu)$ where $\mu$ is given by $\mu \sim Ga(\alpha, \eta)$. Correct me if I'm wrong, but I think it is clear that the Poisson distribution is the conditional distribution of $Y\mid\mu$ and the Gamma distribution is the conditional distribution of $\mu\mid\alpha, \eta$.

One way to calculate the variance of the marginal $Y$ is using the law of total variance:

$$Var(Y) = E(Var(Y\mid\mu)) + Var(E(Y\mid\mu)) = \xi + \xi^2/\alpha$$

where $\xi = \alpha / \eta$.
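As a quick sanity check of the law-of-total-variance result, one can simulate the mixture. This sketch assumes the Gamma is parametrized by rate $\eta$ (NumPy's `Generator.gamma` takes a scale, so `1/eta` is passed):

```python
# Monte Carlo check of Var(Y) = xi + xi^2/alpha for the Poisson-Gamma
# mixture, assuming the Gamma is parametrized by RATE eta.
# NumPy's rng.gamma takes a SCALE parameter, so we pass 1/eta.
import numpy as np

alpha, eta = 0.2, 0.9
xi = alpha / eta

rng = np.random.default_rng(0)
mu = rng.gamma(shape=alpha, scale=1.0 / eta, size=1_000_000)  # mu ~ Ga(alpha, rate=eta)
y = rng.poisson(mu)                                           # Y | mu ~ Po(mu)

print(xi + xi**2 / alpha)  # law of total variance: 0.469135...
print(y.var())             # simulated variance, should agree closely
```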

However, I was wondering if I could obtain the same result in the traditional way. I tried to obtain the marginal of $Y$ by integrating $\mu$ out of the joint distribution:

$$p(y|\alpha,\eta) = \int p(y, \mu\mid\alpha, \eta)\,d\mu = \int p(y\mid\mu)p(\mu\mid\alpha, \eta)\,d\mu$$

Calculating this integral (the product of the $Po(\mu)$ pmf and the $Ga(\alpha, \eta)$ density, integrated from 0 to infinity) in Mathematica, I get this expression:

$$p(y\mid\alpha,\eta) = \frac{\eta^{-\alpha}(\frac{1}{\eta}+1)^{-\alpha-y}\Gamma(y+\alpha)}{y!\Gamma(\alpha)}$$
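This closed form can be cross-checked against direct numerical integration. The sketch below uses SciPy's gamma, which takes a scale parameter; `eta` is passed as the scale here, matching the convention under which the expression above was produced:

```python
# Sanity check: compare the closed-form marginal pmf above with direct
# numerical integration of the Po(mu) pmf times the Ga(alpha, eta) density.
# SciPy's gamma is parametrized by SCALE, so eta enters as scale=eta here.
import math
from scipy import integrate
from scipy.stats import poisson, gamma as gamma_dist

alpha, eta = 0.2, 0.9

def closed_form(y):
    return (eta**-alpha * (1 / eta + 1)**(-alpha - y)
            * math.gamma(y + alpha) / (math.factorial(y) * math.gamma(alpha)))

def by_integration(y):
    f = lambda m: poisson.pmf(y, m) * gamma_dist.pdf(m, a=alpha, scale=eta)
    val, _ = integrate.quad(f, 0, math.inf)
    return val

# y = 0 has an integrable singularity at m = 0, so start at y = 1
for y in range(1, 6):
    assert abs(closed_form(y) - by_integration(y)) < 1e-8
```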

Then I calculated the variance of $Y$:

$$Var(Y) = \sum_{y=0}^{\infty} (y-E(Y))^2\, p(y\mid \alpha,\eta)$$

where $E(Y)$ is the expected value of the marginal $Y$. Substituting the marginal pmf:

$$\sum_{y=0}^{\infty}(y-E(Y))^{2}\frac{\eta^{-\alpha}(\frac{1}{\eta}+1)^{-\alpha-y}\Gamma(y+\alpha)}{y!\Gamma(\alpha)}$$

Unfortunately, the result is very complicated, and I don't get the same numerical result for given values of $\alpha$ and $\eta$. For example, $\alpha=0.2$, $\eta=0.9$ gives $0.469136$ using the law of total variance, but the equation above gives $0.342$.

Did I miss something?

Best Answer

There are two commonly used parametrizations for the exponential distribution. One is by scale: $$f_X(x) = \frac{1}{\theta} e^{-x/\theta}, \quad x > 0,$$ with expected value $$\operatorname{E}[X] = \theta.$$ The other is by rate: $$f_X(x) = \lambda e^{-\lambda x}, \quad x > 0,$$ with expected value $$\operatorname{E}[X] = 1/\lambda.$$

An analogous pair exists for the gamma distribution, which generalizes the exponential distribution. The scale parametrization is $$f_Y(y) = \frac{y^{\alpha-1} e^{-y/\theta}}{\Gamma(\alpha)\theta^\alpha}, \quad y > 0,$$ with expected value $$\operatorname{E}[Y] = \alpha \theta.$$ The rate parametrization is $$f_Y(y) = \frac{\lambda^\alpha y^{\alpha-1} e^{-\lambda y}}{\Gamma(\alpha)}, \quad y > 0,$$ with expected value $$\operatorname{E}[Y] = \alpha/\lambda.$$

By default, Mathematica uses the rate parametrization for the exponential distribution but the scale parametrization for the gamma distribution, which is a source of confusion.
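The same split in conventions appears in SciPy, which (like Mathematica's `GammaDistribution`) uses the scale convention for the gamma; the rate convention is recovered by passing the reciprocal as the scale. A minimal sketch:

```python
# Scale vs. rate conventions in SciPy: expon and gamma both take a SCALE
# parameter, so a RATE lam is expressed by passing scale = 1/lam.
from scipy.stats import expon, gamma

alpha, theta, lam = 0.2, 0.9, 0.9

assert abs(expon(scale=theta).mean() - theta) < 1e-12            # E[X] = theta (scale)
assert abs(expon(scale=1 / lam).mean() - 1 / lam) < 1e-12        # E[X] = 1/lambda (rate)
assert abs(gamma(alpha, scale=theta).mean() - alpha * theta) < 1e-12   # E[Y] = alpha*theta
assert abs(gamma(alpha, scale=1 / lam).mean() - alpha / lam) < 1e-12   # E[Y] = alpha/lambda
```

Passing $\eta = 0.9$ as a scale gives mean $\alpha\eta = 0.18$, while treating it as a rate gives $\alpha/\eta \approx 0.222$, which is exactly the mismatch behind the question's two variance values.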

If $\mu \sim \operatorname{Gamma}(\alpha, \eta)$ is parametrized by rate, and $Y \mid \mu \sim \operatorname{Poisson}(\mu)$, then the marginal distribution of $Y$ is $$\begin{align*} f_Y(y) &= \int_{m=0}^\infty \Pr[Y = y \mid \mu = m] f_{\mu}(m) \, dm \\ &= \frac{\eta^\alpha}{y! \, \Gamma(\alpha)} \int_{m=0}^\infty e^{-m} m^y m^{\alpha-1} e^{-\eta m} \, dm \\ &= \frac{\eta^\alpha}{y! \, \Gamma(\alpha)} \cdot \frac{\Gamma(y+\alpha)}{(\eta + 1)^{y+\alpha}} \int_{m=0}^\infty \frac{(\eta+1)^{y+\alpha} m^{y + \alpha-1} e^{-(\eta+1)m}}{\Gamma(y+\alpha)} \, dm \\ &= \binom{y+\alpha-1}{y} \Bigl(\frac{\eta}{\eta+1}\Bigr)^\alpha \Bigl(\frac{1}{\eta+1}\Bigr)^y, \quad y = 0, 1, 2, \ldots. \end{align*}$$

This is a negative binomial distribution with parameters $r = \alpha$ and $p = 1/(\eta+1)$; it has mean $$\operatorname{E}[Y] = \frac{pr}{1-p} = \frac{\alpha}{\eta}$$ and variance $$\operatorname{Var}[Y] = \frac{pr}{(1-p)^2} = \frac{(\eta+1)\alpha}{\eta^2},$$ in agreement with $\xi + \xi^2/\alpha = \alpha/\eta + \alpha/\eta^2$ from the law of total variance. Of course, it is much more straightforward to compute the marginal mean and variance via the law of total expectation and the law of total variance, as described in your question, but it is instructive to see that the marginal distribution is negative binomial.

In the event that $\mu$ is parametrized by scale rather than rate, i.e. $\eta = 1/\theta$, the marginal distribution remains negative binomial but with $p = \theta/(\theta+1)$; that is, the negative binomial "success" and "failure" probabilities are reversed, which should make intuitive sense.
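The negative binomial identification can be verified numerically. A sketch, noting that SciPy's `nbinom(n, p)` attaches `p` to the $n$ successes, so its `p` corresponds to $\eta/(\eta+1)$ rather than the $p = 1/(\eta+1)$ used above:

```python
# Numerical check: the Poisson-Gamma(rate eta) mixture pmf equals a
# negative binomial pmf, with marginal mean alpha/eta and variance
# alpha*(eta+1)/eta**2.  SciPy's nbinom(n, p) uses p as the "success"
# probability, so p_scipy = eta/(eta+1) here.
import math
from scipy.stats import nbinom

alpha, eta = 0.2, 0.9

def mixture_pmf(y):
    # closed-form marginal under the rate parametrization
    return (math.gamma(y + alpha) / (math.factorial(y) * math.gamma(alpha))
            * (eta / (eta + 1))**alpha * (1 / (eta + 1))**y)

nb = nbinom(alpha, eta / (eta + 1))

for y in range(10):
    assert abs(mixture_pmf(y) - nb.pmf(y)) < 1e-12

assert abs(nb.mean() - alpha / eta) < 1e-12
assert abs(nb.var() - alpha * (eta + 1) / eta**2) < 1e-12
```

With $\alpha = 0.2$ and $\eta = 0.9$, `nb.var()` reproduces the questioner's law-of-total-variance value $0.469136$.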
