[Math] If $X_1, \dots , X_n$ are iid Poisson($\lambda$), find the posterior distribution of $\lambda$ if the prior on $\lambda$ is gamma($\alpha, \beta$)

Tags: bayesian, poisson-distribution, probability-distributions, statistics

Let $X_1, \dots, X_n$ be iid Poisson($\lambda$), and let $\lambda$ have a gamma($\alpha, \beta$) prior distribution. Find the posterior distribution of $\lambda$.

This is how I first attempted this problem. Since $f(x|\lambda)=\dfrac{e^{-\lambda}\lambda^x}{x!}$ and $f(\lambda) = \dfrac{\lambda^{\alpha -1}e^{-\lambda / \beta}}{\Gamma(\alpha)\beta^\alpha}$, the joint density is $ f(x, \lambda) = f(x|\lambda)\cdot f(\lambda) = \dfrac{\lambda^{x+\alpha -1}e^{-\lambda (\frac{\beta +1}{\beta})}}{x!\Gamma(\alpha)\beta^\alpha } $. We are asked to find the posterior of $\lambda$, so we need $ f(\lambda | x) = \dfrac{f(x|\lambda)f(\lambda)}{f(x)} $, which means we first have to find $f(x)$.

$ f(x) = \int_0^\infty f(x,\lambda)\,d\lambda = \dfrac{\Gamma(x+\alpha)(\frac{\beta}{\beta +1})^{x+\alpha}}{x!\Gamma(\alpha)\beta^\alpha}, \qquad x=0,1,2,3,\dots $
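(The integral is evaluated with the standard gamma-function identity; writing $b = \frac{\beta}{\beta+1}$,

$$\int_0^\infty \lambda^{x+\alpha-1}e^{-\lambda/b}\,d\lambda = \Gamma(x+\alpha)\,b^{x+\alpha}.)$$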

Therefore $ f(\lambda |x) = \dfrac{e^{-\lambda (\frac{\beta+1}{\beta})}\lambda^{x+\alpha -1}}{\Gamma(x+ \alpha)(\frac{\beta}{\beta +1})^{x + \alpha}} $ which means that $\lambda |x$ has a gamma($x + \alpha, \frac{\beta}{\beta +1}$) distribution.
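As a quick numerical sanity check (not part of the original question--the values of $x$, $\alpha$, $\beta$ below are arbitrary illustrative choices), SciPy confirms that normalizing the joint density recovers exactly this gamma posterior:

```python
# Sanity check: the posterior f(lambda | x) obtained by normalizing the joint
# density should equal the Gamma(x + alpha, scale = beta/(beta+1)) pdf.
# x, alpha, beta are arbitrary illustrative values, not from the problem.
import numpy as np
from scipy import stats
from scipy.integrate import quad

x, alpha, beta = 3, 2.0, 1.5

def joint(lam):
    # f(x, lambda) = Poisson(x | lambda) * gamma prior on lambda (scale = beta)
    return stats.poisson.pmf(x, lam) * stats.gamma.pdf(lam, a=alpha, scale=beta)

f_x, _ = quad(joint, 0, np.inf)              # marginal f(x), numerically

lam = np.linspace(0.1, 10.0, 50)
posterior = joint(lam) / f_x                 # f(lambda | x) = f(x, lambda) / f(x)
claimed = stats.gamma.pdf(lam, a=x + alpha, scale=beta / (beta + 1))

print(np.allclose(posterior, claimed))       # expected: True
```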

But this isn't the correct answer given in the solution manual. The first step in the solution manual is to let $Y=\sum X_i$, which follows a Poisson($n \lambda$) distribution, and then carry out exactly the same procedure as above, except that you use $f(y|\lambda)=\dfrac{e^{-n\lambda}(n\lambda)^y}{y!}$ and $f(\lambda)$ stays the same. The answer you get by doing it this way is that $\lambda | y$ follows a gamma($y+\alpha, \frac{\beta}{n\beta +1}$) distribution.
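Writing out the kernel of the manual's calculation makes that result transparent: dropping factors that do not involve $\lambda$,

$$f(\lambda \mid y) \propto f(y \mid \lambda)\,f(\lambda) \propto e^{-n\lambda}\lambda^{y}\cdot\lambda^{\alpha-1}e^{-\lambda/\beta} = \lambda^{y+\alpha-1}e^{-\lambda\left(\frac{n\beta+1}{\beta}\right)},$$

which is the kernel of a gamma($y+\alpha, \frac{\beta}{n\beta+1}$) density.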

My question is why do we first start by letting $Y=\sum X_i$? Is this the standard procedure for finding the posterior distribution when given a random sample?

Best Answer

As already mentioned in the comments, you appear to have some fundamental misconceptions about Bayesian inference, and this is what I intend to address in my response, since you've already received enough computational guidance on your question.

If you were given a single observation $X_1 \mid \lambda \sim \operatorname{Poisson}(\lambda)$, and you wanted to use a Bayesian approach to make an inference about the posterior distribution of $\lambda$ based on this observation and your prior beliefs, then this is intuitively and obviously different from having many observations in your sample. Except in certain (unusual) circumstances, the more data you observe from some parametric model, the better your understanding of the parameter(s) that generated that data.

After all, this principle is something you should be familiar with from frequentist statistics: if you suppose that your observations are generated from a normal distribution with unknown mean $\mu$ and known standard deviation $\sigma = 1$, the sample mean is an intuitive and reasonable estimator for $\mu$; moreover, the larger your sample, the closer this estimator tends to be to the true value of $\mu$ (by the law of large numbers). The same principle applies in Bayesian inference.

With this in mind, it should become clear that the likelihood function for $\lambda$ given the sample should reflect the size of the sample in some meaningful way. Your version of the likelihood fails to do that. Now, you're not obligated to consider the entire sample--effectively, by using only a single observation, you're saying "I'm going to ignore the rest of the information available to me when calculating the posterior distribution"--but the posterior you calculate will not be as informative as one that uses all of the data. Another statistician, seeing your calculation, would not hesitate to improve upon it by using the other observations to produce a superior inference on $\lambda$.
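To make this concrete, here is a minimal sketch (the prior parameters, true $\lambda$, and sample size are hypothetical choices, not from the problem) comparing the single-observation posterior to the full-sample posterior:

```python
# Compare the posterior from one observation vs. the full sample.
# alpha, beta, true_lambda, and n are hypothetical illustrative values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, beta, true_lambda, n = 2.0, 1.5, 4.0, 50
x = rng.poisson(true_lambda, size=n)

# Posterior using only x[0]: Gamma(x_1 + alpha, scale = beta / (beta + 1))
post_one = stats.gamma(a=x[0] + alpha, scale=beta / (beta + 1))

# Posterior using y = sum(x): Gamma(y + alpha, scale = beta / (n*beta + 1))
post_all = stats.gamma(a=x.sum() + alpha, scale=beta / (n * beta + 1))

print(f"one observation: mean = {post_one.mean():.2f}, sd = {post_one.std():.2f}")
print(f"full sample    : mean = {post_all.mean():.2f}, sd = {post_all.std():.2f}")
# The full-sample posterior is far more concentrated around the true lambda.
```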

Clearly, you have some mathematical competence when working with the calculation itself, but there's a deeper conceptual gap that I would invite you to close. Applying formulas is easy; understanding what the formulas mean, and relating them to the underlying concepts, is the essence of statistical thinking.