Assume you have $x_i \sim \operatorname{Bernoulli}(p_i)$ with $p_i \sim \operatorname{Beta}(\alpha,\beta)$.
I am exploring $Z=X_1+ \dots +X_n$
According to this page, it is
$Z \sim \operatorname{BetaBinomial}(n,\alpha,\beta)$
but according to this other page, supported by simulation, it is $Z \sim \operatorname{Binomial}(n,\frac{\alpha}{\alpha+\beta})$
Two different distributions. Which one is correct?
Equally important: what is the difference in assumptions behind each, so that I know when to use each one and can simulate the right one in the right context? Thanks!
Best Answer
Short summary: if the $p_i$s are independent, it's the binomial. If the $p_i$s are all equal, it's the beta-binomial.
By $X_i \sim \textrm{Bernoulli}(p_i)$, you must mean the conditional distribution $X_i\mid p_i \sim \textrm{Bernoulli}(p_i)$. The marginal distribution of $X_i$ (that is, the distribution obtained by averaging over the different values of $p_i$) is obtained, e.g., by noting that $X_i$ is Bernoulli and computing its expectation with the tower law: \begin{equation} \mathbb{E}(X_i) = \mathbb{E}(\mathbb{E}(X_i \mid p_i)) = \mathbb{E}(p_i) = \frac{\alpha}{\alpha+\beta}. \end{equation} So, $X_i \sim \textrm{Bernoulli}(\frac{\alpha}{\alpha+\beta})$ for all $i$. However, this does not yet determine the distribution of $Z$, as you have not specified enough information to deduce the joint distribution of the $X_i$s. Two additional things are needed:

1. the joint conditional distribution of the $X_i$s given the $p_i$s, and
2. the joint distribution of the $p_i$s.
For the first part, I assume you intend the $X_i$s to be conditionally independent given the $p_i$s. The difference between the two distributions mentioned in your question stems from the second point.
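As a sanity check on the tower-law computation, here is a quick Monte Carlo sketch (parameter values $\alpha=2$, $\beta=5$ are my own choice for illustration) showing that the marginal mean of a single $X_i$ is close to $\alpha/(\alpha+\beta)$:

```python
import random

random.seed(0)
alpha, beta_ = 2.0, 5.0   # assumed illustrative values
N = 200_000

# Draw p_i ~ Beta(alpha, beta), then X_i | p_i ~ Bernoulli(p_i).
xs = []
for _ in range(N):
    p = random.betavariate(alpha, beta_)
    xs.append(1 if random.random() < p else 0)

print(sum(xs) / N)  # should be close to alpha/(alpha+beta) = 2/7
```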
If we assume the $p_i$s to be mutually independent, then the $X_i$s will be mutually independent too, as each $X_i$ depends on only one $p_i$ and the $X_i$s are conditionally independent given the $p_i$s. Then $Z$ is just a sum of i.i.d. Bernoulli random variables. But this is the definition of the binomial distribution, and thus indeed \begin{equation} Z \sim \textrm{Binomial}\left(n,\frac{\alpha}{\alpha+\beta}\right). \end{equation} Note that in this case it did not make much sense to define the $p_i$s in the first place: if each $p_i$ influences only the corresponding $X_i$, nothing is learned about the distribution of $p_i$ except via the value of $X_i$. Thus, the $p_i$s are superfluous in the sense that exactly the same model would have been easier to specify by simply stating that the $X_i$s are independent Bernoulli random variables with parameter $\alpha/(\alpha+\beta)$.
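The independent-$p_i$ case can be simulated directly (illustrative values $\alpha=2$, $\beta=5$, $n=10$ are my own): each trial draws a fresh $p_i$, and the resulting mean and variance of $Z$ match those of $\textrm{Binomial}(n, \alpha/(\alpha+\beta))$:

```python
import random

random.seed(1)
alpha, beta_ = 2.0, 5.0   # assumed illustrative values
n, reps = 10, 100_000
p_bar = alpha / (alpha + beta_)

# Independent p_i per Bernoulli draw: every X_i gets its own fresh p_i.
zs = []
for _ in range(reps):
    z = sum(1 if random.random() < random.betavariate(alpha, beta_) else 0
            for _ in range(n))
    zs.append(z)

mean = sum(zs) / reps
var = sum((z - mean) ** 2 for z in zs) / reps
print(mean)  # close to n * p_bar = 10 * 2/7
print(var)   # close to the binomial variance n * p_bar * (1 - p_bar)
```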
The beta-binomial distribution is defined with a single $p$ drawn from the beta distribution, after which all the $X_i$s are Bernoulli with this same $p$. This is obtained as a special case of your setup by specifying, in step 2, that the $p_i$s are all equal with probability 1.
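The shared-$p$ case differs from the previous one only in where the beta draw happens (same illustrative values $\alpha=2$, $\beta=5$, $n=10$). The mean of $Z$ is unchanged, but the variance is inflated to the beta-binomial variance $n\bar p(1-\bar p)\frac{\alpha+\beta+n}{\alpha+\beta+1}$, because the $X_i$s are now positively correlated through the common $p$:

```python
import random

random.seed(2)
alpha, beta_ = 2.0, 5.0   # assumed illustrative values
n, reps = 10, 100_000
p_bar = alpha / (alpha + beta_)

# One shared p per replicate: all n Bernoulli draws use the same p.
zs = []
for _ in range(reps):
    p = random.betavariate(alpha, beta_)
    zs.append(sum(1 if random.random() < p else 0 for _ in range(n)))

mean = sum(zs) / reps
var = sum((z - mean) ** 2 for z in zs) / reps
print(mean)  # same mean as the binomial case: n * p_bar
print(var)   # larger: n*p_bar*(1-p_bar)*(alpha+beta+n)/(alpha+beta+1)
```

Comparing the two printed variances against the binomial run above is the quickest way to tell, from a simulation, which model you are actually in.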
By defining other dependence structures for the $p_i$s (not independent, but not constrained to be equal either), other distributions for $Z$ would be obtained, but I don't know whether any of these have special names.