Beta-Binomial Distribution – Marginal Distribution in Beta-Binomial Model with Overdispersion Parameters

Tags: bayesian, beta-binomial distribution, conditional probability, probability, regression

Problem Setting:
We assume there is a sequence of binomial trials of size $N_i$, where $Y_i$ is the number of events of interest, $x_i$ is the predictor associated with trial $i$, and $\pi_i$ is the proportion of events of interest ($i = 1, 2, \dots, n$). When the data do not have binomial variance, this situation can be fitted by a Beta-Binomial regression model with an overdispersion parameter.

A Beta-Binomial regression model with an overdispersion parameter has the framework:
$$
P_i|\pi_i, \tau_i^2 \sim Beta(a_i, b_i), \quad \text{where} \quad \pi_i = \frac{a_i}{a_i + b_i} \quad \text{and} \quad \tau_i^{2} = \frac{1}{a_i + b_i + 1}
$$

$$
Y_i|P_i \sim Binomial(N_i, P_i)
$$

$$
\pi_i = \frac{e^{\beta_0 + \beta_1 x_{i}}}{1 + e^{\beta_0 + \beta_1 x_{i}}}, \quad \text{for} \quad i = 1, 2, \dots, n
$$
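
For concreteness, the parameterization above can be inverted: from $\tau_i^2 = 1/(a_i + b_i + 1)$ we get $a_i + b_i = 1/\tau_i^2 - 1$, so $a_i = \pi_i(1/\tau_i^2 - 1)$ and $b_i = (1 - \pi_i)(1/\tau_i^2 - 1)$. The following is only a minimal sketch of that conversion; the function names `beta_params` and `logistic_mean` and the numeric values of $\beta_0$, $\beta_1$, $x_i$ and $\tau_i^2$ are made up for illustration and do not come from the question.

```python
import math

def logistic_mean(beta0, beta1, x):
    """pi = exp(eta) / (1 + exp(eta)) with eta = beta0 + beta1 * x."""
    eta = beta0 + beta1 * x
    return 1.0 / (1.0 + math.exp(-eta))

def beta_params(pi, tau2):
    """Convert the mean pi and overdispersion tau2 to Beta shapes (a, b)."""
    if not (0.0 < pi < 1.0 and 0.0 < tau2 < 1.0):
        raise ValueError("need 0 < pi < 1 and 0 < tau2 < 1")
    total = 1.0 / tau2 - 1.0  # a + b, from tau2 = 1 / (a + b + 1)
    return pi * total, (1.0 - pi) * total

# Illustrative (hypothetical) values: eta = -1 + 0.5 * 2 = 0, so pi = 0.5.
pi = logistic_mean(beta0=-1.0, beta1=0.5, x=2.0)
a, b = beta_params(pi, tau2=0.2)
print(a, b)
```

Plugging the returned `a` and `b` back into $a_i/(a_i + b_i)$ and $1/(a_i + b_i + 1)$ should recover the chosen $\pi_i$ and $\tau_i^2$, up to floating-point error.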

I am wondering how I can obtain $E(P_i)$ and $Var(P_i)$ in terms of $\pi_i$ and $\tau_i^2$, so that I can then obtain the marginal mean and variance of $Y_i$.

Attempt:
I notice that I can use the law of total expectation and the law of total variance to obtain the marginal mean and variance of $P_i$:
$$
E(P_i) = E[E(P_i|\pi_i, \tau_i^2)] \quad \text{and} \quad Var(P_i) = E[Var(P_i|\pi_i, \tau_i^2)] + Var[E(P_i|\pi_i, \tau_i^2)]
$$

But I don't know what to substitute into these formulas to express them in terms of $\pi_i$ and $\tau_i^2$. Can anyone help me figure this out?

Best Answer

Using the well-known formulas for the mean and variance of a beta-distributed random variable, in terms of $\pi_i$ and $\tau_i^2$,
$$
E(P_i)=\frac{a_i}{a_i+b_i}=\pi_i
$$
and
$$
\operatorname{Var}P_i=\frac{a_ib_i}{(a_i+b_i)^2(a_i+b_i+1)}=\pi_i(1-\pi_i)\tau_i^2.
$$

Since $Y_i|P_i\sim \operatorname{bin}(N_i,P_i)$, using the law of total expectation,
$$
E(Y_i) = E E(Y_i|P_i) = EN_i P_i=N_i \pi_i
$$
and, using the law of total variance,
\begin{align} \operatorname{Var}Y_i&=\operatorname{Var}E(Y_i|P_i)+E\operatorname{Var}(Y_i|P_i) \\&=\operatorname{Var}N_i P_i+E(N_i P_i(1-P_i)) \\&=N_i^2\pi_i(1-\pi_i)\tau_i^2+N_i[EP_i-(EP_i)^2-\operatorname{Var}P_i] \\&=N_i^2\pi_i(1-\pi_i)\tau_i^2+N_i[\pi_i(1-\pi_i)-\pi_i(1-\pi_i)\tau_i^2] \\&=(\tau_i^2 N_i + 1-\tau_i^2)N_i\pi_i(1-\pi_i). \end{align}

As sanity checks, note that as $\tau_i^2$ tends to zero, $\operatorname{Var} Y_i$ tends to the binomial variance $N_i\pi_i(1-\pi_i)$. Similarly, as $a_i$ and $b_i$ tend to zero, $\tau_i^2$ tends to one and $\operatorname{Var}Y_i$, as expected, tends to $N_i^2\pi_i(1-\pi_i)$, the variance of $N_i$ times a Bernoulli random variable.
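
As a rough numerical check of the closed forms for $E(Y_i)$ and $\operatorname{Var}Y_i$, the sketch below compares them with moments computed by summing the beta-binomial probability mass function directly. The helper names `betabinom_pmf` and `marginal_moments` and the values $N_i = 20$, $\pi_i = 0.3$, $\tau_i^2 = 0.1$ are assumptions chosen for illustration only. The shape parameters are recovered from $a_i + b_i = 1/\tau_i^2 - 1$.

```python
import math

def betabinom_pmf(k, n, a, b):
    """Beta-binomial pmf: C(n, k) * B(k + a, n - k + b) / B(a, b)."""
    def log_beta(p, q):
        return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return math.comb(n, k) * math.exp(log_beta(k + a, n - k + b) - log_beta(a, b))

def marginal_moments(n, pi, tau2):
    """Closed-form marginal mean and variance of Y from the derivation."""
    mean = n * pi
    var = (tau2 * n + 1.0 - tau2) * n * pi * (1.0 - pi)
    return mean, var

# Illustrative (hypothetical) values, not taken from the question.
n, pi, tau2 = 20, 0.3, 0.1
total = 1.0 / tau2 - 1.0          # a + b, from tau2 = 1 / (a + b + 1)
a, b = pi * total, (1.0 - pi) * total

# Moments obtained by summing over the exact pmf.
pmf = [betabinom_pmf(k, n, a, b) for k in range(n + 1)]
num_mean = sum(k * p for k, p in enumerate(pmf))
num_var = sum((k - num_mean) ** 2 * p for k, p in enumerate(pmf))

mean, var = marginal_moments(n, pi, tau2)
print("mean:", mean, "vs summed pmf:", num_mean)
print("var: ", var, "vs summed pmf:", num_var)
```

With these values the closed form gives a mean of $20 \times 0.3 = 6$ and a variance of $(0.1 \times 20 + 0.9) \times 20 \times 0.3 \times 0.7 = 12.18$, and the summed-pmf values should agree up to floating-point error.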
