[Math] Variance of the sum of Bernoulli random variables

probability, random variables

$\newcommand{\var}{\operatorname{var}}$

Let $X_{i}$ be a Bernoulli random variable with parameter $p_{i}$, where $p_i$ is itself a random variable taking values in $[0,1]$. The expectation of $p_{i}$ is $\rho$, and $X_{i}$ is independent of $X_{j}$ for $i\neq j$.

Let $Y=\sum_{i=1}^{n} X_{i}$

Show that $\var(Y)>n\rho(1-\rho)$

This is what I have done so far:

$$\var(Y)=\var\left(\sum_{i=1}^n X_{i}\right) = \sum_{i=1}^n \var(X_{i})$$
$$\var(X_{i})=E[\var(X_{i}\mid p_{i})]+\var(E[X_i \mid p_i])$$
$$=E[p_i-p_i^2]+\var(p_i)$$
$$=\rho-E[p_i^2]+E[p_i^2]-E[p_i]^2$$
$$=\rho-\rho^2$$

So,
$$\var(Y)=\sum_{i=1}^n (\rho-\rho^{2})=n\rho(1-\rho)$$

What am I doing wrong? This is exactly the variance I would get if $p_i$ were a constant equal to $\rho$. When $p_i$ varies, surely that should increase the variance of $Y$. Any insight would be appreciated!

Best Answer

Apparently, this insight is not correct; in fact, I doubt that what the question asks can be proved at all:

When $p_i$ varies, surely that should increase the variance of $Y$.

Intuitively, for some values of $p_i$ the conditional variance of $X_i$ is lower than that of a Bernoulli with parameter $\rho$, and for other values it is higher. By the law of total variance, these effects balance out exactly as if $p_i$ were fixed at $\rho$.
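For a concrete check of that balancing, here is a minimal sketch, assuming (purely for illustration) that $p_i$ takes the values $0.2$ and $0.8$ with equal probability, so $\rho = 0.5$. The within-group and between-group terms of the law of total variance sum to exactly $\rho(1-\rho)$:

```python
# Hypothetical two-point mixing distribution for p_i (illustration only):
# p_i is 0.2 or 0.8 with equal probability, so rho = E[p_i] = 0.5.
ps = [0.2, 0.8]
weights = [0.5, 0.5]

rho = sum(w * p for w, p in zip(weights, ps))                  # E[p_i] = 0.5
within = sum(w * p * (1 - p) for w, p in zip(weights, ps))     # E[var(X_i | p_i)] = 0.16
between = sum(w * p**2 for w, p in zip(weights, ps)) - rho**2  # var(E[X_i | p_i]) = 0.09
print(within + between, rho * (1 - rho))                       # 0.25  0.25
```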

Here is an alternative proof:

Since $X_i$ takes only values $0$ and $1$, we have $X_i^2=X_i$ and $E[X_i^2] = E[X_i]$. Therefore, $$\operatorname{Var}(X_i) = E[X_i^2]-(E[X_i])^2 = E[X_i](1-E[X_i])$$ while by the law of total expectation, $$E[X_i] = E[E[X_i|p_i]] = E[p_i] = \rho$$ Hence, $$\operatorname{Var}(X_i) =\rho(1-\rho)$$
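The same conclusion can be checked by simulation. Below is a minimal Monte Carlo sketch; the specific choice $p_i \sim \operatorname{Beta}(2,3)$ (so $\rho = 0.4$) is an assumption made only for illustration. The empirical variance of $Y$ comes out close to $n\rho(1-\rho)$, not above it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup for the sketch: p_i ~ Beta(2, 3), so rho = E[p_i] = 0.4.
n, trials = 50, 200_000
a, b = 2.0, 3.0
rho = a / (a + b)

p = rng.beta(a, b, size=(trials, n))   # fresh p_1, ..., p_n for each trial
x = rng.random((trials, n)) < p        # X_i ~ Bernoulli(p_i), independent across i
y = x.sum(axis=1)                      # Y = X_1 + ... + X_n

print(y.var(), n * rho * (1 - rho))    # both approximately 12.0
```

Any other mixing distribution with the same mean $\rho$ gives the same result, which is exactly what the proof above says.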
