Why do these two derivations of the variance of a binomial random variable give different answers?

binomial distribution, descriptive statistics, probability, statistics

I am learning AP Statistics, and I have come across the derivation for the variance of a binomial random variable.

The derivation goes something like this:
Suppose you have a binomial random variable $X$ with $n$ trials and probability of success $p$. Write $X = Y + Y + \cdots + Y = nY$ as the sum of $n$ Bernoulli random variables $Y$, each with probability of success $p$. Then the variance is $\sigma^2_X = \sigma^2_Y + \sigma^2_Y + \cdots + \sigma^2_Y = n\sigma^2_Y = np(1-p)$, which gives the correct answer. However, shouldn't it also be possible to compute $\sigma^2_X = \sigma^2_{nY} = n^2\sigma^2_Y = n^2p(1-p)$? Why does this give a different (and incorrect) answer?
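
A quick simulation (a minimal NumPy sketch; the particular choices $n=10$, $p=0.3$, the seed, and the variable names are just illustrative) shows which of the two formulas matches the actual spread of a binomial count:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, trials = 10, 0.3, 200_000

# Sum of n independent Bernoulli(p) draws per trial: a genuine Binomial(n, p) count.
independent_sum = rng.binomial(1, p, size=(trials, n)).sum(axis=1)

# n copies of a single Bernoulli(p) draw per trial, i.e. the variable n*Y.
scaled_single = n * rng.binomial(1, p, size=trials)

print("sample variance of Y_1 + ... + Y_n:", independent_sum.var())  # close to n*p*(1-p)   = 2.1
print("sample variance of n*Y:            ", scaled_single.var())    # close to n^2*p*(1-p) = 21.0
```

The first sample variance should settle near $np(1-p)=2.1$ and the second near $n^2p(1-p)=21$, so the two expressions describe two genuinely different random variables.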

Best Answer

In this context ($X$ has a binomial distribution with parameters $n$ and $p$) it is essentially wrong to write: $$X=Y+Y+\cdots+Y\tag1$$

This should be: $$X=Y_1+Y_2+\cdots+Y_n\tag2$$ We are dealing with $n$ independent (hence distinct) experiments, each of which can succeed or fail.

The variance of the sum in $(2)$ equals the sum of the variances because the $Y_i$ are independent, and this gives $np(1-p)$.
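
Spelled out with the Bernoulli moments (a routine intermediate step, added for completeness): since each $Y_i$ takes the value $1$ with probability $p$ and $0$ otherwise,
$$\sigma^2_{Y_i}=E[Y_i^2]-\big(E[Y_i]\big)^2=p-p^2=p(1-p),\qquad \sigma^2_X=\sum_{i=1}^{n}\sigma^2_{Y_i}=np(1-p).$$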

That is not true for the sum in $(1)$, where independence fails (every term is the same random variable $Y$), and the result is $n^2p(1-p)$.
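
Equivalently, the general variance-of-a-sum formula shows exactly where the two computations diverge: the covariance terms that vanish in $(2)$ do not vanish in $(1)$, because every summand there is the same variable $Y$ with $\operatorname{Cov}(Y,Y)=\sigma^2_Y$:
$$\sigma^2_{Y+Y+\cdots+Y}=\sum_{i=1}^{n}\sigma^2_Y+\sum_{i\neq j}\operatorname{Cov}(Y,Y)=np(1-p)+n(n-1)p(1-p)=n^2p(1-p)=\sigma^2_{nY}.$$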