Understanding the formula of sample variance

variance

I know that the general formula for the sample variance is:

$$
S_n^2=\frac{1}{n-1}\sum_{i=1}^n(X_i-\overline{X}_n)^2.
$$

And I have this question:

$Y$ takes values of $0$ and $1$.

$\operatorname{Pr}(Y=1)=p=0.78$

$\operatorname{Pr}(Y=0)=1-p=0.22$

I do not understand why, in order to calculate the variance, I can and should use the formula:

$\operatorname{Var}(Y)= E\left[(Y-E(Y))^2\right]$, which is the same as $p(1-p)$.

What are these formulas? How come one is equal to the other? How do we get to them?

Can somebody give me a simple explanation of why the variance in this problem is $p(1-p)$?

The answer I get is $0.78 \times (1-0.78)= 0.1716$.

Best Answer

If you take $n$ very large, you expect about $np$ observations with $Y=1$ and about $n(1-p)$ observations with $Y=0$.

The average value is thus

$$\overline Y=\frac{np\,1+n(1-p)\,0}{n}=p.$$

From this, the sample variance is

$$\sigma_Y^2=\frac1{n-1}\left(np(1-p)^2+n(1-p)(0-p)^2\right)=\frac{np(1-p)\left[(1-p)+p\right]}{n-1}=\frac{np(1-p)}{n-1}.$$
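The finite-$n$ formula can be checked by hand with exact arithmetic. The sketch below is only an illustration: it assumes $n=100$ (a value chosen here so that $np$ is an integer) and builds the idealized sample with exactly $np$ ones and $n(1-p)$ zeros. The helper name `sample_variance` is made up for this example.

```python
from fractions import Fraction


def sample_variance(values):
    """Sample variance with the 1/(n-1) factor, as in the formula for S_n^2."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


# Assumed illustrative values: n = 100 and p = 78/100, so n*p is an integer.
n = 100
p = Fraction(78, 100)
ones = int(n * p)  # number of observations with Y = 1

# Idealized sample: exactly n*p ones and n*(1-p) zeros.
sample = [Fraction(1)] * ones + [Fraction(0)] * (n - ones)

direct = sample_variance(sample)          # straight from the definition
closed_form = n * p * (1 - p) / (n - 1)   # n p (1-p) / (n-1)

print(direct, closed_form, direct == closed_form)  # prints: 13/75 13/75 True
```

Here $13/75\approx 0.1733$, slightly above $p(1-p)=0.1716$ because of the factor $n/(n-1)$.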

As $n$ tends to infinity, $n/(n-1)\to 1$, so the formula simplifies to

$$\sigma_Y^2=p(1-p).$$
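The same limit can be seen numerically. The following is a minimal simulation sketch, not a proof: it draws random 0/1 values with $p=0.78$, computes the sample variance with the $1/(n-1)$ factor, and compares it with the population variance $E\left[(Y-E(Y))^2\right]$ written out as a weighted sum over the two possible values. The function name, the sample sizes, and the seed are choices made for this example.

```python
import random


def bernoulli_sample_variance(p, n, seed=0):
    """Draw n values of Y (1 with probability p, else 0) and return S_n^2."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    ys = [1 if rng.random() < p else 0 for _ in range(n)]
    mean = sum(ys) / n
    return sum((y - mean) ** 2 for y in ys) / (n - 1)


p = 0.78

# Population variance from the definition E[(Y - E(Y))^2], with E(Y) = p:
# the value 1 occurs with probability p, the value 0 with probability 1 - p.
population_variance = (1 - p) ** 2 * p + (0 - p) ** 2 * (1 - p)

for n in (100, 10_000, 200_000):
    print(n, bernoulli_sample_variance(p, n), population_variance)
```

The population variance is $0.1716$ up to floating-point rounding, and the sample variances should settle near it as $n$ grows, since the sample contains roughly $np$ ones and $n(1-p)$ zeros.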
