So I know that the generic formula for the sample variance is:
$$
S_n^2=\frac{1}{n-1}\sum_{i=1}^n(X_i-\overline{X}_n)^2.
$$
And I have this question:
$Y$ takes the values $0$ and $1$.
$\operatorname{Pr}(Y=1)=p=0.78$
$\operatorname{Pr}(Y=0)=1-p=0.22$
I do not understand why, in order to calculate the variance, I can and should use the formula
$\operatorname{Var}(Y)= E\left[(Y-E(Y))^2\right]$, which is the same as $p(1-p)$.
What are these formulas? How come one is equal to the other? How do we get to them?
Can somebody give me a simple explanation of why the variance in this problem is $p(1-p)$?
The answer I get is $0.78 \times (1-0.78)= 0.1716$.
Best Answer
If you take $n$ very large, you expect to observe $Y=1$ about $np$ times and $Y=0$ about $n(1-p)$ times.
The average value is thus
$$\overline Y=\frac{np\cdot 1+n(1-p)\cdot 0}{n}=p.$$
From this, the sample variance is
$$\sigma_Y^2=\frac1{n-1}\left(np(1-p)^2+n(1-p)(0-p)^2\right)=\frac{np(1-p)\left[(1-p)+p\right]}{n-1}=\frac{np(1-p)}{n-1}.$$
As $n$ tends to infinity, $n/(n-1)$ tends to $1$, so the formula simplifies to
$$\sigma_Y^2=p(1-p).$$
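The argument above can be checked numerically. The following is only a sketch under stated assumptions: the function names `idealized_sample_variance` and `simulated_sample_variance`, the sample sizes, and the seed are illustrative choices and not part of the question. The first function builds the idealized sample with exactly $np$ ones (rounded to an integer) and $n(1-p)$ zeros; the second draws random $0/1$ values. Both use the $n-1$ denominator from the sample variance formula.

```python
import random
import statistics


def idealized_sample_variance(n, p):
    """Sample variance (n-1 denominator) of a sample with round(n*p) ones
    and the remaining entries equal to zero (illustrative helper)."""
    n_ones = round(n * p)
    data = [1] * n_ones + [0] * (n - n_ones)
    return statistics.variance(data)


def simulated_sample_variance(n, p, seed=0):
    """Sample variance of n random draws with Pr(Y=1) = p (illustrative helper)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    data = [1 if rng.random() < p else 0 for _ in range(n)]
    return statistics.variance(data)


if __name__ == "__main__":
    p = 0.78
    print("p(1-p) =", p * (1 - p))
    for n in (100, 10_000, 100_000):
        exact = n * p * (1 - p) / (n - 1)  # the finite-n formula from the answer
        print(n, idealized_sample_variance(n, p), exact,
              simulated_sample_variance(n, p))
```

For the idealized sample, the computed value should agree with $np(1-p)/(n-1)$ up to rounding. The simulated value fluctuates around $p(1-p)$ and gets closer as $n$ grows.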