Solved – Difference between sample variance and statistic variance

sample, self-study, variance

Conceptually, we learn that in the sample variance formula we have to divide by $n-1$ so that it gives an unbiased estimate of the true variance.

However, consider this example:

let each observation $X_i$ equal $1$ with probability $p$, and $0$ with probability $1-p$.

If we define the statistic $\hat{p} = \frac{1}{N} \sum_{i=1}^{N} X_i$,

we find that $E(\hat{p}) = \frac{1}{N} N p = p$; and

working from first principles (and assuming the $X_i$ are independent), we can find that the variance of $\hat{p}$

$= \textrm{var}\left(\frac{1}{N} \sum X_i\right)$

$= \frac{1}{N^2} \sum \textrm{var}(X_i)$

$= \frac{1}{N^2} Np(1-p)$

$= \frac{p(1-p)}{N}$
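
As a rough numerical check of the derivation above, here is a minimal simulation sketch. The function name `simulate_p_hat` and the values of `p`, `n`, and `trials` are illustrative assumptions, not part of the original problem; the draws are assumed independent.

```python
import random
import statistics

def simulate_p_hat(p, n, trials, seed=0):
    """Draw `trials` independent samples of size n and return p_hat for each."""
    rng = random.Random(seed)
    # Each X_i is 1 with probability p, else 0; p_hat is the sample mean.
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(trials)]

p, n = 0.3, 50  # illustrative values only
estimates = simulate_p_hat(p, n, trials=20000)

empirical_mean = statistics.fmean(estimates)     # should be close to p
empirical_var = statistics.pvariance(estimates)  # spread of p_hat across repeated samples
theoretical_var = p * (1 - p) / n                # the formula derived above

print(f"mean of p_hat:     {empirical_mean:.4f} (theory {p})")
print(f"variance of p_hat: {empirical_var:.5f} (theory {theoretical_var:.5f})")
```

The variance measured here is the spread of $\hat{p}$ over many repeated samples, not the spread of the $X_i$ within one sample.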

So my question is: is the variance obtained here the same as the one given by the sample variance formula (since there is no division by $n-1$)? If not, which is the correct one?

Thank you.

Edit: Is it because the $N-1$ formula concerns the sample variance, whereas in my example we are actually looking at the variance of the sample mean?

Best Answer

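In your example, you are indeed confusing the variance of the sample mean with the variance of the population. The population variance here is $\textrm{var}(X_i) = p(1-p)$, which is what the sample variance with the $N-1$ divisor estimates; the quantity you derived, $p(1-p)/N$, is the variance of the sample mean $\hat{p}$. To understand where the $N-1$ factor comes from, you can follow the geometry of a multivariate normal distribution and Cochran's theorem, which describes the decomposition of the sums of squares.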
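
The distinction can also be seen numerically. The sketch below is only an illustration under stated assumptions: the helper `average_variance_estimates` and the values of `p`, `n`, and `trials` are hypothetical choices, and the draws are assumed independent. It averages the divide-by-$N$ and divide-by-$(N-1)$ variance estimates over many small samples and compares both with $p(1-p)$.

```python
import random
import statistics

def average_variance_estimates(p, n, trials, seed=1):
    """Average the two within-sample variance estimates over many samples."""
    rng = random.Random(seed)
    biased, unbiased = [], []
    for _ in range(trials):
        sample = [1 if rng.random() < p else 0 for _ in range(n)]
        biased.append(statistics.pvariance(sample))   # divides by n
        unbiased.append(statistics.variance(sample))  # divides by n - 1
    return statistics.fmean(biased), statistics.fmean(unbiased)

p, n = 0.3, 5  # small n makes the bias easy to see; values are illustrative
mean_biased, mean_unbiased = average_variance_estimates(p, n, trials=40000)
true_var = p * (1 - p)  # variance of a single X_i

print(f"true var(X_i):           {true_var:.4f}")
print(f"average with divisor n:  {mean_biased:.4f} (expected {(n - 1) / n * true_var:.4f})")
print(f"average with divisor n-1: {mean_unbiased:.4f}")
```

Both quantities here estimate the variance of a single $X_i$. Dividing either by $N$ would give an estimate of the variance of $\hat{p}$, which is the quantity derived in the question.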