Solved – Unbiased estimator of variance for a sample drawn from a finite population without replacement

mathematical-statistics, unbiased-estimator, variance

Until now, I believed it was a correct conclusion that $S^2$, defined as

$$S^2 = \frac{1}{n-1}\sum_{i=1}^n{\left(X_i-\bar{X}\right)^2}$$

is an unbiased estimator of $\sigma^2$.

However, I found the following statement:

Considering the sample variance:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(y_i -\bar{y}\right)^2$$

it can be shown (see Appendix A, Derivations) that

$$E(s^2) = \frac{N}{N-1}\sigma^{2}$$

This example is based on a simple random sample drawn without replacement. It says that $S^2$ is a biased estimator of $\sigma^2$.

So I am wondering: does the statement "$S^2$ is an unbiased estimator of $\sigma^2$" hold only in specific cases? How should I understand this result for a simple random sample?

Best Answer

When sampling from a finite population without replacement, the observations are negatively correlated with each other, and the sample variance $s^2 = \frac{1}{n-1} \sum_i \left( x_i - \bar{x} \right)^2$ is a slightly biased estimate of the population variance $\sigma^2$.

The derivation in this link from Robert Serfling provides a clear explanation of what's going on. The author first proves that if the observations in a sample share a common mean and variance $\sigma^2$ and have constant covariance (i.e. $\mathrm{Cov}\left(x_i, x_j \right) = \gamma$ for all $i\neq j$), then: $$ E[s^2] = \sigma^2 - \gamma$$
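For readers who want the intermediate steps, here is a sketch of why that identity holds. It assumes every $x_i$ has the same mean (written $\mu$ here), the same variance $\sigma^2$, and common covariance $\gamma$, with $n$ the sample size. The symbol $\mu$ is notation introduced only for this sketch, and this is not necessarily the route the linked notes take.

$$\begin{aligned}
\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2 &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \\
E\left[x_i^2\right] &= \sigma^2 + \mu^2 \\
\mathrm{Var}\left(\bar{x}\right) &= \frac{1}{n^2}\left(n\sigma^2 + n(n-1)\gamma\right) = \frac{\sigma^2 + (n-1)\gamma}{n} \\
E\left[\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2\right] &= n\left(\sigma^2+\mu^2\right) - n\left(\frac{\sigma^2 + (n-1)\gamma}{n} + \mu^2\right) = (n-1)\left(\sigma^2-\gamma\right)
\end{aligned}$$

Dividing the last line by $n-1$ gives $E[s^2] = \sigma^2 - \gamma$.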

For independent draws (hence $\gamma = 0$), you have $E[s^2] = \sigma^2$ and the sample variance is an unbiased estimate of the population variance. But the issue you have with sampling without replacement from a finite population is that your draws are negatively correlated with each other!

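In the case of sampling without replacement from a population of size $N$, where $\sigma^2$ is the population variance computed with divisor $N$: $$ \text{For $i\neq j$ }\quad \mathrm{Cov}\left(x_i, x_j \right) = \frac{-\sigma^2}{N-1}$$ Substituting $\gamma = -\sigma^2/(N-1)$ into the result above gives: $$ E\left[s^2\right] = \sigma^2 + \frac{\sigma^2}{N-1} = \frac{N}{N-1}\sigma^2 $$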
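One way to convince yourself of both cases is to compute $E[s^2]$ exactly for a tiny population by enumerating every possible sample. The sketch below is only an illustration: the population values are arbitrary, the function names are made up for this example, and $\sigma^2$ is taken to be the population variance with divisor $N$. It uses exact rational arithmetic, so the comparison involves no rounding error.

```python
from fractions import Fraction
from itertools import combinations, product


def population_variance(pop):
    """Population variance sigma^2, using divisor N."""
    N = len(pop)
    mean = Fraction(sum(pop), N)
    return sum((v - mean) ** 2 for v in pop) / N


def sample_variance(sample):
    """Sample variance s^2, using divisor n - 1."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((v - mean) ** 2 for v in sample) / (n - 1)


def expected_s2(pop, n, replace):
    """Exact E[s^2]: average of s^2 over all equally likely samples of size n."""
    if replace:
        # Ordered draws with replacement: N**n equally likely tuples.
        samples = list(product(pop, repeat=n))
    else:
        # Without replacement: every size-n subset is equally likely,
        # and s^2 does not depend on the order of the draws.
        samples = list(combinations(pop, n))
    return sum(sample_variance(s) for s in samples) / len(samples)


population = [2, 3, 5, 7, 11, 13]  # arbitrary illustrative values (integers)
N = len(population)
sigma2 = population_variance(population)

for n in range(2, 5):
    without = expected_s2(population, n, replace=False)
    with_repl = expected_s2(population, n, replace=True)
    # Compare against the two theoretical values: N/(N-1) * sigma^2 and sigma^2.
    print(n, without == Fraction(N, N - 1) * sigma2, with_repl == sigma2)
```

Under these assumptions, the without-replacement average should match $\frac{N}{N-1}\sigma^2$ for every sample size $n$, while the with-replacement average should match $\sigma^2$. Note that the bias factor depends only on $N$, not on $n$.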
