Variance – How to Estimate Variance of a Population If Population Mean is Known

samplevariance

I know that we use $\frac1{n-1}\sum\limits_i(x_i - \bar{x})^2$ to estimate the variance of a population. I remember a video from Khan Academy where the intuition given was that our estimated mean is probably a bit off the actual one so the distances $x_i - \bar{x}$ would actually be greater, so we divide by less ($n-1$ instead of $n$) to get a greater value, resulting in a better estimate.
And I remember reading somewhere that I don't need this correction if I have the actual population mean $\mu$ instead of $\bar{x}$. So I would estimate the variance with $\frac1{n}\sum\limits_i(x_i - \mu)^2$.
But I can't find it anymore. Is it true? Can someone give me a pointer?

Best Answer

Yes, it is true. In the language of statistics, we would say that if you have no knowledge of the population mean, then the quantity

$$\frac{1}{n-1} \sum_{i=1}^n \left(x_i-\bar{x} \right)^2$$

is unbiased, which simply means that it estimates the population variance correctly on average. But if you do know the population mean, there is no need to estimate it (that is the role $\bar{x}$ plays), and so no need for the finite-sample correction that comes with using $\bar{x}$.
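One way to see where the $n-1$ comes from is the short calculation below. It is only a sketch, and it assumes the $x_i$ are independent draws with mean $\mu$ and finite variance $\sigma^2$ (the symbol $\sigma^2$ is introduced here for the population variance):

$$\sum_{i=1}^n \left(x_i-\bar{x} \right)^2 = \sum_{i=1}^n \left(x_i-\mu \right)^2 - n\left(\bar{x}-\mu \right)^2$$

$$E\left[\sum_{i=1}^n \left(x_i-\bar{x} \right)^2\right] = n\sigma^2 - n\cdot\frac{\sigma^2}{n} = (n-1)\sigma^2$$

So the sum of squares around $\bar{x}$ falls short of the sum around $\mu$ by $\sigma^2$ on average, and dividing by $n-1$ compensates for exactly that. The sum around $\mu$ has expectation $n\sigma^2$, so dividing by $n$ is already correct when $\mu$ is known.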

In fact, it can be shown that the quantity

$$\frac{1}{n} \sum_{i=1}^n \left(x_i-\mu \right)^2$$

is not only unbiased but also has lower variance than the $n-1$ estimator above. This is quite intuitive, as the uncertainty that came from estimating the mean has now been removed. So when $\mu$ is known, this is the estimator to use.
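Both claims can be checked numerically. The following is a minimal simulation sketch, not a proof: it assumes normally distributed data, and the function name `compare_variance_estimators` and its parameters (`n`, `trials`, `mu`, `sigma`, `seed`) are made up for this illustration.

```python
import numpy as np


def compare_variance_estimators(n=10, trials=200_000, mu=0.0, sigma=2.0, seed=0):
    """Simulate both estimators on many samples of size n (hypothetical helper).

    Assumes normal data with known mean `mu` and standard deviation `sigma`.
    """
    rng = np.random.default_rng(seed)
    # Each row is one sample of size n.
    x = rng.normal(mu, sigma, size=(trials, n))

    # Mean unknown: deviations from the sample mean, divided by n - 1.
    est_unknown_mean = x.var(axis=1, ddof=1)

    # Mean known: deviations from the true mean mu, divided by n.
    est_known_mean = np.mean((x - mu) ** 2, axis=1)

    return {
        "true_variance": sigma ** 2,
        "avg_unknown_mean": est_unknown_mean.mean(),
        "avg_known_mean": est_known_mean.mean(),
        "spread_unknown_mean": est_unknown_mean.var(),
        "spread_known_mean": est_known_mean.var(),
    }


result = compare_variance_estimators()
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions both averages should land close to the true variance $\sigma^2$, while the spread of the known-mean estimator should come out smaller. For normal data the theoretical variances are $2\sigma^4/(n-1)$ for the first estimator and $2\sigma^4/n$ for the second, so increasing `n` in the sketch shrinks the gap between them.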

It is worth noting that the two estimators differ very little for large samples, and hence they are asymptotically equivalent.