[Math] Show that sample variance is unbiased and a consistent estimator

statistical-inference statistics

I am having some trouble proving that the sample variance is a consistent estimator. I have already proved that the sample variance is unbiased. I understand that a point estimator $T_n$ is consistent if $T_n$ converges in probability to $\theta$. However, I am not sure how to approach this beyond starting from the formula for the sample variance. Any help would be greatly appreciated. Thank you in advance.

Best Answer

If one were to assume $X_1,X_2,X_3,\ldots\sim\text{i.i.d. }N(\mu,\sigma^2)$, I would start with the fact that the sample variance has a scaled chi-square distribution. Maybe you'd want to prove that, or maybe you can just cite the theorem saying that is the case, depending on what you're doing.

Let's see if we can do this with weaker assumptions. Rather than saying the observations are normally distributed or identically distributed, let us just assume they all have expectation $\mu$ and variance $\sigma^2$, and rather than independence let us assume uncorrelatedness.

The sample variance is $$ S_n^2 = \frac 1 {n-1} \sum_{i=1}^n (X_i-\bar X_n)^2 \text{ where } \bar X_n = \frac{\sum_{i=1}^n X_i} n. \tag 0 $$ We want to prove $$ \text{for all }\varepsilon>0,\ \lim_{n\to\infty} \Pr(|S_n^2 - \sigma^2| < \varepsilon) = 1. $$

Notice that the MLE for the variance is $$ \frac 1 n \sum_{i=1}^n (X_i-\bar X_n)^2 \tag 1 $$ and this is also sometimes called the sample variance. The weak law of large numbers says this converges in probability to $\sigma^2$, because it is the sample mean when one's samples are finite initial segments of the sequence $\left\{ (X_i-\bar X_n)^2 \right\}_{i=1}^\infty$.
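
One way to flesh out that step (a sketch, using only the uncorrelated, common-mean-and-variance assumptions above) is to expand the square around $\mu$: $$ \frac 1 n \sum_{i=1}^n (X_i-\bar X_n)^2 = \frac 1 n \sum_{i=1}^n (X_i-\mu)^2 - (\bar X_n-\mu)^2. $$ The first term is an honest sample mean of the terms $(X_i-\mu)^2$, each with expectation $\sigma^2$, and the second term goes to $0$ in probability by Chebyshev's inequality, since $\operatorname{E}\left[(\bar X_n-\mu)^2\right]=\sigma^2/n$. And since $S_n^2$ is $\frac n{n-1}$ times the quantity in $(1)$ and $\frac n{n-1}\to 1$, the same limit holds for $S_n^2$.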

The only proof of the weak law of large numbers that I know off the top of my head assumes a finite variance, and here the variance in question is $\operatorname{var}(S_n^2)$. That quantity is finite if $\operatorname{E}(X_n^4)<\infty$, which goes beyond the assumptions we made above.
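
For what it's worth, here is a sketch of why a fourth moment is what's needed: applying the weak law to averages of the terms $(X_i-\mu)^2$ requires those terms to have finite variance, and $$ \operatorname{var}\left((X_i-\mu)^2\right) = \operatorname{E}\left[(X_i-\mu)^4\right] - \sigma^4, $$ which is finite precisely when the fourth central moment is.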

However, if $X_1,X_2,X_3,\ldots$ are i.i.d. and normally distributed, then we have $$ (n-1) \frac{S_n^2}{\sigma^2} \sim \chi^2_{n-1} $$ (a fact whose proof I've posted in response to several Stack Exchange questions). In that case we can rely on the fact that $\operatorname{var}(\chi^2_{n-1})<\infty$, and I suspect we can get a shorter proof by using simple known properties of the chi-square distribution.
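
In case it's useful, here is roughly how that shorter argument could go (a sketch, using only $\operatorname{E}(\chi^2_{n-1})=n-1$ and $\operatorname{var}(\chi^2_{n-1})=2(n-1)$): $$ \operatorname{var}(S_n^2) = \frac{\sigma^4}{(n-1)^2}\operatorname{var}(\chi^2_{n-1}) = \frac{2\sigma^4}{n-1} \to 0, $$ and since $\operatorname{E}(S_n^2)=\sigma^2$, Chebyshev's inequality gives $$ \Pr(|S_n^2-\sigma^2|\ge\varepsilon) \le \frac{2\sigma^4}{(n-1)\varepsilon^2} \to 0, $$ which is exactly the consistency statement above.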

So I've omitted lots of details above and possibly left you with some unanswered questions. Perhaps you can specify what those questions are in comments or as a separate question.
