Solved – Variance of a sample – proof

proof, standard deviation, variance

On page 72 of Introductory Statistics, A Conceptual Approach Using R (Routledge, 2012), the authors first compute the variance of a sample of size $n$ using:

$$\sigma^2=\dfrac{\sum_{i=1}^n(Y_i-\mu)^2}{n}$$

Then, because they do not know the mean $\mu$ of the population, they replace it with the sample mean $\overline{Y}$:

$$\hat{\sigma}^2=\dfrac{\sum_{i=1}^n(Y_i-\overline{Y})^2}{n}$$

Next they say they use "expectation algebra" to show that:

$$E(\hat{\sigma}^2)=\sigma^2-\frac{\sigma^2}{n}$$

I've tried a number of things. For example, I tried:

$$\begin{align*}
E(\hat{\sigma}^2)
&=E\left[\frac{\sum(Y-\overline{Y})^2}{n}\right]\\
&=\frac1n E\left[\sum Y^2-2\overline{Y}\sum Y+\sum\overline{Y}^2\right]\\
&=\frac1n E\left[\sum Y^2-n\overline{Y}^2\right]\\
&=\frac1nE\left[\sum Y^2\right]-\overline{Y}^2
\end{align*}$$

But I have been unable to make this equal to $\sigma^2-\sigma^2/n$. Any suggestions would be helpful, allowing me to continue my reading.

Best Answer

I haven't checked that reference, but I assume the $Y_i$'s are independent with $E(Y_i)=\mu$ and $Var(Y_i)=\sigma^2$ for $i=1,2,\dots,n$, i.e. all the observations have the same (finite) mean $\mu$ and (finite) variance $\sigma^2$.

First note that $E(Y_i^2)=Var(Y_i)+E^2(Y_i)=\sigma^2+\mu^2$. Also, for $\bar{Y}=\dfrac{\sum_{i=1}^n Y_i}{n}$ we have:

$$E(\bar{Y})=\dfrac{\sum_{i=1}^n E(Y_i)}{n}=\dfrac{n\mu}{n}=\mu$$

In addition, by the independence of the $Y_i$'s, we have:

$$Var(\bar{Y})=\dfrac{\sum_{i=1}^n Var(Y_i)}{n^2}=\dfrac{n\sigma^2}{n^2}=\dfrac{\sigma^2}{n}$$

Now it is easy to find $E(\bar{Y}^2)=Var(\bar{Y})+E^2(\bar{Y})=\sigma^2/n+\mu^2$.

In the last line you wrote, you should take the expectation of $\bar{Y}^2$ as well, because $\bar{Y}$ is a random variable and not a constant:

$$E(\hat{\sigma}^2)=\dfrac{1}{n}E\left(\sum_{i=1}^n Y_i^2\right)-E(\bar{Y}^2)=\dfrac{1}{n}\cdot n\cdot E(Y_i^2)-\dfrac{\sigma^2}{n}-\mu^2$$

Now substitute $E(Y_i^2)=\sigma^2+\mu^2$ to get $E(\hat{\sigma}^2)=\sigma^2+\mu^2-\sigma^2/n-\mu^2=\sigma^2-\sigma^2/n$.
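If it helps, the result can also be checked numerically. The following is only a minimal simulation sketch, not something from the book: it assumes normally distributed observations (the derivation itself only needs independence, a common mean, and a common finite variance), and the function names and the values of `n`, `mu`, `sigma`, and `trials` are made up for illustration.

```python
import random


def biased_variance(sample):
    """Sum of squared deviations from the sample mean, divided by n (not n - 1)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((y - mean) ** 2 for y in sample) / n


def average_biased_variance(n, mu, sigma, trials, seed=0):
    """Average the biased estimator over many independent samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(trials):
        # Assumption: observations are drawn from a normal distribution.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        total += biased_variance(sample)
    return total / trials


# Illustrative values only.
n, mu, sigma = 5, 10.0, 2.0
estimate = average_biased_variance(n, mu, sigma, trials=100_000)
expected = sigma**2 - sigma**2 / n  # sigma^2 - sigma^2/n from the derivation

print(f"simulated mean of the estimator: {estimate:.3f}")
print(f"sigma^2 - sigma^2/n:             {expected:.3f}")
```

With a large number of trials the two printed values should be close to each other, and both noticeably below $\sigma^2$, which is the bias the book is pointing out.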
