Comparing the class standard mean and variance to the state average and variance

statistics variance

Say I'm a classroom teacher and I want to find the average and variance of my class in a recent exam they did. I imagine that I would consider my class the entire population and so when I calculate the variance I would use $n$ as the denominator.

However, what if this same exam is actually set for all students in the state now?

Case A:
In today's big data world, it is quite possible that the state average and variance are calculated over all students in the state, and then again $n$ would be used for the denominator in the variance.

Case B:
But what if the state average and variance are calculated by taking, say, 10 random samples from every school in the state, and then computing the average and variance over all these samples combined? Then I imagine the denominator would be $n-1$ for the variance.

My question is: if Case B is the actual case, then to compare my class average and variance to the state average and variance, should I now be calculating my class variance with $n-1$?

My opinion is that I should still use $n$, and that the result is comparable to the estimate of the state variance calculated as in Case B. However, I do not know for sure and would like some advice.

Thank you.

Best Answer

We should distinguish "estimating a variance" based on a random subsample and "calculating a variance" based on all of the data.

If you have $n$ students in your class, and you know the grades for all of them, then you can just calculate the mean and variance using the denominator $n$.
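To make the two denominators concrete, here is a minimal sketch using Python's standard `statistics` module. The grades are made up for illustration, and the variable names are my own.

```python
import statistics

# Made-up grades for a hypothetical class of five students.
class_grades = [60, 70, 80, 90, 100]

class_mean = statistics.mean(class_grades)
# pvariance divides by n: the class is treated as the whole population.
class_variance = statistics.pvariance(class_grades)
# variance divides by n - 1: only appropriate if the grades were a sample.
sample_style_variance = statistics.variance(class_grades)

print(class_mean, class_variance, sample_style_variance)
```

Here the squared deviations from the mean of 80 sum to 1000, so the denominator $n = 5$ gives 200 and the denominator $n - 1 = 4$ gives 250. For a class where every grade is known, the first value is the one to report.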

Same for calculating the variance statewide (your case A). Say there are $N$ students in the state and you know all of their grades - then you would just calculate mean and variance using the denominator $N$.

Things are different when you consider case B: you cannot calculate the statewide variance exactly, because you only have a random subsample of all grades. Hence, you need to estimate the variance from that subsample. If the subsample contains $m$ grades, you use the denominator $m-1$ (not $N-1$, since $N$ is the number of students in the whole state); with the denominator $m$ the estimate would be biased downwards.
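The bias can be checked with a small simulation. This is only a sketch under assumptions of my own: the "state" is 5,000 made-up grades drawn uniformly between 0 and 100, each subsample has $m = 10$ grades (where $m$ is my notation for the subsample size), and the helper `variance` is written here rather than taken from a library.

```python
import random

def variance(values, ddof=0):
    """Variance with denominator len(values) - ddof."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - ddof)

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical "state": 5,000 made-up grades between 0 and 100.
state_grades = [random.randint(0, 100) for _ in range(5000)]
true_variance = variance(state_grades)  # all data known: denominator N

m = 10           # size of each random subsample
repeats = 20000  # number of subsamples to average over
biased_total = 0.0
unbiased_total = 0.0
for _ in range(repeats):
    sample = random.sample(state_grades, m)     # drawn without replacement
    biased_total += variance(sample, ddof=0)    # denominator m
    unbiased_total += variance(sample, ddof=1)  # denominator m - 1

mean_biased = biased_total / repeats
mean_unbiased = unbiased_total / repeats

print(f"statewide variance (denominator N):  {true_variance:.1f}")
print(f"average estimate, denominator m:     {mean_biased:.1f}")
print(f"average estimate, denominator m - 1: {mean_unbiased:.1f}")
```

Averaged over many subsamples, the estimate with denominator $m - 1$ should land close to the statewide variance, while the estimate with denominator $m$ comes out smaller by a factor of $(m-1)/m$, which is 0.9 here.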

It would then make sense to compare the estimated variance for the state to the calculated variance in your class if you do not have all of the data in the state. However, if you can calculate the statewide variance, then it would make more sense to use that for the comparison.
