Sampling Distribution of Variances

probability statistics

Please consider this problem and my solution to it. I get the feeling my approach is way off.
Problem:
A normal population has a variance of $15$. If samples of size $5$ are
drawn from this population, what percentage can be expected to have
variances less than $10$?
Answer
The sample variance has a chi-square distribution with $4$ degrees of freedom. Also observe that $\frac{10}{15} = 0.666667$. That is, you need to adjust for the population variance. I then went to this website:
      https://stattrek.com/online-calculator/chi-square.aspx
and I entered $4$ for the degrees of freedom and $0.666667$ for the Chi-Square
critical value. I then got an answer of $0.05$ but the book gets an answer of
$0.50$.
What did I do wrong?
Thanks,
Bob

Best Answer

Both look wrong to me. Your error is that the sample variance is only proportional to a chi-square random variable with $4$ degrees of freedom, and you have missed the constant of proportionality, which is $4$ or $5$ depending on how you calculate sample variances. You should try $4 \times \frac{10}{15}$ or $5 \times \frac{10}{15}$, rather than just $\frac{10}{15}$, as the value you are testing.

If $X_1,X_2,\ldots,X_n$ were i.i.d. $\sim N(\mu,1)$, then I would have thought that $$\sum_i (X_i-\bar X)^2 = (n-1)\times \frac{1}{n-1}\sum_i (X_i-\bar X)^2 \sim \chi^2_{n-1}$$ so using R, I would have thought you would answer this question with

pchisq((5-1) * 10/15, df=5-1)
[1] 0.38494
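
For reference, the rescaling behind that call can be written out for a general population variance. This is only a sketch in my own notation: $S^2$ for the sample variance with divisor $n-1$, $\sigma^2$ for the population variance, and $c$ for the cutoff are symbols I am introducing here. If the $X_i$ are i.i.d. $\sim N(\mu,\sigma^2)$, then $$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1} \quad\Longrightarrow\quad P(S^2 < c) = P\left(\chi^2_{n-1} < \frac{(n-1)\,c}{\sigma^2}\right)$$ and with $n=5$, $\sigma^2=15$, $c=10$ this is $P\left(\chi^2_4 < \frac{4 \times 10}{15}\right) \approx 0.385$.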

A simulation seems to produce a close figure:

library(matrixStats)
set.seed(1)
cases     <- 1000000
n         <- 5
popvar    <- 15
critvalue <- 10
matdat <- matrix(rnorm(cases*n, mean=0, sd=sqrt(popvar)), ncol=n)
samplevars <- rowVars(matdat)
mean(samplevars < critvalue)
[1] 0.384494

But if the book's definition of sample variance is $\frac{1}{n}\sum_i (X_i-\bar X)^2$ rather than R's $\frac{1}{n-1}\sum_i (X_i-\bar X)^2$, then both figures do indeed come close to $0.50$:

pchisq(5 * 10/15, df=5-1)
[1] 0.4963317

samplevars_n <- rowVars(matdat) * (n-1)/n
mean(samplevars_n < critvalue)
[1] 0.495974
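
To put the three candidate figures side by side, here is a minimal sketch using only base R. The helper `prob_var_below` and its `divisor` argument are hypothetical names of my own, not anything from the book or from a package; it assumes a normal population, so that `divisor` times the sample variance divided by the population variance is chi-square with $n-1$ degrees of freedom.

```r
# Hypothetical helper (my own naming): probability that the sample variance
# of n i.i.d. normal observations falls below `crit`.
# `divisor` is the denominator used in the sample variance: n - 1 or n.
prob_var_below <- function(crit, popvar, n, divisor = n - 1) {
  # divisor * (sample variance) / popvar is chi-square with n - 1 df
  pchisq(divisor * crit / popvar, df = n - 1)
}

p_unbiased <- prob_var_below(10, 15, n = 5)               # divisor n - 1
p_biased   <- prob_var_below(10, 15, n = 5, divisor = 5)  # divisor n
p_unscaled <- pchisq(10 / 15, df = 4)                     # value entered in the question

round(c(unbiased = p_unbiased, biased = p_biased, unscaled = p_unscaled), 4)
```

The first two values should agree with the `pchisq` results above (about $0.385$ and $0.496$), while the unscaled input gives roughly $0.045$, which is consistent with the $0.05$ reported in the question.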