Sample Variance Distribution – Analyzing Variance from an Unknown Distribution

estimation, variance

I am sampling from a parameter with unknown distribution. I would like to calculate a 95% CI for the standard deviation of the sample.

@cardinal provides a nice general solution for calculating a CI in his [answer] to my previous question, Calculating required sample size, precision of variance estimate?. And @erik-p provides an estimate of the standard deviation of the variance of the sample.

However, to calculate the 95% CI for the sample variance, it seems that I need to know the distribution of the sample variance. Is it possible to calculate such an interval without knowing the distribution from which the sample was taken?

A related question is Reference for $\mathrm{Var}[s^2]=\sigma^4 \left(\frac{2}{n-1} + \frac{\kappa}{n}\right)$?

Best Answer

Without knowing the population distribution, you cannot know the exact distribution of the sample variance. However, the squared deviations from the mean all have the same distribution, they are averaged, and they are only weakly dependent. So for large $n$ the sample variance is approximately normally distributed with mean $\sigma^2$ and variance $\sigma^4 \left(\frac{2}{n-1} + \frac{\kappa}{n}\right)$, the formula quoted in the question, where $\kappa$ is the excess kurtosis of the population. You can therefore use the normal approximation to get an approximate 95% confidence interval, with $\sigma^2$ and $\kappa$ replaced by their sample estimates.

On the other hand, if $n$ is small and the CLT cannot be applied, you can generate bootstrap confidence intervals. But keep in mind that in my paper with LaBudde I showed that for highly skewed distributions such as the lognormal, the bootstrap intervals will severely undercover when the sample size is small. The bootstrap will work fine for symmetric distributions and for distributions with mild skew.
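To make the two approaches concrete, here is a minimal Python sketch, not a definitive implementation. It assumes NumPy is available, uses a simple plug-in moment estimate for the excess kurtosis, and uses the basic percentile bootstrap; the function names `variance_ci_normal` and `variance_ci_bootstrap` are made up for illustration, and other bootstrap variants (for example BCa) may behave differently.

```python
import numpy as np


def variance_ci_normal(x, z=1.96):
    """Approximate CI for the variance using the large-n normal approximation.

    Assumption: sigma^4 and the excess kurtosis kappa are replaced by
    plug-in sample estimates, which is only reasonable for large n.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    s2 = x.var(ddof=1)                      # unbiased sample variance
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    m4 = np.mean(d ** 4)
    kappa = m4 / m2 ** 2 - 3.0              # plug-in excess kurtosis
    # Var[s^2] ~= sigma^4 * (2/(n-1) + kappa/n), with sigma^4 -> s2^2
    se = s2 * np.sqrt(2.0 / (n - 1) + kappa / n)
    # A variance cannot be negative, so clip the lower limit at zero.
    return max(s2 - z * se, 0.0), s2 + z * se


def variance_ci_bootstrap(x, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the variance (one of several variants)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row of idx is one resample of the data, drawn with replacement.
    idx = rng.integers(0, x.size, size=(n_boot, x.size))
    boot_vars = x[idx].var(axis=1, ddof=1)
    lo, hi = np.quantile(boot_vars, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


if __name__ == "__main__":
    # Illustrative data only: 200 draws from a normal with sigma = 2.
    sample = np.random.default_rng(42).normal(0.0, 2.0, size=200)
    print("normal approx:", variance_ci_normal(sample))
    print("bootstrap:    ", variance_ci_bootstrap(sample))
```

Since the square root is monotone, taking the square root of both endpoints gives a corresponding interval for the standard deviation. For small, highly skewed samples, the undercoverage caveat above applies to the bootstrap version, and the plug-in kurtosis estimate in the normal version is also unreliable in that setting.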
