[Math] Can we compute confidence intervals for the variance of an unknown distribution from sample variances

estimation-theory, statistics

Assume $X_1,\ldots,X_n$ are i.i.d. with unknown distribution $\mathcal D$ – we only know it is not normal and has finite variance.

Is there a way to give confidence intervals for the variance of $\mathcal D$?
Can we base the confidence interval on the sample variance $\hat \sigma^2 = \frac1{n-1}\sum_{i=1}^n (X_i - \bar X)^2$ with $\bar X = \frac1n\sum_{i=1}^n X_i$ the sample mean?

If it helps, we can restrict ourselves to distributions with mean $0$. (In my application, I can compute the exact mean $\mu$ of $\mathcal D$ and then consider $Y_i := X_i - \mu$ to estimate the variance.)

I know the approaches via the $\chi^2$-distribution if we assume a normal distribution $\mathcal D = \mathcal N(\mu,\sigma^2)$, but as noted above, $\mathcal D$ is not normal.

Best Answer

In practical terms, if the distribution is unknown and one has a lot of data, one can use the fact that the sample variance is asymptotically Gaussian (a consequence of the central limit theorem; see here). A confidence interval can then be computed from this normal approximation; its standard error is approximately $\sqrt{(\mu_4 - \sigma^4)/n}$, where $\mu_4$ is the fourth central moment, and both moments can themselves be estimated from the sample.
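As a concrete illustration, here is a minimal sketch of that normal-approximation interval in Python (the function name `variance_ci_clt` and the exponential test data are illustrative choices, not part of the question):

```python
import numpy as np
from scipy import stats

def variance_ci_clt(x, alpha=0.05):
    """Normal-approximation CI for the variance of an unknown distribution.

    Uses the asymptotic result Var(s^2) ~ (mu4 - sigma^4)/n, with the
    second and fourth central moments estimated from the sample.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    s2 = x.var(ddof=1)              # point estimate of the variance
    m2 = np.mean(d**2)              # biased second central moment
    mu4 = np.mean(d**4)             # estimated fourth central moment
    se = np.sqrt((mu4 - m2**2) / n) # asymptotic standard error of s^2
    z = stats.norm.ppf(1 - alpha / 2)
    return s2 - z * se, s2 + z * se

# Example with a clearly non-normal (exponential) sample; true variance = 4.
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=5_000)
print(variance_ci_clt(x))
```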

One can also bootstrap the sample variance to approximate its sampling distribution and use that to estimate the confidence interval. This is (asymptotically) quite accurate.
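A sketch of a percentile bootstrap interval, again with illustrative names and test data (the percentile method is only one of several bootstrap constructions; newer SciPy versions also provide `scipy.stats.bootstrap`, which implements more refined intervals such as BCa):

```python
import numpy as np

def variance_ci_bootstrap(x, alpha=0.05, n_boot=2_000, seed=0):
    """Percentile bootstrap CI for the variance (a sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    boot_vars = np.empty(n_boot)
    # Resample with replacement and recompute the sample variance each time.
    for b in range(n_boot):
        resample = rng.choice(x, size=n, replace=True)
        boot_vars[b] = resample.var(ddof=1)
    lo, hi = np.percentile(boot_vars, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Same illustrative non-normal sample as above; true variance = 4.
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=5_000)
print(variance_ci_bootstrap(x))
```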

Another method might be to use a Bayesian approach (see: A Bayesian perspective on estimating mean, variance, and standard-deviation from data by Oliphant). (This method is built into scipy already :].) Essentially, it shows that, under an "ignorant" (non-informative) prior, the posterior distribution of the variance is an inverted gamma distribution, from which (credible) intervals can be constructed.
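For reference, a minimal usage sketch of `scipy.stats.bayes_mvs`, which is based on the Oliphant paper (the exponential data are just an illustrative example; the routine's derivation rests on the paper's modelling assumptions, so treat the result as an approximation for a general $\mathcal D$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=5_000)   # true variance = 4

# bayes_mvs returns estimates for the mean, variance, and standard
# deviation, each with a credible interval at the requested level.
mean_res, var_res, std_res = stats.bayes_mvs(x, alpha=0.95)
print(var_res.statistic)   # point estimate of the variance
print(var_res.minmax)      # 95% credible interval for the variance
```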

See also this question about the distribution of the sample variance and this one, which is also related.
