Because the chi-squared distribution is skewed, the sample variance is not generally at the center of a 95% CI for the variance (for normal data).
You are correct to say that you can often get a narrower interval by taking something like probability 2% from one tail and 3% from the other, than by taking 2.5% from each tail.
For practical purposes, the narrowest 95% interval may put almost all of the 5% probability in one tail, making it nearly a one-sided interval. This may or may not be useful.
Thus, it has become more or less standard to use probability-symmetric intervals in general practice. If you report an interval that is not probability-symmetric, it is good practice to say so and to explain why.
Example: Consider a normal sample of size $n=20$ from a population with variance $\sigma^2 = 25.$
set.seed(2022)
x = rnorm(20, 50, 5)
v = var(x); v
[1] 25.01484
Seven 2-sided 95% CIs for $\sigma^2$ and their widths:
CI.1 = 19*v/qchisq(c(.97, .02), 19)
CI.1; diff(CI.1)
[1] 14.77971 55.47799
[1] 40.69828
CI.2 = 19*v/qchisq(c(.975, .025), 19)
CI.2; diff(CI.2)
[1] 14.46722 53.36339
[1] 38.89617 # probability-symmetric
CI.3 = 19*v/qchisq(c(.98, .03), 19)
CI.3; diff(CI.3)
[1] 14.10859 51.65860
[1] 37.55002
CI.4 = 19*v/qchisq(c(.99, .04), 19)
CI.4; diff(CI.4)
[1] 13.13265 49.00681
[1] 35.87417
CI.5 = 19*v/qchisq(c(.995, .045), 19)
CI.5; diff(CI.5)
[1] 12.31867 47.93333
[1] 35.61466 # shortest on this list
CI.6 = 19*v/qchisq(c(.999, .049), 19)
CI.6; diff(CI.6)
[1] 10.84618 47.16119
[1] 36.31501 # longer than above
CI.7 = 19*v/qchisq(c(.99999, .04999), 19)
CI.7; diff(CI.7)
[1] 8.284141 46.980289
[1] 38.69615 # 'almost' one sided
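The seven splits above only sample the tail allocation coarsely. As a sketch (reusing the same simulated sample), a one-dimensional numerical search over the lower-tail probability locates the width-minimizing split:

```r
# Re-create the sample from above.
set.seed(2022)
x <- rnorm(20, 50, 5)
v <- var(x)

# Width of the 95% CI that puts probability a in the lower tail of
# chi-squared(19) and 0.05 - a in the upper tail.
width <- function(a) diff(19 * v / qchisq(c(0.95 + a, a), 19))

opt <- optimize(width, interval = c(1e-6, 0.05 - 1e-6))
opt$minimum    # optimal lower-tail probability (between 0.04 and 0.05)
opt$objective  # minimal width (no larger than CI.5's 35.61466)
```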
Note: The relevant one-sided 95% CI would give the upper bound $46.97848.$ Depending on the application, that might be exactly what you want.
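For reference, that one-sided upper bound can be reproduced directly from the same simulated sample by putting all 5% of the probability in the lower tail:

```r
# Re-create the sample from above.
set.seed(2022)
x <- rnorm(20, 50, 5)
v <- var(x)

# One-sided 95% upper confidence bound for sigma^2.
ub <- 19 * v / qchisq(0.05, 19)
ub
# [1] 46.97848
```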
As pointed out in the comments, the answer is obtained using basic probability.
We have that $\mathbb{E}\left[ X_i^2 \right] = \mathrm{Var}(X_i) + \mathbb{E}\left[X_i \right]^2$. Also, we have that, for independent $Y_i$, $\mathbb{E}\left[ \sum_i Y_i \right] = \sum_i \mathbb{E} \left[ Y_i \right]$. Putting these two facts together, for $X \sim \mathcal{N}\left(\mu, \Lambda \right)$ where $\Lambda$ is a diagonal covariance matrix with elements $\Lambda_{ii} = \sigma_i^2$:
$$\mathbb{E}\left[ ||X||^2 \right] = \mathbb{E}\left[ \sum_{i=1}^n X_i^2 \right] = \sum_{i=1}^n (\sigma_i^2 + \mu_i^2).$$
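A quick Monte Carlo check of this identity, using illustrative (assumed) values of $\mu$ and $\Lambda$:

```r
set.seed(1)
mu    <- c(1, -2, 3)   # assumed example mean vector
sigma <- c(0.5, 1, 2)  # assumed example standard deviations

# Simulate ||X||^2 for X ~ N(mu, diag(sigma^2)).
sims <- replicate(1e5, sum(rnorm(3, mu, sigma)^2))

mean(sims)           # simulated E[||X||^2]
sum(sigma^2 + mu^2)  # theoretical value: 19.25
```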
Best Answer
You can find a general result about the quadratic form of a normal random vector in this related question. In the case where the normal random variables are independent with common mean but different variances, the quadratic form will be a weighted sum of chi-squared-one random variables:
For the random variables specified in your question, you can group them into the normal random vector:
$$\mathbf{X} \sim \text{N}(\mu \mathbf{1}, \boldsymbol{\Sigma}) \quad \quad \quad \quad \quad \boldsymbol{\Sigma} = \text{diag}(\sigma_1^2,...,\sigma_n^2).$$
Using the centering matrix $\mathbf{C}$ we have $\mathbf{X} - \bar{\mathbf{X}} = \mathbf{C} \mathbf{X}$. Moreover, it is simple to show that the matrix $\mathbf{C}$ is symmetric and idempotent, so that $\mathbf{C}^\text{T} \mathbf{C} = \mathbf{C}$, and that $\mathbf{C} \mathbf{1} = \mathbf{0}$, which lets us subtract the mean vector $\mu \mathbf{1}$ inside the quadratic form. (For detailed information on the centering matrix, see e.g., O'Neill 2020, esp. sections 3-4.) For this analysis I'm also going to use the standard normal random vector $\mathbf{Z} \sim \text{N}(\mathbf{0}, \mathbf{I})$. Using these objects, we can write the quadratic form of interest as:
$$\begin{align} \sum_{i=1}^n (X_i - \bar{X})^2 &= (\mathbf{X} - \bar{\mathbf{X}})^\text{T} (\mathbf{X} - \bar{\mathbf{X}}) \\[6pt] &= (\mathbf{C} \mathbf{X})^\text{T} (\mathbf{C} \mathbf{X}) \\[6pt] &= \mathbf{X}^\text{T} \mathbf{C}^\text{T} \mathbf{C} \mathbf{X} \\[6pt] &= \mathbf{X}^\text{T} \mathbf{C} \mathbf{X} \\[6pt] &= (\mathbf{X} - \mu \mathbf{1})^\text{T} \mathbf{C} (\mathbf{X} - \mu \mathbf{1}) \\[6pt] &= (\boldsymbol{\Sigma}^{1/2} \mathbf{Z})^\text{T} \mathbf{C} (\boldsymbol{\Sigma}^{1/2} \mathbf{Z}) \\[6pt] &= \mathbf{Z}^\text{T} \boldsymbol{\Sigma}^{1/2} \mathbf{C} \boldsymbol{\Sigma}^{1/2} \mathbf{Z} \\[6pt] &\sim \sum_{i=1}^n \lambda_i \cdot \chi_1^2, \end{align}$$
where $\lambda_1,...,\lambda_n$ are the eigenvalues of the matrix $\boldsymbol{\Sigma}^{1/2} \mathbf{C} \boldsymbol{\Sigma}^{1/2}$. This latter matrix is given by:
$$\begin{align} \boldsymbol{\Sigma}^{1/2} \mathbf{C} \boldsymbol{\Sigma}^{1/2} &= \text{diag}(\sigma_1,...,\sigma_n) \ \mathbf{C} \ \text{diag}(\sigma_1,...,\sigma_n) \\[6pt] &= \begin{bmatrix} \tfrac{n-1}{n} \sigma_1^2 & -\tfrac{1}{n} \sigma_1 \sigma_2 & \cdots & -\tfrac{1}{n} \sigma_1 \sigma_n \\ -\tfrac{1}{n} \sigma_1 \sigma_2 & \tfrac{n-1}{n} \sigma_2^2 & \cdots & -\tfrac{1}{n} \sigma_2 \sigma_n \\ \vdots & \vdots & \ddots & \vdots \\ -\tfrac{1}{n} \sigma_1 \sigma_n & -\tfrac{1}{n} \sigma_2 \sigma_n & \cdots & \tfrac{n-1}{n} \sigma_n^2 \\ \end{bmatrix}. \\[6pt] \end{align}$$
You can take the above matrix form and use it to compute the eigenvalues $\lambda_1,...,\lambda_n$ for your standard deviation values $\sigma_1,...,\sigma_n$. This then gives you the weightings for the weighted sum of chi-squared-one random variables.
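As a sketch with assumed standard deviations, the weights can be computed in R via the eigendecomposition, and the implied mean $\sum_i \lambda_i$ checked by simulation:

```r
set.seed(1)
sigma <- c(1, 2, 3, 4)  # assumed example standard deviations
n <- length(sigma)

C <- diag(n) - matrix(1 / n, n, n)       # centering matrix
M <- diag(sigma) %*% C %*% diag(sigma)   # Sigma^{1/2} C Sigma^{1/2}
lambda <- eigen(M, symmetric = TRUE)$values
lambda          # one eigenvalue is (numerically) 0, since C has rank n - 1

# Each chi^2_1 term has mean 1, so E[sum (X_i - Xbar)^2] = sum(lambda),
# which equals the trace (n-1)/n * sum(sigma^2) = 22.5 here.
sum(lambda)
sims <- replicate(1e4, { x <- rnorm(n, 5, sigma); sum((x - mean(x))^2) })
mean(sims)      # close to sum(lambda)
```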