We can prove this for the more general case of $p$ variables by using the "hat matrix" and some of its useful properties. These results are usually much harder to state in non-matrix terms because of the use of the spectral decomposition.
In the matrix version of least squares, the hat matrix is $H=X(X^TX)^{-1}X^T$, where $X$ has $n$ rows and $p+1$ columns (a column of ones for $\beta_0$). Assume full column rank for convenience; otherwise you can replace $p+1$ by the column rank of $X$ in what follows. We can write the fitted values as $\hat{Y}_i=\sum_{j=1}^nH_{ij}Y_j$, or in matrix notation $\hat{Y}=HY$. Using this, we can write the scaled sum of squares as:
$$\frac{\sum_{i=1}^n(Y_i-\hat{Y}_i)^2}{\sigma^2}=\frac{(Y-\hat{Y})^T(Y-\hat{Y})}{\sigma^2}=\frac{(Y-HY)^T(Y-HY)}{\sigma^2}$$
$$=\frac{Y^T(I_n-H)Y}{\sigma^2}$$
where $I_n$ is the identity matrix of order $n$. The last step follows from the fact that $H$ is symmetric and idempotent, so that $(I_n-H)^T(I_n-H)=I_n-H$: $$H^2=[X(X^TX)^{-1}X^T][X(X^TX)^{-1}X^T]=X(X^TX)^{-1}X^T=H=HH^T=H^TH$$
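These hat-matrix properties are easy to check numerically. Here is a minimal sketch with numpy (the sizes $n=50$, $p=3$ and the random design matrix are purely illustrative):

```python
# Numerical check that H is symmetric, idempotent, and satisfies HX = X.
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3  # hypothetical sample size and number of predictors
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # intercept + p predictors

H = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix

print(np.allclose(H.T, H))    # symmetric
print(np.allclose(H @ H, H))  # idempotent: H^2 = H
print(np.allclose(H @ X, X))  # H fixes the column space of X
```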
Now a neat property of idempotent matrices is that all of their eigenvalues must be equal to zero or one. Letting $e$ denote a normalised eigenvector of $H$ with eigenvalue $l$, we can prove this as follows:
$$He=le\implies H(He)=H(le)$$
$$LHS=H^2e=He=le\;\;\; RHS=lHe=l^2e$$
$$\implies le=l^2e\implies l=0\text{ or }1$$
(Note that $e$ cannot be zero, as it must satisfy $e^Te=1$.) Because $H$ is idempotent, $I_n-H$ is idempotent too, since
$$(I_n-H)(I_n-H)=I_n-2H+H^2=I_n-2H+H=I_n-H$$
We also have the property that the sum of the eigenvalues of a matrix equals its trace, and, using the cyclic property of the trace,
$$tr(I_n-H)=tr(I_n)-tr(H)=n-tr(X(X^TX)^{-1}X^T)=n-tr((X^TX)^{-1}X^TX)$$
$$=n-tr(I_{p+1})=n-p-1$$
Hence $I-H$ must have $n-p-1$ eigenvalues equal to $1$ and $p+1$ eigenvalues equal to $0$.
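This eigenvalue count can also be verified directly; a small sketch (again with an arbitrary full-rank design):

```python
# Check that I - H has n-p-1 eigenvalues equal to 1 and p+1 equal to 0.
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 2  # hypothetical dimensions
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
H = X @ np.linalg.inv(X.T @ X) @ X.T

eigvals = np.linalg.eigvalsh(np.eye(n) - H)  # eigvalsh: I - H is symmetric
ones = np.isclose(eigvals, 1).sum()
zeros = np.isclose(eigvals, 0).sum()

print(ones, n - p - 1)   # ones should equal n - p - 1
print(zeros, p + 1)      # zeros should equal p + 1
print(np.isclose(np.trace(np.eye(n) - H), n - p - 1))  # trace = sum of eigenvalues
```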
Now we can use the spectral decomposition $I-H=ADA^T$, where $D=\begin{pmatrix}I_{n-p-1} & 0_{[n-p-1]\times[p+1]}\\0_{[p+1]\times [n-p-1]} & 0_{[p+1]\times [p+1]}\end{pmatrix}$ and $A$ is orthogonal (because $I-H$ is symmetric). A further useful property is that $HX=X$, which helps narrow down the $A$ matrix:
$$HX=X\implies(I-H)X=0\implies ADA^TX=0\implies DA^TX=0,$$
where the last implication multiplies on the left by $A^T$ and uses $A^TA=I_n$. Hence
$$(A^TX)_{ij}=0\;\;\;\text{for }i=1,\dots,n-p-1,\;\; j=1,\dots,p+1,$$
and we get:
$$\frac{\sum_{i=1}^n(Y_i-\hat{Y}_i)^2}{\sigma^2}=\frac{Y^TADA^TY}{\sigma^2}=\frac{\sum_{i=1}^{n-p-1}(A^TY)_i^2}{\sigma^2}$$
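The vanishing block of $A^TX$ can be checked numerically as well. A sketch with numpy; note that `eigh` sorts eigenvalues in ascending order, so the eigenvalue-one eigenvectors of $I-H$ appear last rather than first:

```python
# Check that the eigenvalue-1 eigenvectors of I - H are orthogonal to the columns of X.
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 2  # hypothetical dimensions
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
H = X @ np.linalg.inv(X.T @ X) @ X.T

vals, A = np.linalg.eigh(np.eye(n) - H)   # spectral decomposition I - H = A diag(vals) A^T
A1 = A[:, np.isclose(vals, 1)]            # eigenvectors with eigenvalue 1

print(A1.shape[1], n - p - 1)             # n - p - 1 such eigenvectors
print(np.allclose(A1.T @ X, 0))           # the corresponding block of A^T X vanishes
```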
Now, under the model we have $Y\sim N(X\beta,\sigma^2I)$, and by standard normal theory $A^TY\sim N(A^TX\beta,\sigma^2A^TA)= N(A^TX\beta,\sigma^2I)$, showing that the components of $A^TY$ are independent. Using the result above that the first $n-p-1$ rows of $A^TX$ vanish, we have $(A^TY)_i\sim N(0,\sigma^2)$ for $i=1,\dots,n-p-1$. The chi-squared distribution with $n-p-1$ degrees of freedom for the scaled sum of squared errors follows immediately.
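The conclusion can be sanity-checked by simulation: the residual sum of squares divided by $\sigma^2$ should have mean $n-p-1$ and variance $2(n-p-1)$, matching a $\chi^2_{n-p-1}$ distribution. A Monte Carlo sketch (all sizes and the true $\beta$ are made up for illustration):

```python
# Monte Carlo check: RSS / sigma^2 ~ chi-squared with n - p - 1 degrees of freedom.
import numpy as np

rng = np.random.default_rng(3)
n, p, sigma = 25, 2, 1.5  # hypothetical sample size, predictors, and error sd
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
H = X @ np.linalg.inv(X.T @ X) @ X.T
beta = rng.normal(size=p + 1)  # arbitrary true coefficients

reps = 20000
# One simulated response vector per column
Y = (X @ beta)[:, None] + sigma * rng.normal(size=(n, reps))
resid = Y - H @ Y
scaled = (resid**2).sum(axis=0) / sigma**2  # RSS / sigma^2, one value per replication

df = n - p - 1
print(scaled.mean(), df)      # sample mean should be close to df
print(scaled.var(), 2 * df)   # sample variance should be close to 2*df
```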
As Glen_b noted in the comments, if the variances are all the same you end up with a scaled noncentral chi-squared.
If not, there is a concept of a generalized chi-squared distribution, i.e. $x^T A x$ for $x \sim N(\mu, \Sigma)$ and $A$ fixed. In this case, you have the special case of diagonal $\Sigma$ ($\Sigma_{ii} = \sigma_i^2$), and $A = I$.
There has been some work on computing with this distribution. You can also write it as a linear combination of independent noncentral chi-squared variables, $Y = \sum_{i=1}^n \sigma_i^2 \left( \frac{X_i^2}{\sigma_i^2} \right)$, in which case Bausch (2013) gives a more computationally efficient algorithm for the linear combination of central chi-squareds; his work might be extensible to noncentral chi-squareds, and you might find some interesting pointers in the related-work section.
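The moments of this linear combination are easy to verify by simulation: with zero means, $E[Y]=\sum_i\sigma_i^2$ and $\mathrm{Var}[Y]=2\sum_i\sigma_i^4$. A sketch (the particular $\sigma_i$ values are made up):

```python
# Monte Carlo check of the moments of a weighted sum of chi^2_1 variables.
import numpy as np

rng = np.random.default_rng(4)
sigmas = np.array([0.5, 1.0, 2.0])  # hypothetical unequal standard deviations

reps = 200000
X = sigmas * rng.normal(size=(reps, sigmas.size))  # X_i ~ N(0, sigma_i^2), per column
Y = (X**2).sum(axis=1)  # Y = sum_i sigma_i^2 * (X_i/sigma_i)^2

print(Y.mean(), (sigmas**2).sum())      # E[Y] = sum sigma_i^2
print(Y.var(), 2 * (sigmas**4).sum())   # Var[Y] = 2 * sum sigma_i^4
```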
Best Answer
$X_i \sim \mathcal{N}(0,\sigma^2_i) \Rightarrow \frac{X_i}{\sigma_i}\sim \mathcal{N}(0,1) $
$\therefore$ $\frac{X_i^2}{\sigma_i^2} \sim \chi^2(1)=\Gamma(1/2,2)$
$X_i^2 \sim \sigma_i^2\Gamma(1/2,2)=\Gamma(1/2,2\sigma_i^2)$
If the $\sigma_i$s are all equal, say $\sigma_i=\sigma$ for every $i$, then, since independent gamma variables with a common scale add their shape parameters,
$$\sum_{i=1}^n X_i^2 \sim \sum_{i=1}^n\Gamma(1/2, 2\sigma^2)=\Gamma(n/2,2\sigma^2),$$
i.e. $\sum_{i=1}^n X_i^2$ has a gamma distribution with shape $k=n/2$ and scale $\theta=2\sigma^2$.
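A quick numerical check of this gamma claim: the mean and variance of $\Gamma(k,\theta)$ are $k\theta$ and $k\theta^2$, which for $k=n/2$, $\theta=2\sigma^2$ give $n\sigma^2$ and $2n\sigma^4$. A sketch with arbitrary values of $n$ and $\sigma$:

```python
# Check moments of sum of X_i^2 against the Gamma(n/2, 2*sigma^2) distribution.
import numpy as np

rng = np.random.default_rng(5)
n, sigma = 6, 1.3               # hypothetical values
k, theta = n / 2, 2 * sigma**2  # gamma shape and scale from the derivation above

reps = 200000
S = ((sigma * rng.normal(size=(reps, n)))**2).sum(axis=1)  # sum of X_i^2 per replication

print(S.mean(), k * theta)       # gamma mean k*theta = n*sigma^2
print(S.var(), k * theta**2)     # gamma variance k*theta^2 = 2*n*sigma^4
```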
If the $\sigma_i$s are not all equal, the sum is no longer gamma distributed; see the discussion of the generalized chi-squared distribution above.