Suppose that we have two random variables $X \sim \chi_k^2$ and $Y \sim \chi_k^2$, with the same degrees of freedom.
A chi-squared distribution cannot have zero degrees of freedom, so what would be the distribution of $X - Y$?
chi-squared-distribution, distributions
I'm going to motivate this intuitively, and indicate how it comes about for the special case of two groups, assuming you're happy to accept the normal approximation to the binomial.
Hopefully that will be enough for you to get a good sense of why it works the way it does.
You're talking about the chi-square goodness of fit test. Let's say there are $k$ groups (you have it as $n$, but there's a reason I tend to prefer to call it $k$).
In the model being applied for this situation, the counts $O_i$, $i=1,2,...,k$ are multinomial.
Let $N=\sum_{i=1}^k O_i$. The counts are conditioned on the sum $N$ (except in some fairly rare situations); and there is some prespecified set of probabilities for each category, $p_i, i=1, 2, \ldots,k$, which sum to $1$.
Just as with the binomial, there's an asymptotic normal approximation for multinomials -- indeed, if you consider only the count in a given cell ("in this category" or not), it would then be binomial. Just as with the binomial, the variances of the counts (as well as their covariances in the multinomial) are functions of $N$ and the $p$'s; you don't estimate a variance separately.
That is, if the expected counts are sufficiently large, the vector of counts is approximately normal with mean $E_i=Np_i$. However, because the counts are conditioned on $N$, the distribution is degenerate (it exists in a hyperplane of dimension $k-1$, since specifying $k-1$ of the counts fixes the remaining one). The variance-covariance matrix has diagonal entries $Np_i(1-p_i)$ and off diagonal elements $-Np_ip_j$, and it is of rank $k-1$ because of the degeneracy.
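That covariance structure (and its rank deficiency) is easy to check by simulation. The sketch below, with an arbitrary illustrative $N$ and set of $p_i$'s, compares the empirical covariance of simulated multinomial count vectors to $N(\operatorname{diag}(p) - pp^\top)$:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
p = np.array([0.2, 0.3, 0.5])  # illustrative category probabilities

# Draw many multinomial count vectors, all conditioned on the same N
counts = rng.multinomial(N, p, size=200_000)

# Empirical covariance of the count vector
emp_cov = np.cov(counts, rowvar=False)

# Theoretical covariance: N p_i (1 - p_i) on the diagonal, -N p_i p_j off it
theo_cov = N * (np.diag(p) - np.outer(p, p))

print(np.round(emp_cov, 1))
print(theo_cov)
print(np.linalg.matrix_rank(theo_cov))  # k - 1 = 2: the degeneracy
```

The rank printout reflects exactly the degeneracy described above: the rows of the theoretical covariance matrix sum to zero because the counts sum to $N$.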
As a result, for an individual cell $\text{Var}(O_i)=Np_i(1-p_i)$, and you could write $z_i = \frac{O_i-E_i}{\sqrt{E_i(1-p_i)}}$. However, the terms are dependent (negatively correlated), so if you sum the squares of those $z_i$ it won't have a $\chi^2_k$ distribution (as it would if they were independent standardized variables). Instead we could construct a set of $k-1$ variables from the original $k$ which are independent and still approximately normal (asymptotically normal). If we summed their (standardized) squares, we'd get a $\chi^2_{k-1}$. There are ways to construct such a set of $k-1$ variables explicitly, but fortunately there's a very neat shortcut that avoids what amounts to a substantial amount of effort, and yields the same result (the same value of the statistic) as if we had gone to the trouble.
Consider, for simplicity, a goodness of fit with two categories (which is now binomial). The probability of being in the first cell is $p_1=p$, and in the second cell is $p_2=1-p$. There are $X = O_1$ observations in the first cell, and $N-X=O_2$ in the second cell.
The observed first cell count, $X$, is asymptotically $\text{N}(Np,Np(1-p))$. We can standardize it as $z=\frac{X-Np}{\sqrt{Np(1-p)}}$. Then $z^2 = \frac{(X-Np)^2}{Np(1-p)}$ is asymptotically $\sim \chi^2_1$.
Notice that
$\sum_{i=1}^2 \frac{(O_i-E_i)^2}{E_i} = \frac{[X-Np]^2}{Np}+ \frac{[(N-X)-(N-Np)]^2}{N(1-p)}= \frac{[X-Np]^2}{Np}+ \frac{[X-Np]^2}{N(1-p)}=(X-Np)^2[\frac{1}{Np}+ \frac{1}{N(1-p)}]$.
But
$\frac{1}{Np}+ \frac{1}{N(1-p)} =\frac{Np+N(1-p)}{Np\cdot N(1-p)} = \frac{N}{N^2p(1-p)} = \frac{1}{Np(1-p)}$.
So $\sum_{i=1}^2 \frac{(O_i-E_i)^2}{E_i} =\frac{(X-Np)^2}{Np(1-p)}$, which is the $z^2$ we started with, and which asymptotically will be a $\chi^2_1$ random variable. The dependence between the two cells is such that by dividing by $E_i$ instead of $E_i(1-p_i)$ we exactly compensate for the dependence between the two, and recover the original square of an approximately normal random variable.
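Note that this identity is exact algebra, not merely asymptotic. A small NumPy check (with illustrative $N$ and $p$) confirms the two expressions agree for every simulated count:

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 500, 0.3  # illustrative sample size and cell probability
X = rng.binomial(N, p, size=100)

O1, O2 = X, N - X            # observed counts in the two cells
E1, E2 = N * p, N * (1 - p)  # expected counts

# Pearson statistic summed over both cells...
pearson = (O1 - E1) ** 2 / E1 + (O2 - E2) ** 2 / E2
# ...equals the single squared z-score from the binomial cell
z2 = (X - N * p) ** 2 / (N * p * (1 - p))

print(np.allclose(pearson, z2))  # True
```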
The same kind of sum-dependence is taken care of by the same approach when there are more than two categories -- by summing the $\frac{(O_i-E_i)^2}{E_i}$ instead of $\frac{(O_i-E_i)^2}{E_i(1-p_i)}$ over all $k$ terms, you exactly compensate for the effect of the dependence, and obtain a sum equivalent to a sum of $k-1$ independent normals.
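To see the $k-1$ degrees of freedom emerge for $k>2$, one can simulate the Pearson statistic under the null and check its first two moments against those of a $\chi^2_{k-1}$, which has mean $k-1$ and variance $2(k-1)$. A NumPy sketch with an arbitrary illustrative $k$ and $N$:

```python
import numpy as np

rng = np.random.default_rng(2)
k, N = 5, 2000         # illustrative number of cells and total count
p = np.full(k, 1 / k)  # equal cell probabilities under the null

counts = rng.multinomial(N, p, size=50_000)
E = N * p
stat = ((counts - E) ** 2 / E).sum(axis=1)

# A chi^2_{k-1} variable has mean k - 1 and variance 2(k - 1)
print(stat.mean())  # should be close to k - 1 = 4
print(stat.var())   # should be close to 2(k - 1) = 8
```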
There are a variety of ways to show that the statistic has a distribution that is asymptotically $\chi^2_{k-1}$ for larger $k$ (it's covered in some undergraduate statistics courses, and can be found in a number of undergraduate-level texts), but I don't want to lead you too far beyond the level your question suggests. Indeed, derivations are easy to find in notes on the internet; for example, there are two different derivations in the space of about two pages here.
The $\chi^2_n$ distribution is defined as the distribution that results from summing the squares of $n$ independent random variables $\mathcal{N}(0,1)$, so: $$\text{If }X_1,\ldots,X_n\sim\mathcal{N}(0,1)\text{ and are independent, then }Y_1=\sum_{i=1}^nX_i^2\sim \chi^2_n,$$ where $X\sim Y$ denotes that the random variables $X$ and $Y$ have the same distribution (EDIT: $\chi_n^2$ will denote both a Chi squared distribution with $n$ degrees of freedom and a random variable with such distribution). Now, the pdf of the $\chi^2_n$ distribution is $$ f_{\chi^2}(x;n)=\frac{1}{2^\frac{n}{2}\Gamma\left(\frac{n}{2}\right)}x^{\frac{n}{2}-1}e^{-\frac{x}{2}},\quad \text{for } x\geq0\text{ (and $0$ otherwise).} $$ So, indeed the $\chi^2_n$ distribution is a particular case of the $\Gamma(p,a)$ distribution with pdf $$ f_\Gamma(x;a,p)=\frac{1}{a^p\Gamma(p)}x^{p-1}e^{-\frac{x}{a}},\quad \text{for } x\geq0\text{ (and $0$ otherwise).} $$ Now it is clear that $\chi_n^2\sim\Gamma\left(\frac{n}{2},2\right)$.
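This identity can be checked numerically. In SciPy's parameterization of the Gamma, the shape argument `a` plays the role of $p = n/2$ and `scale` plays the role of $a = 2$ in the notation above (a sketch with an arbitrary $n$):

```python
import numpy as np
from scipy import stats

n = 7  # illustrative degrees of freedom
x = np.linspace(0.01, 30, 200)

# chi^2_n pdf vs Gamma pdf with shape n/2 and scale 2
chi2_pdf = stats.chi2.pdf(x, df=n)
gamma_pdf = stats.gamma.pdf(x, a=n / 2, scale=2)

print(np.allclose(chi2_pdf, gamma_pdf))  # True
```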
The difference in your case is that you have normal variables $X_i$ with common variances $\sigma^2\neq1$. But a similar distribution arises in that case: $$Y_2=\sum_{i=1}^nX_i^2=\sigma^2\sum_{i=1}^n\left(\frac{X_i}{\sigma}\right)^2\sim\sigma^2\chi_n^2,$$ so $Y$ follows the distribution resulting from multiplying a $\chi_n^2$ random variable with $\sigma^2$. This is easily obtained with a transformation of random variables ($Y_2=\sigma^2Y_1$): $$ f_{\sigma^2\chi^2}(x;n)=f_{\chi^2}\left(\frac{x}{\sigma^2};n\right)\frac{1}{\sigma^2}. $$ Note that this is the same as saying that $Y_2\sim\Gamma\left(\frac{n}{2},2\sigma^2\right)$ since $\sigma^2$ can be absorbed by the Gamma's $a$ parameter.
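The transformation formula above can likewise be verified pointwise: $f_{\chi^2}(x/\sigma^2;n)/\sigma^2$ coincides with the $\Gamma\left(\frac{n}{2},2\sigma^2\right)$ density. A SciPy sketch with illustrative $n$ and $\sigma^2$:

```python
import numpy as np
from scipy import stats

n, sigma2 = 4, 2.25  # illustrative df and variance sigma^2
x = np.linspace(0.01, 40, 300)

# Transformed pdf from the change of variables: f_{chi^2}(x / sigma^2; n) / sigma^2
scaled_pdf = stats.chi2.pdf(x / sigma2, df=n) / sigma2

# Gamma(n/2, 2 sigma^2): the scale parameter absorbs sigma^2
gamma_pdf = stats.gamma.pdf(x, a=n / 2, scale=2 * sigma2)

print(np.allclose(scaled_pdf, gamma_pdf))  # True
```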
If you want to derive the pdf of the $\chi^2_n$ from scratch (which also applies to the situation with $\sigma^2\neq1$ under minor changes), you can follow the first step here for the $\chi_1^2$ using standard transformation for random variables. Then, you may either follow the next steps or shorten the proof relying in the convolution properties of the Gamma distribution and its relationship with the $\chi^2_n$ described above.
Best Answer
This will not have a chi-squared distribution: a chi-squared variable is nonnegative, whereas $X-Y$ has support on $(-\infty, +\infty)$.
If the two variables are independent, it has mean 0 and variance $4k$.
If $k$ is large, its density is well approximated by that of a normal variable with mean $0$ and variance $4k$. In the general case, its MGF has a closed form (for $|t| < 1/2$): $$E(\exp(t(X-Y))) = (1-4t^2)^{-k/2}$$
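A quick simulation (a NumPy sketch with an illustrative $k$, assuming independence) is consistent with these facts: mean $0$, variance $4k$, and the stated MGF at a point with $|t| < 1/2$:

```python
import numpy as np

rng = np.random.default_rng(4)
k = 50  # illustrative degrees of freedom
X = rng.chisquare(k, 200_000)
Y = rng.chisquare(k, 200_000)  # independent of X
D = X - Y

print(D.mean())  # close to 0
print(D.var())   # close to 4k = 200

# MGF check at a point with |t| < 1/2
t = 0.05
print(np.exp(t * D).mean())          # Monte Carlo estimate
print((1 - 4 * t ** 2) ** (-k / 2))  # closed form, about 1.286
```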
If the two variables are dependent, the nature of that dependence needs to be made explicit.