Explain why a chi-square random variable having n degrees of freedom will approximately have the distribution of a normal random variable when n is large.
probability, probability-distributions, statistics
Related Solutions
As Robert Israel has pointed out, the sum of squares of $n$ independent random variables with a standard normal distribution has a chi-square distribution with $n$ degrees of freedom.
If instead you take them from a normal distribution whose expectation is $\mu$ and whose standard deviation is $\sigma$, then $$ \left(\frac{X_1-\mu}{\sigma}\right)^2 + \cdots + \left(\frac{X_n-\mu}{\sigma}\right)^2 $$ has a chi-square distribution with $n$ degrees of freedom.
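A quick way to see this numerically is a Monte Carlo check. Here is a minimal sketch, assuming NumPy and SciPy are available; the values of $n$, $\mu$, $\sigma$, and the seed are arbitrary illustrative choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 5, 100_000        # degrees of freedom, number of Monte Carlo replicates
mu, sigma = 3.0, 2.0        # illustrative (hypothetical) population parameters

# Draw X_1, ..., X_n from N(mu, sigma^2), standardize, square, and sum.
X = rng.normal(mu, sigma, size=(reps, n))
S = (((X - mu) / sigma) ** 2).sum(axis=1)

# Kolmogorov-Smirnov test against chi-square(n): a large p-value is
# consistent with S having the chi-square distribution with n df.
print(stats.kstest(S, stats.chi2(df=n).cdf))
```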
So why might it appear that one of them is not counted? The answer to that comes from such results as this: Suppose instead of the population mean $\mu$, you subtract the sample mean $\overline X$. Then you have $$ \left(\frac{X_1-\overline X}{\sigma}\right)^2 + \cdots + \left(\frac{X_n-\overline X}{\sigma}\right)^2,\tag{1} $$ and this has a chi-square distribution with $n-1$ degrees of freedom. In particular, if $n=1$, then the sample mean is just the same as $X_1$, so the numerator in the first term is $X_1-X_1$, and the sum is necessarily $0$, so you have a chi-square distribution with $0$ degrees of freedom.
Notice that in $(1)$, you have $n$ terms in the sum, not $n-1$, and they're not independent (since if you take away the exponents, you get $n$ terms that necessarily always add up to $0$), and the standard deviation of the fraction that gets squared is not actually $1$, but less than $1$. So why does it have the same probability distribution as if there were $n-1$ of them, and they were independent, and those standard deviations were each $1$? The simplest way to answer that may be this: $$ \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix} = \begin{bmatrix} \overline X \\ \vdots \\ \overline X \end{bmatrix} + \begin{bmatrix} X_1 - \overline X \\ \vdots \\ X_n - \overline X \end{bmatrix} $$ This is the decomposition of a vector into two mutually orthogonal components: one in a $1$-dimensional space and the other in an $(n-1)$-dimensional space. Now think about the spherical symmetry of the joint probability distribution, and about the fact that the second projection maps the expected value of the random vector to $0$.
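The degrees-of-freedom drop can be checked by the same kind of simulation; here is a sketch under the same assumptions (NumPy/SciPy, arbitrary illustrative parameters and seed), replacing $\mu$ by the sample mean $\overline X$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, reps = 5, 100_000
mu, sigma = 3.0, 2.0

X = rng.normal(mu, sigma, size=(reps, n))
Xbar = X.mean(axis=1, keepdims=True)          # sample mean of each replicate

# n terms in the sum, but the distribution is chi-square with n - 1 df.
S = (((X - Xbar) / sigma) ** 2).sum(axis=1)

print(stats.kstest(S, stats.chi2(df=n - 1).cdf))
```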
Later edit: Sometimes it might seem as if two of them are not counted. Suppose $X_1,\ldots,X_n$ are independent and each $X_i$ is a normally distributed random variable with expected value $\alpha+\beta w_i$ and variance $\sigma^2$. When the $w_i$ are observable and $\alpha$, $\beta$ are not, one may use the least-squares estimates $\hat\alpha$, $\hat\beta$. Then $$ \left(\frac{X_1-(\alpha+\beta w_1)}{\sigma}\right)^2 + \cdots + \left(\frac{X_n-(\alpha+\beta w_n)}{\sigma}\right)^2 \sim \chi^2_n $$ but $$ \left(\frac{X_1-(\hat\alpha+\hat\beta w_1)}{\sigma}\right)^2 + \cdots + \left(\frac{X_n-(\hat\alpha+\hat\beta w_n)}{\sigma}\right)^2 \sim \chi^2_{n-2}. $$ A similar argument involving orthogonal projections explains this.
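Again, a minimal simulation sketch (same assumptions as above; the regressors $w_i$ and the values of $\alpha$, $\beta$, $\sigma$ below are arbitrary illustrative choices) showing the scaled residual sum of squares landing on $\chi^2_{n-2}$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps = 6, 100_000
alpha, beta, sigma = 1.0, 0.5, 2.0
w = np.linspace(0.0, 1.0, n)                 # fixed, observable regressors

X = alpha + beta * w + rng.normal(0, sigma, size=(reps, n))

# Least-squares estimates for each replicate (simple linear regression):
wbar, Xbar = w.mean(), X.mean(axis=1, keepdims=True)
beta_hat = ((w - wbar) * (X - Xbar)).sum(axis=1) / ((w - wbar) ** 2).sum()
alpha_hat = Xbar.ravel() - beta_hat * wbar

# n residual terms, but only n - 2 degrees of freedom.
resid = X - (alpha_hat[:, None] + beta_hat[:, None] * w)
S = (resid ** 2).sum(axis=1) / sigma ** 2

print(stats.kstest(S, stats.chi2(df=n - 2).cdf))
```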
One needs these results in order to derive things like confidence intervals for $\mu$, $\alpha$, and $\beta$.
HINTS: Start with the moment generating function of $X \sim \chi^2(n)$: $$ \mathcal{M}_X\left(t\right) = \left(1-2 t\right)^{-n/2}, \qquad t < \tfrac{1}{2}. $$ Now notice that for a sum of $n$ independent random variables $Z = X_1 + X_2 + \cdots+X_n$, the moment generating function is $$ \mathcal{M}_Z(t) = \mathbb{E}\left(\mathrm{e}^{Z t}\right) = \mathbb{E}\left(\mathrm{e}^{\left( X_1 + X_2 + \cdots+X_n\right) t}\right) = \mathbb{E}\left(\mathrm{e}^{X_1 t} \cdot \mathrm{e}^{X_2 t} \cdots\mathrm{e}^{X_n t}\right) $$ Using independence: $$ \mathcal{M}_Z(t) = \prod_{k=1}^n \mathbb{E}\left(\mathrm{e}^{X_k t} \right) = \prod_{k=1}^n \mathcal{M}_{X_k}(t) $$ Now piece these two facts together, notice that the functional form of $\mathcal{M}_Z(t)$ is that of a $\chi^2$ distribution, and read off its degrees of freedom parameter.
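To see the hint play out symbolically, here is a sketch assuming SymPy is available: multiplying the MGFs of two independent chi-square variables with $n_1$ and $n_2$ degrees of freedom recovers the MGF of $\chi^2_{n_1+n_2}$.

```python
import sympy as sp

t, n1, n2 = sp.symbols('t n1 n2', positive=True)

# MGF of a chi-square with n degrees of freedom, valid for t < 1/2.
M = lambda n: (1 - 2 * t) ** (-n / 2)

# The product has the same functional form with exponent -(n1 + n2)/2,
# i.e. it is the MGF of a chi-square with n1 + n2 degrees of freedom.
product = sp.powsimp(M(n1) * M(n2))
print(product)                                  # (1 - 2*t)**(-n1/2 - n2/2)
print(sp.simplify(product - M(n1 + n2)) == 0)   # True
```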
Best Answer
Because a chi-square random variable $X_n$ with $n$ degrees of freedom is distributed like $\sum\limits_{k=1}^nY_k^2$ with $(Y_k)_{k\geqslant1}$ i.i.d. standard normal. Each $Y_k^2$ has mean $1$ and variance $2$, so the usual central limit theorem applied to the sequence $(Y_k^2)_{k\geqslant1}$ shows that $X_n=n+\sqrt{2n}\,Z_n$, where $Z_n$ converges in distribution to a standard normal random variable as $n\to\infty$.
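As a numerical sanity check, here is a minimal sketch (assuming NumPy and SciPy; the value of $n$ and the seed are arbitrary) of the standardized chi-square being close to standard normal for large $n$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, reps = 1000, 100_000

X = rng.chisquare(df=n, size=reps)
Z = (X - n) / np.sqrt(2 * n)     # standardize with mean n and variance 2n

print(stats.kstest(Z, stats.norm.cdf))  # large p-value: close to N(0, 1)
```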