[Math] Why the chi-squared statistic follows a chi-squared distribution

chi squared, probability distributions, statistics

The formula for the Chi-Square test statistic is the following:

$$\chi^2=\sum_{i=1}^n\frac{({O_i-E_i})^2}{E_i}$$

where $O_i$ is the observed count in category $i$ and $E_i$ is the expected count.

I am just curious why this follows the $\chi^2$ distribution?

Best Answer

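It is $\frac{O_i-E_i}{\sqrt{E_i}}$, the square root of each term in the sum, that is treated as approximately standard normal, not the squared term itself. This is an assumption, not an exact result: it is reasonable when each count $O_i$ behaves roughly like a Poisson variable with mean and variance $E_i$, and $E_i$ is not too small. The square of a standard normal variable follows a chi-square distribution with 1 degree of freedom, which is a special case of the Gamma distribution. The sum of $n$ such independent squared variables follows a chi-square distribution with $n$ degrees of freedom; when the $O_i$ are constrained to add up to a fixed total, as in a goodness-of-fit test, one degree of freedom is lost and the limiting distribution is chi-square with $n-1$ degrees of freedom. If we looked at absolute values instead, namely $\left|\frac{O_i-E_i}{\sqrt{E_i}}\right|$, each term would be half-normal, and the sum $\sum_{i=1}^n\left|\frac{O_i-E_i}{\sqrt{E_i}}\right|$ would, after centering and scaling, converge to a Gaussian as $n$ grows, though unlike the chi-square case there is no standard named distribution for small $n$.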
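A quick simulation can make this argument concrete. The sketch below is only an illustration under stated assumptions, not part of the original answer: the function names, the fair six-sided die, the 120 rolls per experiment, and the seed are all hypothetical choices. It compares the sample mean and variance of a sum of $n$ squared standard normals with the chi-square values $n$ and $2n$, and then does the same for the statistic $\sum (O_i-E_i)^2/E_i$ computed from simulated die rolls, where the fixed total leaves $6-1=5$ degrees of freedom.

```python
import random
import statistics


def sum_of_squared_normals(n, rng):
    """One draw of Z_1^2 + ... + Z_n^2 with independent standard normal Z_i."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


def pearson_statistic(observed, expected):
    """Sum over categories of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def simulate_die_statistic(rolls, rng, sides=6):
    """Roll a fair die `rolls` times and return the statistic of the counts."""
    counts = [0] * sides
    for _ in range(rolls):
        counts[rng.randrange(sides)] += 1
    expected = [rolls / sides] * sides  # fair die: equal expected counts
    return pearson_statistic(counts, expected)


rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
n = 5                   # degrees of freedom to compare against
trials = 5000

# Sum of n squared standard normals: chi-square with n degrees of freedom,
# which has mean n and variance 2n.
normal_sums = [sum_of_squared_normals(n, rng) for _ in range(trials)]

# Statistic from 120 rolls of a fair six-sided die: the counts must add up
# to 120, so the approximating chi-square has 6 - 1 = 5 degrees of freedom.
die_stats = [simulate_die_statistic(120, rng) for _ in range(trials)]

print("squared normals: mean, variance =",
      statistics.fmean(normal_sums), statistics.variance(normal_sums))
print("die statistic:   mean, variance =",
      statistics.fmean(die_stats), statistics.variance(die_stats))
```

Both pairs of numbers should land near 5 and 10. Matching two moments does not prove that the distributions agree, so this is a sanity check on the reasoning rather than a derivation.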
