Probability – Understanding the Chi-Squared Test and Distribution

I am trying to understand the logic behind the chi-squared test.

The chi-squared statistic is $\chi ^2 = \sum \frac{(obs-exp)^2}{exp}$. $\chi ^2$ is then compared to a chi-squared distribution to obtain a p-value, which is used to decide whether or not to reject the null hypothesis. $H_0$: the observations come from the distribution we used to create our expected values. For example, we could test whether the probability of obtaining heads is $p$, as we expect. So we flip a coin 100 times and find $n_H$ heads and $100-n_H$ tails. We want to compare our findings to what is expected ($100 \cdot p$ heads and $100 \cdot (1-p)$ tails). We could just as well use a binomial distribution, but that is not the point of the question… The question is:

Can you please explain why, under the null hypothesis, $\sum \frac{(obs-exp)^2}{exp}$ follows a chi-squared distribution?

All I know about the chi-squared distribution is that the chi-squared distribution with $k$ degrees of freedom is the distribution of the sum of $k$ squared independent standard normal random variables.

Nevertheless, the binomial case is our starting point even for your actual question. I'll cover it somewhat informally.

Let's consider the binomial case more generally:

$Y\sim \text{Bin}(n,p)$

Assume $n$ and $p$ are such that $Y$ is well approximated by a normal with the same mean and variance (some typical requirements are that $\min(np,n(1-p))$ is not small, or that $np(1-p)$ is not small).

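Then $(Y-E(Y))/\sqrt{\text{Var}(Y)}$ is approximately standard normal, so its square $(Y-E(Y))^2/\text{Var}(Y)$ will be approximately $\sim\chi^2_1$ (the square of a single standard normal variable is $\chi^2_1$ by definition). Here $Y$ is the number of successes.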

We have $E(Y) = np$ and $\text{Var}(Y)=np(1-p)$.

(In the testing case, $n$ is known and $p$ is specified under $H_0$. We don't do any estimation.)

So if $H_0$ is true, $(Y-np)^2/[np(1-p)]$ will be approximately $\sim\chi^2_1$.

Note that $(Y-np)^2 = [(n-Y)-n(1-p)]^2$. Also note that $\frac{1}{p} + \frac{1}{1-p} = \frac{1}{p(1-p)}$.
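Both identities take one line each to verify, using only the definitions above:

$$(n-Y)-n(1-p) = -(Y-np) \quad\Rightarrow\quad [(n-Y)-n(1-p)]^2 = (Y-np)^2$$

$$\frac{1}{p} + \frac{1}{1-p} = \frac{(1-p)+p}{p(1-p)} = \frac{1}{p(1-p)}$$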

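Hence

$$\begin{aligned}
\frac{(Y-np)^2}{np(1-p)} &= \frac{(Y-np)^2}{np}+\frac{(Y-np)^2}{n(1-p)}\\
&= \frac{(Y-np)^2}{np}+\frac{[(n-Y)-n(1-p)]^2}{n(1-p)} \\
&= \frac{(O_S-E_S)^2}{E_S}+\frac{(O_F-E_F)^2}{E_F}
\end{aligned}$$

where $O_S = Y$ and $O_F = n-Y$ are the observed numbers of successes and failures, and $E_S = np$ and $E_F = n(1-p)$ are their expected values under $H_0$.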

Which is just the chi-square statistic for the binomial case.
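As a quick numerical sanity check of that algebra, here is a minimal sketch that computes the two-cell chi-square statistic and the squared standardized count for the same data. The function names and the data (100 flips, 58 heads, $H_0: p = 0.5$) are made up for illustration.

```python
import math

def pearson_chi2(observed, expected):
    """Sum of (obs - exp)^2 / exp over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def squared_z(y, n, p):
    """Squared standardized binomial count: (Y - np)^2 / (np(1-p))."""
    return (y - n * p) ** 2 / (n * p * (1 - p))

# Hypothetical data: n flips, y heads, p is the value specified by H0.
n, p, y = 100, 0.5, 58

chi2 = pearson_chi2([y, n - y], [n * p, n * (1 - p)])
z2 = squared_z(y, n, p)

# The two quantities should agree up to floating-point rounding.
print(chi2, z2, math.isclose(chi2, z2))
```

Here both quantities equal $64/25 = 2.56$, and the same agreement holds for any $y$, $n$ and $0 < p < 1$.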

So in that case the chi-square statistic should have the distribution of the square of an (approximately) standard-normal random variable.
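To see the approximation at work, here is a rough simulation sketch using only the Python standard library. It draws binomial counts under $H_0$, computes the two-cell statistic for each, and compares the simulated distribution with the $\chi^2_1$ CDF, which can be written as $P(Z^2 \le x) = \text{erf}(\sqrt{x/2})$. The values of `n`, `p`, `reps` and `seed` are arbitrary choices for illustration. Because $Y$ is discrete, the agreement can only be approximate.

```python
import math
import random

def chi2_stat_binomial(y, n, p):
    """Pearson statistic with two cells: successes and failures."""
    e_s, e_f = n * p, n * (1 - p)
    return (y - e_s) ** 2 / e_s + ((n - y) - e_f) ** 2 / e_f

def chi2_1_cdf(x):
    """CDF of chi-squared with 1 df: P(Z^2 <= x) = erf(sqrt(x / 2))."""
    return math.erf(math.sqrt(x / 2)) if x > 0 else 0.0

def simulate(n=100, p=0.5, reps=5000, seed=0):
    """Draw `reps` binomial counts under H0 and return their statistics."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        # A Bin(n, p) draw as a sum of n Bernoulli(p) trials.
        y = sum(rng.random() < p for _ in range(n))
        stats.append(chi2_stat_binomial(y, n, p))
    return stats

stats = simulate()
for x in (0.5, 1.0, 2.0, 3.841):
    empirical = sum(s <= x for s in stats) / len(stats)
    print(f"x={x}: simulated {empirical:.3f}, chi2_1 CDF {chi2_1_cdf(x):.3f}")
```

The last cutoff, 3.841, is roughly the 95% point of $\chi^2_1$, so the simulated fraction above it is an estimate of the actual rejection rate of a nominal 5% test.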

