[Math] Proof for Chi-Squared Contingency tables

statistics

For the chi-squared test on a 2×2 contingency table, there is a proof that gets from
$\sum\frac{(O_i - E_i)^2}{E_i}$ to $\frac{N(ad-bc)^2}{(a+b)(c+d)(a+c)(b+d)}$

Can anyone explain the steps in the proof? I know how to get from one to the other, but I'm not sure why certain steps happen.

I'll put the proof below in case anyone wants to see it or can explain it.

Thanks

Best Answer

The same approach, worked by hand:

Start with $\chi^2=\sum^4_{i=1}\frac{(o_i-e_i)^2}{e_i}$ and expand the square: $$\chi^2=\sum^4_{i=1}\frac{(o_i-e_i)^2}{e_i}=\sum^4_{i=1}\Bigl(\frac{o_i^2}{e_i}-2o_i+e_i\Bigr)=\sum^4_{i=1}\frac{o_i^2}{e_i}-n$$ (since $\sum o_i=\sum e_i =n$)

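Substitute the values in, where $a,b,c,d$ are the observed counts, $n=a+b+c+d$, and each expected count is (row total)(column total)$/n$, e.g. $e_1=\frac{(a+b)(a+c)}{n}$: $$\chi^2=\frac{na^2}{(a+c)(a+b)}-a+\frac{nb^2}{(b+d)(a+b)}-b+\frac{nc^2}{(a+c)(c+d)}-c+\frac{nd^2}{(c+d)(b+d)}-d$$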
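As a sanity check on this substitution, here is a minimal Python sketch that evaluates the statistic three ways with exact fractions: from the definition, from the expanded form $\sum o_i^2/e_i - n$, and from the closed form quoted in the question. The function name `chi2_three_ways` is made up for illustration, and the cell layout (rows $(a,b)$ and $(c,d)$, columns $(a,c)$ and $(b,d)$) is inferred from the denominators above.

```python
from fractions import Fraction


def chi2_three_ways(a, b, c, d):
    """Return the 2x2 chi-squared statistic computed three ways (hypothetical helper)."""
    n = a + b + c + d
    observed = [a, b, c, d]
    # Expected count for each cell = row total * column total / n
    expected = [
        Fraction((a + b) * (a + c), n),
        Fraction((a + b) * (b + d), n),
        Fraction((c + d) * (a + c), n),
        Fraction((c + d) * (b + d), n),
    ]
    # 1) Definition: sum of (o - e)^2 / e
    by_definition = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # 2) Expanded form: sum of o^2 / e, minus n
    expanded = sum(Fraction(o * o) / e for o, e in zip(observed, expected)) - n
    # 3) Closed form: n (ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d))
    closed_form = Fraction(
        n * (a * d - b * c) ** 2,
        (a + b) * (c + d) * (a + c) * (b + d),
    )
    return by_definition, expanded, closed_form


print(chi2_three_ways(10, 20, 30, 40))
```

For the table with $a=10$, $b=20$, $c=30$, $d=40$, all three values come out to $50/63$. A numerical check like this does not replace the algebra, but it can catch a sign or denominator slip.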

Notice: $$an-(a+c)(a+b)=a(a+b+c+d)-(a+c)(a+b)=ad-bc$$

So, $$\frac{a}{(a+c)(a+b)}[na-(a+c)(a+b)]=\frac{a(ad-bc)}{(a+c)(a+b)}$$

Do the same thing for b,c,d.


From here it is fairly straightforward.
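For readers who want the remaining steps spelled out, here is one way to finish. It is a sketch in the same notation, with $n=a+b+c+d$, and the grouping of terms is just one convenient choice. The analogous identities for the other three cells are

$$nb-(a+b)(b+d)=-(ad-bc),\qquad nc-(a+c)(c+d)=-(ad-bc),\qquad nd-(c+d)(b+d)=ad-bc$$

so that

$$\chi^2=(ad-bc)\left[\frac{a}{(a+b)(a+c)}-\frac{b}{(a+b)(b+d)}-\frac{c}{(a+c)(c+d)}+\frac{d}{(c+d)(b+d)}\right]$$

Combine the first two fractions, and separately the last two:

$$\frac{a(b+d)-b(a+c)}{(a+b)(a+c)(b+d)}=\frac{ad-bc}{(a+b)(a+c)(b+d)},\qquad \frac{d(a+c)-c(b+d)}{(a+c)(b+d)(c+d)}=\frac{ad-bc}{(a+c)(b+d)(c+d)}$$

Adding these and using $\frac{1}{a+b}+\frac{1}{c+d}=\frac{n}{(a+b)(c+d)}$ gives

$$\chi^2=\frac{n(ad-bc)^2}{(a+b)(c+d)(a+c)(b+d)}$$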
