Statistics – Understanding the F Distribution as a Ratio of Two Independent Chi-Square Distributions

chi-squared, probability-distributions, statistical-inference, statistics

So the F distribution can be represented as the ratio of two independent chi-square random variables, each divided by its own degrees of freedom.
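In symbols, if $U$ and $V$ are independent chi-square variables with $d_1$ and $d_2$ degrees of freedom,

$$F = \frac{U/d_1}{V/d_2} \sim F_{d_1,\,d_2}.$$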

I understand this, and why the F-distribution is so versatile and can be used for so many tests. However, it crossed my mind today that when testing whether one regression model is significantly different from a null model (or really any alternative model), do we no longer have two independent chi-square distributions? Why can we still assume that chi-square distributions built from the same data set under different model assumptions are independent?

I'm probably missing something obvious; sorry, I'm a little rusty.

Best Answer

Remember that every hypothesis test is carried out under the assumption that the null model is correct; any chi-square distributions that may arise (or even exist) under the alternative model are not considered at all. So it is not the case that we are assuming independence of chi-square distributions built under different model assumptions.

What we do in the typical 'lack-of-fit'-type hypothesis test, where we compare the null model to a more complex alternative model, is decompose some sum-of-squares expression into a sum of two or more other sums of squares (for example: residual sum of squares = pure-error sum of squares + lack-of-fit sum of squares). It is the terms on the right-hand side of this decomposition that are asserted to be independent and chi-square distributed, and the justification of this assertion is usually achieved through the magic of Cochran's theorem.
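To make this concrete, here is a minimal Monte Carlo sketch of a lack-of-fit test for a simple linear null model. The whole setup ($m$ distinct $x$ values with $r$ replicates each, a linear truth, and the degrees of freedom $m-2$ and $n-m$) is my own illustration, not anything specific from the answer above. Under the null, Cochran's theorem gives $SS_{\text{lof}}/\sigma^2 \sim \chi^2_{m-2}$ and $SS_{\text{pe}}/\sigma^2 \sim \chi^2_{n-m}$, independently, so the ratio of their mean squares should follow an $F_{m-2,\,n-m}$ distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical setup: m distinct x values, r replicates each, true model linear (the null)
m, r, sigma = 5, 4, 1.0
x_levels = np.arange(1.0, m + 1)   # distinct predictor values
x = np.repeat(x_levels, r)         # n = m*r observations
n = m * r

f_stats = []
for _ in range(5000):
    y = 2.0 + 0.5 * x + rng.normal(0.0, sigma, n)   # data generated under the null model

    # Fit the straight line (the null model); full=True returns the residual sum of squares
    beta, rss, *_ = np.polyfit(x, y, 1, full=True)
    rss = rss[0]

    # Pure-error SS: variation of replicates around their group means
    y_groups = y.reshape(m, r)
    ss_pe = ((y_groups - y_groups.mean(axis=1, keepdims=True)) ** 2).sum()

    # Lack-of-fit SS is whatever remains of the residual SS
    ss_lof = rss - ss_pe

    # F statistic: each SS divided by its degrees of freedom
    f_stats.append((ss_lof / (m - 2)) / (ss_pe / (n - m)))

# Under the null, the ratios should follow F(m-2, n-m); a KS test checks the fit
print(stats.kstest(f_stats, stats.f(m - 2, n - m).cdf))
```

If the lack-of-fit and pure-error sums of squares were not independent chi-squares under the null, the simulated ratios would drift away from the reference F distribution and the Kolmogorov–Smirnov p-value would collapse.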
