Solved – Degrees of freedom for Chi-squared test

association-measure, chi-squared-test, definition, distributions, statistical-significance

I am facing the following dilemma. I know how to handle the one-sided Chi-squared distribution, but I am struggling with how to handle the degrees of freedom. Let me clarify what I mean with an example.

I have the following observed and expected values:

[Observed Data]

#Periods        Country I   Country II   Country III
#(1900-1950)    100         150          20
#(1951-2000)    59          160          50

[Expected Data]

#Periods        Country I   Country II   Country III
#(1900-1950)    118.4       52           40
#(1951-2000)    80.5        90           25

My question is: since this is a one-sided Chi-square test, are the degrees of freedom counted by the formula (columns-1)(rows-1), in which case I would have $(6-1)(2-1) = 5$?

Or is it really just Country I, Country II, and Country III that matter, so that the d.f. would be $3-1=2$?

Because the d.f. are usually defined in terms of the number of terms in the chi-squared sum, which here is 6, from which we usually subtract 1.

Please help me out with this one.

Best Answer

The number of variables present in your cross-classification determines the degrees of freedom of your $\chi^2$-test. In your case, you are actually cross-classifying two variables (period and country) in a 2-by-3 table.

So the df are $(2-1)\times (3-1)=2$ (see, e.g., Pearson's chi-square test for a justification of this computation). I don't see where you got the $6$ in your first formula, and your expected frequencies are not correct, unless I misunderstood your dataset.

A quick check in R gives me:

> my.tab <- matrix(c(100, 59, 150, 160, 20, 50), nc=3)
> my.tab
     [,1] [,2] [,3]
[1,]  100  150   20
[2,]   59  160   50
> chisq.test(my.tab)

    Pearson's Chi-squared test

data:  my.tab 
X-squared = 23.7503, df = 2, p-value = 6.961e-06

> chisq.test(my.tab)$expected
        [,1]     [,2]     [,3]
[1,] 79.6475 155.2876 35.06494
[2,] 79.3525 154.7124 34.93506
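
For completeness, here is a minimal sketch (not part of the original answer) of how those expected counts and the degrees of freedom can be reproduced by hand, reusing the my.tab object defined above. Under the null hypothesis of independence, the expected count in each cell is (row total $\times$ column total) / grand total:

> # sketch: recompute expected counts and df by hand (not in the original answer)
> # expected counts under independence: E[i,j] = rowSum[i] * colSum[j] / grand total
> outer(rowSums(my.tab), colSums(my.tab)) / sum(my.tab)
        [,1]     [,2]     [,3]
[1,] 79.6475 155.2876 35.06494
[2,] 79.3525 154.7124 34.93506
> # degrees of freedom: (number of rows - 1) * (number of columns - 1)
> (nrow(my.tab) - 1) * (ncol(my.tab) - 1)
[1] 2

This reproduces the chisq.test(my.tab)$expected matrix and df = 2 shown above, and it also makes clear why the expected values given in the question do not look right: expected counts must preserve the row and column totals of the observed table.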