Fisher’s exact test or chi-square test

Tags: categorical-data, chi-squared-test, fishers-exact-test, statistical-significance

I have a 2×4 table of nominal data (the columns are simply YES/NO; the rows are four categories):

Category A: 7, 13
Category B: 15, 5
Category C: 15, 5
Category D: 19, 1 

I am hoping to test a couple of the categories against each other (2×2), and also to assess the significance of the whole table (2×4), although I am not sure how to interpret a significant result for the whole table.

I understand that because my sample sizes are small (as you can see, Category D has only 1 under NO, and each category has only 20 people), I should be using Fisher's exact test. Is this correct?

Can Fisher's test also be used on a 2×4 table? And what does it mean if the result is significant?

Best Answer

It takes more time to post the question than to try it out. Here is Stata:

. tabi  7  13 \ 15 5 \ 15 5 \ 19 1 , exact

Enumerating sample-space combinations:
stage 4:  enumerations = 1
stage 3:  enumerations = 14
stage 2:  enumerations = 65
stage 1:  enumerations = 0

            |          col
        row |         1          2 |     Total
 -----------+----------------------+----------
          1 |         7         13 |        20 
          2 |        15          5 |        20 
          3 |        15          5 |        20 
          4 |        19          1 |        20 
 -----------+----------------------+----------
      Total |        56         24 |        80 

       Fisher's exact =                 0.000

. ret li

scalars:
        r(p_exact) =  .000426720882576
              r(c) =  2
              r(r) =  4
              r(N) =  80
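
For readers without Stata, the exact P-value can be cross-checked by brute force. The sketch below assumes the usual Fisher-Freeman-Halton definition: holding both margins fixed, sum the probabilities of every table that is no more likely than the observed one. The function name `exact_p_rx2` is made up for illustration, and the approach only suits small tables with two columns like this one.

```python
from itertools import product
from math import comb


def exact_p_rx2(table):
    """Exact P-value for an r x 2 table with both margins held fixed.

    Sums the probabilities of all tables whose probability does not
    exceed that of the observed table (Fisher-Freeman-Halton).
    """
    row_totals = [a + b for a, b in table]
    col1_total = sum(a for a, _ in table)

    def weight(first_col):
        # Unnormalised multivariate hypergeometric probability,
        # kept as an exact integer so ties are detected exactly.
        w = 1
        for n, a in zip(row_totals, first_col):
            w *= comb(n, a)
        return w

    observed = weight([a for a, _ in table])
    total = 0
    as_extreme = 0
    # Enumerate first-column counts for all rows but the last;
    # the last row's count is forced by the column total.
    for head in product(*(range(n + 1) for n in row_totals[:-1])):
        last = col1_total - sum(head)
        if 0 <= last <= row_totals[-1]:
            w = weight(head + (last,))
            total += w
            if w <= observed:
                as_extreme += w
    return as_extreme / total


# Rows are Categories A-D; columns are YES, NO (from the question).
table = [(7, 13), (15, 5), (15, 5), (19, 1)]
p_whole = exact_p_rx2(table)
print(f"exact P, whole table: {p_whole:.6g}")

# The same function handles a 2x2 comparison, e.g. Category A vs D.
p_a_vs_d = exact_p_rx2([table[0], table[3]])
print(f"exact P, A vs D: {p_a_vs_d:.6g}")
```

The whole-table value should be comparable with `r(p_exact)` above. If several pairs of categories are compared this way, the pairwise P-values would normally need some adjustment for multiple testing.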

You could report the P-value as 0.0004 or 0.00043, say. So Fisher's test can be done for tables of this size. A standard chi-square test (not shown here) gives a P-value of 0.00042, which every statistical person I know would regard as essentially identical. Both tests support the interpretation, evident from eyeballing the table, that the row and column variables are associated: the proportion answering YES differs across the four categories.
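
The chi-square figure can be reproduced by hand. The sketch below computes the Pearson statistic from the expected counts and turns it into a P-value using the closed-form upper tail of the chi-square distribution with 3 degrees of freedom. The function names are illustrative only. It also shows why the chi-square approximation holds up here: the usual rule of thumb concerns expected counts, not observed ones, and every expected count in this table is 14 or 6.

```python
from math import erfc, exp, pi, sqrt


def pearson_chi2(table):
    """Return (statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp_) ** 2 / exp_
        for obs_row, exp_row in zip(table, expected)
        for obs, exp_ in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


def chi2_upper_tail_df3(x):
    """P(X > x) for a chi-square variable with exactly 3 df."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


# Rows are Categories A-D; columns are YES, NO (from the question).
table = [(7, 13), (15, 5), (15, 5), (19, 1)]
stat, df, expected = pearson_chi2(table)
p_chi2 = chi2_upper_tail_df3(stat)  # valid because df == 3 here

print(f"expected counts per row: {expected[0]}")
print(f"chi-square = {stat:.4f} on {df} df, P = {p_chi2:.5f}")
```

The closed-form tail is specific to 3 degrees of freedom; for other table sizes a general chi-square distribution function from a statistics library would be needed.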
