Solved – Fisher’s exact test or chi-square test

categorical datachi-squared-testfishers-exact-teststatistical significance

I have a 2×4 table with nominal data (the columns are simply YES/NO, the rows are four categories)

Category A: 7, 13
Category B: 15, 5
Category C: 15, 5
Category D: 19, 1

I am hoping to test the significance of a couple of the categories to each other (2×2), but also to assess the significance of the whole table (2×4, although I am not really sure how to interpret the meaning of the significance that may result here).

I understand that as my sample sizes are small (as you can see, Category D features "1" under NO, and each category has only 20 people) that I should be using Fisher's Exact Test. Is this correct?

Can Fisher's also be used in a 2×4? And what does it mean if this result is significant?

Best Answer

It takes more time to post the question than to try it out. Here is Stata:

. tabi  7  13 \ 15 5 \ 15 5 \ 19 1 , exact

Enumerating sample-space combinations:
stage 4:  enumerations = 1
stage 3:  enumerations = 14
stage 2:  enumerations = 65
stage 1:  enumerations = 0

            |          col
        row |         1          2 |     Total
 -----------+----------------------+----------
          1 |         7         13 |        20 
          2 |        15          5 |        20 
          3 |        15          5 |        20 
          4 |        19          1 |        20 
 -----------+----------------------+----------
      Total |        56         24 |        80 

       Fisher's exact =                 0.000

. ret li

scalars:
        r(p_exact) =  .000426720882576
              r(c) =  2
              r(r) =  4
              r(N) =  80

You could report the P-value as 0.0004 or 0.00043, say. So, Fisher's test can be done for tables this size. A standard chi-square test (not shown here) gives a P-value of 0.00042, which every statistical person I know would regard as essentially identical. The tests support the interpretation that is evident from eyeballing the table of an association between row and column variables.

Related Solutions

Solved – Fisher’s exact test

The massive 58 amid much lower frequencies signals that any test is just quantifying a major failure of independence. I did this in Stata. The command ret li (short for return list) obliges Stata to show results as exactly as it knows them, but both tests yield P-values that are 0.000 to 3 d.p. It is right to be a little cautious about low expected values (for row 1 here in particular) but the test results are overwhelming.

. tabi 0  2 \ 5 58 \ 4 3 \ 4 3 

            |          col
        row |         1          2 |     Total
 -----------+----------------------+----------
          1 |         0          2 |         2 
          2 |         5         58 |        63 
          3 |         4          3 |         7 
          4 |         4          3 |         7 
 -----------+----------------------+----------
      Total |        13         66 |        79 

      Pearson chi2(3) =  20.5779   Pr = 0.000

. ret li 

scalars:
              r(p) =  .0001288081813192
           r(chi2) =  20.57794057794058
              r(c) =  2
              r(r) =  4
              r(N) =  79

. tabi 0  2 \ 5 58 \ 4 3 \ 4 3 , exact

Enumerating sample-space combinations:
stage 4:  enumerations = 1
stage 3:  enumerations = 3
stage 2:  enumerations = 17
stage 1:  enumerations = 0

             |          col
         row |         1          2 |     Total
  -----------+----------------------+----------
           1 |         0          2 |         2 
           2 |         5         58 |        63 
           3 |         4          3 |         7 
           4 |         4          3 |         7 
  -----------+----------------------+----------
       Total |        13         66 |        79 

       Fisher's exact =                 0.000

. ret li 

scalars:
        r(p_exact) =  .0003124258226793
              r(c) =  2
              r(r) =  4
              r(N) =  79

Solved – Alternative for Fisher’s exact test for count data in table bigger than 2×2

I don't see anything about your problem that is non-standard for counts of categories. The only thing that is even a little unusual is that you have extremely marked differences between languages.

For your data I get Pearson chi-square of $687.8$ with $15$ d.f. for a test of no association between the variables and the P-value is minutely small. For what it's worth, my program (Stata) reports the P-value as about $7 \times 10^{-137}$.

A good program should indeed flag small expected frequencies, which are the issue rather than small observed frequencies: I see a flag that 4 cells have less than 1 as expected frequency. So, there is a bit of a worry about the P-value, but it is really quite secondary. You could change the P-value by more than 100 orders of magnitude either way, but the message would be the same.

To put it directly, a simple test underlines what is evident just by looking at the frequencies, namely that the languages are very different, which you know any way. If you have some sceptic who doubts that, then a chi-square test provides back-up.

Doing this with Fisher's test is on one level more correct statistically, but it will not change the practical or scientific conclusion one iota.

You have quantitative data that are pertinent to a discussion, but you don't need statistical inference to add gloss. The numbers speak eloquently for themselves, and the details are the interesting part.

Naturally, I am responding to your example, and being firm about what it implies in no way rules out different conclusions for other data.

If there is a predictive model that predicts actual (relative) frequencies, then testing that is a much more interesting question, but you would need to tell us the details.

To respond a little more directly to your question: Fisher's exact test often is impractical once the frequencies stop being very small.

Best Answer

Related Solutions

Solved – Fisher’s exact test

Solved – Alternative for Fisher’s exact test for count data in table bigger than 2×2

Related Question