Solved – Post hoc test for Fisher’s exact test (larger than 2×2)

fishers-exact-test post-hoc sas

I have two categorical variables: Var1 with 3 levels and Var2 with 5 levels (i.e., the table is bigger than 2×2). I'm using SAS, and since multiple cells have counts below 5, I ran a Fisher's exact test with Monte Carlo estimation of the p-value (code and output are below). I understand that if I want to determine which specific cells are not associated, I need to perform a post hoc test, but I haven't been able to determine whether there is an appropriate post hoc test for Fisher's exact test. Any help or insight would be very much appreciated.

Best Answer

Comment continued:

Here is a contingency table (fake data) with some small counts; I can use it to illustrate some of the statements in my Comment. [I am using R.]

TBL = rbind(c( 3, 12,  3, 20, 10),
            c(10, 15,  5, 10,  3))
TBL
     [,1] [,2] [,3] [,4] [,5]
[1,]    3   12    3   20   10
[2,]   10   15    5   10    3

A chi-squared test on this table may have an incorrect P-value due to small counts:

chisq.test(TBL)

        Pearson's Chi-squared test

data:  TBL
X-squared = 11.465, df = 4, p-value = 0.02181

Warning message:
In chisq.test(TBL) : Chi-squared approximation 
may be incorrect

Specifically, the expected counts in column 3 (4.22 and 3.78) are below 5, which is what triggers the warning.

chisq.test(TBL)$expected   # expected counts under independence
         [,1]     [,2]    [,3]     [,4]     [,5]
[1,] 6.857143 14.24176 4.21978 15.82418 6.857143
[2,] 6.142857 12.75824 3.78022 14.17582 6.142857
Warning message:
In chisq.test(TBL) : Chi-squared approximation 
may be incorrect
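
As a check on where these numbers come from, the expected count for each cell under independence is (row total × column total) / grand total. A minimal sketch in base R, re-entering the same fake table; the threshold of 5 is the usual rule of thumb, and the object name `E` is only illustrative:

```r
# Same fake table as above
TBL <- rbind(c( 3, 12,  3, 20, 10),
             c(10, 15,  5, 10,  3))

# Expected count for cell (i, j) = row total i * column total j / grand total
E <- outer(rowSums(TBL), colSums(TBL)) / sum(TBL)
round(E, 2)

# Which cells fall below the rule-of-thumb threshold of 5, and what share?
which(E < 5, arr.ind = TRUE)
mean(E < 5)
```

Here only the two cells in column 3 fall below 5, i.e. 20% of the cells.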

The Pearson residuals point to columns 1 and 5 as possibly worth a closer look, provided the whole table turns out to be significant.

chisq.test(TBL)$residuals   # Pearson residuals: (observed - expected) / sqrt(expected)
          [,1]       [,2]       [,3]      [,4]      [,5]
[1,] -1.472971 -0.5940281 -0.5937952  1.049740  1.200198
[2,]  1.556254  0.6276151  0.6273690 -1.109093 -1.268059
Warning message:
In chisq.test(TBL) : Chi-squared approximation 
may be incorrect
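
Pearson residuals do not account for the row and column margins, so they tend to be somewhat smaller in magnitude than a standard normal variable would be. One common alternative (an addition here, not part of the original comment) is the standardized, or adjusted, residual that `chisq.test` returns as `stdres`. A rough sketch; the Bonferroni-style cutoff spread over all 10 cells is an assumption about how one might screen cells, not a fixed rule, and the normal approximation is itself shaky with small counts:

```r
TBL <- rbind(c( 3, 12,  3, 20, 10),
             c(10, 15,  5, 10,  3))
fit <- suppressWarnings(chisq.test(TBL))   # small-count warning silenced

# Pearson residuals by hand, to confirm what $residuals contains
pearson <- (TBL - fit$expected) / sqrt(fit$expected)
all.equal(pearson, fit$residuals)

# Standardized residuals, roughly N(0, 1) under independence
round(fit$stdres, 2)

# Two-sided normal cutoff at overall level 0.05, split over all cells
crit <- qnorm(1 - 0.05 / (2 * length(TBL)))
abs(fit$stdres) > crit
```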

We use simulation to obtain a more trustworthy P-value, leading to rejection of the null hypothesis at the 5% level.

chisq.test(TBL, simulate.p.value = TRUE)

        Pearson's Chi-squared test 
        with simulated p-value 
        (based on 2000 replicates)

data:  TBL
X-squared = 11.465, df = NA, p-value = 0.01849
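
The simulated P-value comes from generating random tables with the same row and column totals as the observed table and seeing how often their chi-squared statistic is at least as large as the observed one. A minimal sketch of that idea using `r2dtable`; the seed and object names are illustrative, and R's own implementation differs in small details such as a numerical tolerance:

```r
TBL <- rbind(c( 3, 12,  3, 20, 10),
             c(10, 15,  5, 10,  3))
set.seed(2024)                      # arbitrary seed, for reproducibility only
B <- 2000                           # number of simulated tables

E   <- outer(rowSums(TBL), colSums(TBL)) / sum(TBL)
obs <- sum((TBL - E)^2 / E)         # observed chi-squared statistic

# Random 2 x 5 tables with the observed margins
sims <- r2dtable(B, rowSums(TBL), colSums(TBL))
stat <- vapply(sims, function(m) sum((m - E)^2 / E), numeric(1))

# Monte Carlo P-value, counting the observed table itself
p.sim <- (1 + sum(stat >= obs)) / (B + 1)
p.sim
```

Because the result is random, repeated runs with different seeds give slightly different P-values; increasing `B` narrows that variation.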

Fisher's Exact Test on the $2 \times 5$ table gives roughly the same P-value.

fisher.test(TBL)

        Fisher's Exact Test for Count Data

data:  TBL
p-value = 0.02156
alternative hypothesis: two.sided
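
For larger or sparser tables, where full enumeration becomes slow, `fisher.test` can also estimate the P-value by Monte Carlo, which seems comparable to the Monte Carlo estimate the question describes in SAS. A sketch; the seed and the choice of 10000 replicates are arbitrary:

```r
TBL <- rbind(c( 3, 12,  3, 20, 10),
             c(10, 15,  5, 10,  3))

set.seed(2024)   # arbitrary seed, for reproducibility only
mc <- fisher.test(TBL, simulate.p.value = TRUE, B = 10000)
mc$p.value       # Monte Carlo estimate

exact <- fisher.test(TBL)
exact$p.value    # exact value, for comparison
```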

So we decide there are significant departures from independence (or homogeneity) and that this significance may be partly due to columns 1 and 5.

TB.15 = TBL[,c(1,5)];  TB.15
     [,1] [,2]
[1,]    3   10
[2,]   10    3
chisq.test(TB.15, simulate.p.value = TRUE)

        Pearson's Chi-squared test 
        with simulated p-value 
        (based on 2000 replicates)

data:  TB.15
X-squared = 7.5385, df = NA, p-value = 0.01899

Alternatively, we might use Fisher's exact tests post hoc, following the Fisher test on the whole table (especially if we did not need suggestions from Pearson residuals to help decide which post hoc tests are of interest).

fisher.test(TB.15)

        Fisher's Exact Test for Count Data

data:  TB.15
p-value = 0.01693
alternative hypothesis: 
   true odds ratio is not equal to 1
95 percent confidence interval:
   0.009851234 0.720962703
sample estimates:
odds ratio 
 0.1011654

Notes: (1) If we wanted to look at several such post hoc tests, we should use some method (such as Bonferroni's) to adjust for multiple comparisons and avoid 'false discovery'.

(2) It is best to choose either chi-squared tests (possibly with simulated P-values) or Fisher exact tests for use throughout the analysis, possibly stating a rationale for the choice. [It is not fair to run all the tests and report only the ones with the smaller P-values.]

(3) If your tables have a large proportion of expected counts below 5 (and especially below 3), and if SAS does not do simulated P-values for chi-squared tests on sparse tables, I would recommend Fisher exact tests. I don't know the context of your work. However, if it will be reviewed for publication or by government regulators, you may run into a preference for strict observance of the rule that expected counts should exceed 5, with few minor exceptions.
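
To illustrate Note (1), here is one way the pairwise follow-up might be organized in R: a Fisher exact test on every pair of columns, with the P-values adjusted by `p.adjust`. This is a sketch under the assumption that all 10 column pairs are of interest; the object names are illustrative, and Holm's method is shown alongside Bonferroni only as a commonly used, less conservative alternative.

```r
TBL <- rbind(c( 3, 12,  3, 20, 10),
             c(10, 15,  5, 10,  3))

pairs <- combn(ncol(TBL), 2)           # all 10 pairs of columns

# Fisher exact test on each 2 x 2 sub-table
p.raw <- apply(pairs, 2, function(k) fisher.test(TBL[, k])$p.value)

out <- data.frame(col.a  = pairs[1, ],
                  col.b  = pairs[2, ],
                  p.raw  = p.raw,
                  p.bonf = p.adjust(p.raw, method = "bonferroni"),
                  p.holm = p.adjust(p.raw, method = "holm"))
out[order(out$p.raw), ]                # smallest raw P-values first
```

Testing all ten pairs multiplies each raw P-value by 10 under Bonferroni, so the 0.01693 for columns 1 and 5 becomes about 0.17. Limiting the follow-up to a small number of comparisons keeps the penalty smaller.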

Related Solutions

Solved – Fisher’s exact test

The massive 58 amid much lower frequencies signals that any test is just quantifying a major failure of independence. I did this in Stata. The command ret li (short for return list) obliges Stata to show results as exactly as it knows them, but both tests yield P-values that are 0.000 to 3 d.p. It is right to be a little cautious about low expected values (for row 1 here in particular) but the test results are overwhelming.

. tabi 0  2 \ 5 58 \ 4 3 \ 4 3 

            |          col
        row |         1          2 |     Total
 -----------+----------------------+----------
          1 |         0          2 |         2 
          2 |         5         58 |        63 
          3 |         4          3 |         7 
          4 |         4          3 |         7 
 -----------+----------------------+----------
      Total |        13         66 |        79 

      Pearson chi2(3) =  20.5779   Pr = 0.000

. ret li 

scalars:
              r(p) =  .0001288081813192
           r(chi2) =  20.57794057794058
              r(c) =  2
              r(r) =  4
              r(N) =  79

. tabi 0  2 \ 5 58 \ 4 3 \ 4 3 , exact

Enumerating sample-space combinations:
stage 4:  enumerations = 1
stage 3:  enumerations = 3
stage 2:  enumerations = 17
stage 1:  enumerations = 0

             |          col
         row |         1          2 |     Total
  -----------+----------------------+----------
           1 |         0          2 |         2 
           2 |         5         58 |        63 
           3 |         4          3 |         7 
           4 |         4          3 |         7 
  -----------+----------------------+----------
       Total |        13         66 |        79 

       Fisher's exact =                 0.000

. ret li 

scalars:
        r(p_exact) =  .0003124258226793
              r(c) =  2
              r(r) =  4
              r(N) =  79
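
For readers working in R rather than Stata, the same table can be checked as follows. This is only a cross-check sketch; the two programs are expected to agree up to rounding, though that is an assumption rather than something shown in the original answer, and the object name `tab` is illustrative.

```r
# Same 4 x 2 table as in the Stata tabi command
tab <- rbind(c(0,  2),
             c(5, 58),
             c(4,  3),
             c(4,  3))

# Pearson chi-squared test (no continuity correction for tables beyond 2 x 2)
chi <- suppressWarnings(chisq.test(tab))
chi$statistic
chi$p.value

# Fisher's exact test on the full table
fe <- fisher.test(tab)
fe$p.value
```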

Solved – Post hoc $\chi^2$ test with R

I like this question because too often, people do omnibus tests and then don't ask more specific questions about what is happening.

If the goal is to compare "treatments" a, b, and c, I would suggest summarizing the data showing the percentages within each column, so you can see more clearly how they differ. Then to test these comparisons, one simple idea is to do the $\chi^2$ test on each pair of columns:

> for (j in 1:3) print(chisq.test(mat[, -j]))

    Pearson's Chi-squared test

data:  mat[, -j]
X-squared = 0.1542, df = 2, p-value = 0.9258


    Pearson's Chi-squared test

data:  mat[, -j]
X-squared = 4.5868, df = 2, p-value = 0.1009


    Pearson's Chi-squared test

data:  mat[, -j]
X-squared = 9.5653, df = 2, p-value = 0.008374

Since 3 tests are done, a Bonferroni correction is advised (multiply each $P$ value by 3). The last test, where column 3 is omitted, has a very low $P$ value that stays below 0.05 after correction ($3 \times 0.008374 \approx 0.025$), so you can conclude that the distributions of (good, fair, poor) are different for conditions a and b. Note, however, that condition c does not have much data, and that's largely why the other two results are nonsignificant.

You could use a similar strategy to do pairwise comparisons of the rows.
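
The column percentages and the Bonferroni step described above might look like this in R. The counts below are HYPOTHETICAL, since the original `mat` is not shown here; only the layout (rows good/fair/poor, columns a/b/c) follows the answer.

```r
# HYPOTHETICAL counts, filled column by column (a, then b, then c)
mat <- matrix(c(40, 30, 10,
                25, 35, 20,
                 5,  4,  6),
              nrow = 3,
              dimnames = list(rating    = c("good", "fair", "poor"),
                              condition = c("a", "b", "c")))

# Percentages within each column (each column sums to 100)
round(100 * prop.table(mat, margin = 2), 1)

# Chi-squared test with column j left out, i.e. on each pair of columns
p.raw <- sapply(1:3, function(j) suppressWarnings(chisq.test(mat[, -j]))$p.value)
names(p.raw) <- c("b vs c", "a vs c", "a vs b")

# Bonferroni adjustment: each P-value times 3, capped at 1
p.adjust(p.raw, method = "bonferroni")
```

For pairwise comparisons of rows instead, the same loop can be run on `mat[-i, ]`.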
