Solved – Post hoc test for Fisher’s exact test (larger than 2×2)

Tags: fishers-exact-test, post-hoc, sas

I have two categorical variables: Var1 with 3 levels and Var2 with 5 levels (i.e., the table is larger than 2×2). I'm using SAS, and since several cells have counts below 5, I ran Fisher's exact test with Monte Carlo estimation of the p-value (code and output are below). I understand that to determine which specific cells account for the association I need a post hoc test, but I haven't been able to determine whether an appropriate post hoc test exists for Fisher's exact test. Any help or insight would be very much appreciated.

Best Answer

Comment continued:

Here is a contingency table (fake data) with some small counts; I can use it to illustrate some of the statements in my Comment. [I am using R.]

TBL = rbind(c( 3, 12,  3, 20, 10),
            c(10, 15,  5, 10,  3))
TBL
     [,1] [,2] [,3] [,4] [,5]
[1,]    3   12    3   20   10
[2,]   10   15    5   10    3

A chi-squared test on this table may have an incorrect P-value due to small counts:

chisq.test(TBL)

        Pearson's Chi-squared test

data:  TBL
X-squared = 11.465, df = 4, p-value = 0.02181

Warning message:
In chisq.test(TBL) : Chi-squared approximation 
may be incorrect

Specifically, the expected counts in column 3 (about 4.22 and 3.78) fall below the usual threshold of 5.

chisq.test(TBL)$expected
         [,1]     [,2]    [,3]     [,4]     [,5]
[1,] 6.857143 14.24176 4.21978 15.82418 6.857143
[2,] 6.142857 12.75824 3.78022 14.17582 6.142857
Warning message:
In chisq.test(TBL) : Chi-squared approximation 
may be incorrect
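The expected counts can also be computed directly, which avoids the warning and makes it easy to flag the sparse cells. This is a minimal sketch on the same fake table; the object name `E` is my own choice for illustration.

```r
TBL <- rbind(c( 3, 12,  3, 20, 10),
             c(10, 15,  5, 10,  3))

# Expected count under independence:
# (row total * column total) / grand total
E <- outer(rowSums(TBL), colSums(TBL)) / sum(TBL)
round(E, 3)

# Row/column positions of cells with expected count below 5
which(E < 5, arr.ind = TRUE)

# Proportion of cells with expected count below 5
mean(E < 5)
```

The same three lines work unchanged on a 3 × 5 table.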

However, Pearson residuals point to columns 1 and 5 as possibly worth a closer look--provided the whole table turns out to be significant.

chisq.test(TBL)$residuals
          [,1]       [,2]       [,3]      [,4]      [,5]
[1,] -1.472971 -0.5940281 -0.5937952  1.049740  1.200198
[2,]  1.556254  0.6276151  0.6273690 -1.109093 -1.268059
Warning message:
In chisq.test(TBL) : Chi-squared approximation 
may be incorrect
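Pearson residuals do not all have unit variance, so a common refinement is to look at standardized (adjusted) residuals, which `chisq.test` also returns as `stdres`. The sketch below uses the same fake table; the names `p.cell` and `p.adj` are my own. The normal approximation behind the per-cell P-values is a large-sample one and is itself questionable for sparse tables, so treat this as a screening device rather than a formal test.

```r
TBL <- rbind(c( 3, 12,  3, 20, 10),
             c(10, 15,  5, 10,  3))

# Warning about small expected counts is suppressed; we only want the residuals
res <- suppressWarnings(chisq.test(TBL))

# Standardized residuals: approximately N(0, 1) under independence
round(res$stdres, 2)

# Two-sided normal P-value for each cell
p.cell <- 2 * pnorm(-abs(res$stdres))

# Bonferroni adjustment across all cells (p.adjust drops dimensions,
# so the result is reshaped back into a matrix)
p.adj <- matrix(p.adjust(p.cell, method = "bonferroni"), nrow = nrow(TBL))
round(p.adj, 3)
```

In a table with only two rows, the two standardized residuals in each column are equal in size and opposite in sign, so the ten cells carry only five distinct pieces of information.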

We use simulation to obtain a more trustworthy P-value, leading to rejection of the null hypothesis.

chisq.test(TBL, simulate.p.value = TRUE)

        Pearson's Chi-squared test 
        with simulated p-value 
        (based on 2000 replicates)

data:  TBL
X-squared = 11.465, df = NA, p-value = 0.01849

Fisher's Exact Test on the $2 \times 5$ table gives roughly the same P-value.

fisher.test(TBL)

        Fisher's Exact Test for Count Data

data:  TBL
p-value = 0.02156
alternative hypothesis: two.sided
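Since the question mentions Monte Carlo estimation of the Fisher P-value in SAS, here is a rough R counterpart: `fisher.test` accepts `simulate.p.value = TRUE` for tables larger than 2 × 2. The seed and the number of replicates `B` are arbitrary choices of mine, and I am not claiming that this matches SAS's algorithm in detail.

```r
TBL <- rbind(c( 3, 12,  3, 20, 10),
             c(10, 15,  5, 10,  3))

set.seed(2024)   # arbitrary seed, only so the run can be repeated
B <- 10000       # number of Monte Carlo replicates (assumed value)

mc <- fisher.test(TBL, simulate.p.value = TRUE, B = B)
mc$p.value

# Rough Monte Carlo standard error of the simulated P-value
sqrt(mc$p.value * (1 - mc$p.value) / B)
```

The simulated value should land close to the exact P-value above, differing by an amount comparable to the Monte Carlo standard error.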

So we decide there are significant departures from independence (or homogeneity) and that this significance may be partly due to columns 1 and 5.

TB.15 = TBL[,c(1,5)];  TB.15
     [,1] [,2]
[1,]    3   10
[2,]   10    3
chisq.test(TB.15, simulate.p.value = TRUE)

        Pearson's Chi-squared test 
        with simulated p-value 
        (based on 2000 replicates)

data:  TB.15
X-squared = 7.5385, df = NA, p-value = 0.01899

Alternatively, we might use Fisher's exact test post hoc, following the Fisher test on the whole table (especially if we did not need suggestions from Pearson residuals to help decide which post hoc tests are of interest).

fisher.test(TB.15)

        Fisher's Exact Test for Count Data

data:  TB.15
p-value = 0.01693
alternative hypothesis: 
   true odds ratio is not equal to 1
95 percent confidence interval:
   0.009851234 0.720962703
sample estimates:
odds ratio 
 0.1011654 

Notes: (1) If we wanted to look at several such post hoc tests, we should use some method (such as Bonferroni's) to control the overall error rate and avoid 'false discovery'.

(2) It is best to choose either chi-squared tests (possibly with simulated P-values) or Fisher exact tests for use throughout the analysis--possibly stating a rationale for the choice. [It is not fair to run all the tests and choose to report the ones with the smaller P-values.]

(3) If your tables have a large proportion of expected counts below 5 (and especially below 3), and if SAS does not do simulated P-values for chi-squared tests on sparse tables, I would recommend Fisher exact tests. I don't know the context of your work. However, if it will be reviewed for publication or by government regulators, you may run into a preference for strict observance of the rule that expected counts should exceed 5, with few minor exceptions.
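To make Notes (1) and (2) concrete, here is a minimal sketch that runs Fisher's exact test on every pair of columns of the fake table and adjusts the P-values with `p.adjust`. The names `pairs`, `p.raw`, and `out` are my own, and testing all pairs (rather than a few chosen in advance) is an assumption about what you want to compare.

```r
TBL <- rbind(c( 3, 12,  3, 20, 10),
             c(10, 15,  5, 10,  3))

# All pairs of columns: a 2 x 10 matrix, one pair per column
pairs <- combn(ncol(TBL), 2)

# Fisher's exact test on each two-column sub-table
p.raw <- apply(pairs, 2, function(k) fisher.test(TBL[, k])$p.value)

out <- data.frame(col.a  = pairs[1, ],
                  col.b  = pairs[2, ],
                  p.raw  = p.raw,
                  p.bonf = p.adjust(p.raw, method = "bonferroni"),
                  p.holm = p.adjust(p.raw, method = "holm"))

# Smallest raw P-values first
out[order(out$p.raw), ]
```

With ten comparisons, Bonferroni multiplies each raw P-value by 10, so the raw P-value of 0.01693 for columns 1 and 5 becomes about 0.17. Holm's method is never more conservative than Bonferroni and controls the same family-wise error rate. For a 3 × 5 table the same code applies, since `fisher.test` also handles the resulting 3 × 2 sub-tables.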