Solved – 2×2 Fisher Exact Test Contingency Tables

contingency-tables fishers-exact-test r

I was comparing results for 2×2 contingency tables and noticed that Fisher's exact test in R gives the same results for two different contingency tables. For example:

cont_table_1=matrix(c(10,400,70,4000), nrow = 2)
#The contingency table:
#         [,1] [,2]
#    [1,]   10   70
#    [2,]  400 4000

fisher.test(cont_table_1)

    Fisher's Exact Test for Count Data

data:  matrix(c(10, 400, 70, 4000), nrow = 2)
p-value = 0.3236
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.6511445 2.8142941
sample estimates:
odds ratio 
1.428433 

cont_table_2=rbind(c(10,400),c(70,4000))
#The contingency table:
#         [,1] [,2]
#    [1,]   10  400
#    [2,]   70 4000

fisher.test(rbind(c(10,400),c(70,4000)))

Fisher's Exact Test for Count Data

data:  rbind(c(10, 400), c(70, 4000)) 
p-value = 0.3236
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.6511445 2.8142941
sample estimates:
odds ratio 
1.428433

My question is: why does Fisher's exact test compute exactly the same statistics when the contingency table is transposed? Will transposing the table in this way always lead to the same FET statistics?

Best Answer

Fisher's exact test is a test of the odds ratio (OR). Suppose the cells of the contingency table are labelled as follows:

$$\begin{array}{c|cc} & Y & \bar{Y} \\ \hline X & a & b \\ \bar{X} & c & d \end{array}$$

Then the odds ratio is given by $$OR = \dfrac{ad}{bc}$$
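
As a quick check of the formula, the sketch below computes the sample odds ratio for the first table in the question and compares it with the estimate that `fisher.test` reports. The names `cells` and `tab` are illustrative only. Note that `fisher.test` reports the conditional maximum likelihood estimate rather than the plain sample odds ratio, so the two values should be close but not identical.

```r
# Cell counts from the first table in the question, labelled as in the array above.
cells <- c(a = 10, b = 70, c = 400, d = 4000)

# Fill by row so the layout is: row 1 = (a, b), row 2 = (c, d).
tab <- matrix(cells, nrow = 2, byrow = TRUE)

# Sample odds ratio: ad / (bc).
sample_or <- unname(cells["a"] * cells["d"] / (cells["b"] * cells["c"]))

# Estimate reported by fisher.test (a conditional MLE, not ad / (bc)).
fisher_or <- unname(fisher.test(tab)$estimate)

print(c(sample = sample_or, fisher = fisher_or))
```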

Transposing the table swaps $b$ and $c$, so the OR becomes $ad/(cb)$, which is the same value: the measure of association is unchanged. The margin totals are also the same, except that the row totals and column totals trade places, so the set of possible tables that preserve those margins is the same as well. Hence the OR, the p-value, and the confidence interval for the OR (obtained by inverting the test) are all the same.
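
The argument can be checked numerically. The following is a minimal sketch, assuming the table from the question. It compares the hypergeometric probability of the observed table before and after transposing, and then compares the `fisher.test` results. The variable names are illustrative, and a loose tolerance is used because the estimate and interval are found by numerical root-finding.

```r
tab <- matrix(c(10, 400, 70, 4000), nrow = 2)
tab_t <- t(tab)

# Under the null, the top-left cell is hypergeometric given the margins.
# dhyper(x, m, n, k): here m and n are the two column totals and k is the
# first row total.
p_orig <- dhyper(tab[1, 1], sum(tab[, 1]), sum(tab[, 2]), sum(tab[1, ]))
p_tran <- dhyper(tab_t[1, 1], sum(tab_t[, 1]), sum(tab_t[, 2]), sum(tab_t[1, ]))

# The probability of the observed table is the same either way.
print(all.equal(p_orig, p_tran))

res <- fisher.test(tab)
res_t <- fisher.test(tab_t)

# Compare the pieces of the two results, up to a small numerical tolerance.
same_results <- c(
  p_value    = isTRUE(all.equal(res$p.value,  res_t$p.value,  tolerance = 1e-6)),
  odds_ratio = isTRUE(all.equal(res$estimate, res_t$estimate, tolerance = 1e-6)),
  conf_int   = isTRUE(all.equal(res$conf.int, res_t$conf.int, tolerance = 1e-6))
)
print(same_results)
```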

Rearranging the cell values in any way that gives the same OR and the same margin totals (possibly with rows and columns exchanged) will always give the same FET results. The options are limited: you can only swap $b$ and $c$, and/or $a$ and $d$.
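
To illustrate the restriction, here is a rough sketch that swaps $a$ and $d$ and, for contrast, swaps the two rows. The helper `sample_or` and the names `swap_ad` and `swap_rows` are hypothetical and not part of the original question. Swapping $a$ and $d$ keeps the sample OR, whereas swapping the rows turns it into its reciprocal, although the two-sided p-value should still agree.

```r
tab <- matrix(c(10, 400, 70, 4000), nrow = 2)  # a = 10, b = 70, c = 400, d = 4000

# Hypothetical helper: plain sample odds ratio ad / (bc) of a 2x2 matrix.
sample_or <- function(m) (m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1])

# Swap a and d (the diagonal cells).
swap_ad <- tab
swap_ad[1, 1] <- tab[2, 2]
swap_ad[2, 2] <- tab[1, 1]

# Swap the two rows instead; this is not one of the allowed rearrangements.
swap_rows <- tab[2:1, ]

ors <- c(original  = sample_or(tab),
         swap_ad   = sample_or(swap_ad),
         swap_rows = sample_or(swap_rows))
print(ors)

# Two-sided p-values for the three tables.
p_values <- c(original  = fisher.test(tab)$p.value,
              swap_ad   = fisher.test(swap_ad)$p.value,
              swap_rows = fisher.test(swap_rows)$p.value)
print(p_values)
```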

This underscores the strength of the OR for modeling associations in retrospective studies: the OR for exposure given disease is the same as the OR for disease given exposure. So you can sample in a case-control fashion, selecting rare outcomes and their controls and assessing exposure in a much smaller sample, and then obtain the same measure you would get in a complex, expensive longitudinal study.