Fisher's exact test: meaning of “greater” and “less”

fishers-exact-test, proportion

I'm trying to get an intuitive understanding of the one-tailed version of Fisher's exact test.

Let's say I have the voting record of Republicans (1) and Democrats (2) on gun legislation. This is the information given:

           progun antigun
      1      3       1
      2      1       3

      progun <- c(3, 1); antigun <- c(1, 3)  # columns of the table above
      fisher.test(data.frame(progun,antigun),alternative="greater")

"greater" would include the observed table above and any table that shows a stronger association between party and voting on gun legislation, such as the one below:

             progun antigun
         1      4       0
         2      0       4  

But what does it mean with regard to the association when the alternative is "less"?

      fisher.test(data.frame(progun,antigun),alternative="less")

Also, I get a p-value of 0.98 here. Why is the p-value so high?

Best Answer

The Fisher exact test conditions on the margins (they don't necessarily have to be fixed; Fisher would argue that the margins are almost ancillary, though in some situations this argument might be a bit harder to maintain). The row and column margins are both (4,4). The possible tables with those margins are:

| 0  4 |    | 1  3 |    | 2  2 |     | 3  1 |    | 4  0 |
| 4  0 |    | 3  1 |    | 2  2 |     | 1  3 |    | 0  4 |

Note that these tables are arranged from the strongest negative association on the left to the strongest positive association on the right.
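As a rough sketch, the five tables can be enumerated in R by letting the top left cell run over its possible values and filling in the rest from the margins. The names `row_totals`, `col_totals`, `possible_a` and `tables` are just illustrative, not anything from the question.

```r
# Margins of the observed table: both rows and both columns sum to 4
row_totals <- c(4, 4)
col_totals <- c(4, 4)

# Once the margins are fixed, the top left cell a determines the whole table
possible_a <- max(0, row_totals[1] - col_totals[2]):min(row_totals[1], col_totals[1])

tables <- lapply(possible_a, function(a) {
  matrix(c(a, col_totals[1] - a,                                   # first column
           row_totals[1] - a, row_totals[2] - col_totals[1] + a),  # second column
         nrow = 2)
})

tables  # five 2x2 matrices, in order of increasing top left cell
```

Each matrix in the list has the same margins; only the top left cell (and hence the direction and strength of association) changes.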

If there's no association, the distribution of any given cell (say the top left cell) is hypergeometric. The probabilities associated with each table (under the null hypothesis) are:

   0           1           2            3           4
  1/70       16/70       36/70        16/70        1/70

where the first row is the value of the top left cell of the table and the second row is the corresponding probability.
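If you want to check these numbers, a minimal sketch with R's `dhyper` reproduces them (the variable names `top_left` and `probs` are mine, chosen for illustration):

```r
# Possible values of the top left cell when both margins are (4, 4)
top_left <- 0:4

# Hypergeometric probabilities: k = 4 voters in row 1 drawn from
# m = 4 progun votes and n = 4 antigun votes
probs <- dhyper(top_left, m = 4, n = 4, k = 4)

# Express as counts out of choose(8, 4) = 70 to compare with the table above
setNames(round(probs * 70), top_left)
```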

One tailed tests either look at the left end or the right end of this distribution; a two tailed test looks at both ends.

While Fisher typically didn't give explicit alternatives and might not have made a one-tailed distinction in this situation (he might perhaps just have looked at all tables with probability equal to or lower than that of the observed table, giving $p=\frac{34}{70}\approx 0.49$ in this instance), we can quite reasonably specify a one-tailed exact test. That is, we can adopt a Neyman-Pearson framework within the setup of the Fisher exact test (condition on the margins, use hypergeometric probabilities), just as R is happy to do.
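To make the "equal-or-lower probability" rule concrete, here is a small sketch that sums the null probabilities of all tables no more likely than the observed one (top left cell 3) and compares the result with R's default two-sided `fisher.test`. The small relative tolerance is my own guard against floating-point ties between the equally likely tables; the variable names are illustrative.

```r
# Null probabilities for top left cell = 0, 1, 2, 3, 4
probs <- dhyper(0:4, m = 4, n = 4, k = 4)

# Probability of the observed table (top left cell = 3), i.e. 16/70
p_obs <- dhyper(3, m = 4, n = 4, k = 4)

# Sum over tables whose probability is at most that of the observed table
p_two_sided <- sum(probs[probs <= p_obs * (1 + 1e-7)])
p_two_sided  # (1 + 16 + 16 + 1) / 70

# R's default (two-sided) Fisher exact test on the observed table
fisher.test(matrix(c(3, 1, 1, 3), nrow = 2))$p.value
```

The two values should agree at 34/70, which is a little under one half.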

A p-value in that framework is the probability of being at least as extreme as the observed table in the direction of the alternative. If you have the table with top left element 3, i.e.:

3  1       
1  3         

then the "greater" direction is to the right and the "lesser" direction is to the left from that table.

[Figure: plot of the hypergeometric distribution whose numerical values are given above, with the values at or below 3 indicated]

The probability of this table and all tables to its left is (1+16+36+16)/70 = 69/70 = 0.9857... so that's the p-value for the "lesser" direction (less positive association).
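A short sketch of the same calculation in R, using the lower tail of the hypergeometric distribution and then `fisher.test` with `alternative = "less"` (the object names `tab` and `p_less` are just for illustration):

```r
# Observed table: rows are parties 1 and 2, columns are progun and antigun
tab <- matrix(c(3, 1, 1, 3), nrow = 2,
              dimnames = list(party = c("1", "2"),
                              vote = c("progun", "antigun")))

# P(top left cell <= 3) under the null: the observed table and all to its left
p_less <- phyper(3, m = 4, n = 4, k = 4)
p_less  # 69/70

# The one-tailed test as R computes it
fisher.test(tab, alternative = "less")$p.value
```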

It's high because you have a table showing positive association: almost all of the null distribution is at that table or further left, and only 1/70 of it is to the right. That is, the alternative is in the "wrong direction" relative to what the data indicate, much like having a z-test for a mean where, say, z=2.19 but you were doing a one-tailed test where the alternative was below the hypothesized mean, not above it.
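For comparison, here is a sketch of the p-value in the other direction. With `alternative = "greater"` the tail consists of the observed table and everything to its right, so it includes the observed table's own 16/70 as well as the 1/70 beyond it (names such as `p_greater` are illustrative):

```r
# Observed table with top left cell 3
tab <- matrix(c(3, 1, 1, 3), nrow = 2)

# P(top left cell >= 3) = P(top left cell > 2) under the null
p_greater <- phyper(2, m = 4, n = 4, k = 4, lower.tail = FALSE)
p_greater  # (16 + 1) / 70

fisher.test(tab, alternative = "greater")$p.value

# Both one-tailed p-values count the observed table, so they sum to
# 1 + P(top left cell = 3) rather than to 1
phyper(3, m = 4, n = 4, k = 4) + p_greater
```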
