Fisher’s Exact Test in R – Analyzing a 2×4 Table

Tags: categorical data, fishers-exact-test, r

I have a question on categorical data –

I have a 2×4 table and I want to test whether the numbers of males and females differ between the groups, so I entered the data into R.

So I input this:

A=c(31,7)
B=c(8,1)
C=c(39,16)
D=c(2,6)
tab=as.table(cbind(A,B,C,D))
row.names(tab)=c('males','females')
fisher.test(tab)

This is the output that I get –

Fisher's Exact Test for Count Data

data:  tab 
p-value = 0.01077
alternative hypothesis: two.sided

The only statistical information I get is the p-value – how do I know how the groups are different?

Best Answer

In a sense this is analogous to testing for differences in group means with ANOVA and then performing a post hoc test, such as Tukey's HSD, to tell which groups actually differ. However, there is no equivalent standard post hoc test for Fisher's test.

The only "post hoc" thing that comes to mind is to run all pairwise comparisons for the table, and correct the p-values accordingly with, e.g., the Bonferroni method.

For a chi-square test, you could check the residuals or simply the differences between expected and observed counts. In addition, going through the percentages of observations in each group would probably answer your question at least partly, and this can be used with either Fisher's or the chi-square test.

In R these can be done as follows:

# Percentages for rows and columns
# There is a higher proportion of females than males in group D
prop.table(tab, 1) # rows
prop.table(tab, 2) # columns

# Chi square residuals
# The largest residuals are in the group D
chisq.test(tab)$residuals

# Chi square expected-observed
chisq.test(tab)$expected-chisq.test(tab)$observed

# Chi square "post hoc" test
# For Fisher you need to do this by hand
library(NCStats) # from rforge.net; package names are case-sensitive
chisqPostHoc(chisq.test(t(tab))) # for A-D
chisqPostHoc(chisq.test(tab)) # for gender
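
The pairwise comparisons with a Bonferroni correction mentioned above could be done by hand along the following lines. This is only a sketch using base R: the object names `group_pairs`, `p_raw` and `p_adj` are made up for illustration, and the table is rebuilt from the counts in the question so the snippet runs on its own.

```r
# Counts from the question: rows are sex, columns are groups A-D
tab <- as.table(cbind(A = c(31, 7), B = c(8, 1), C = c(39, 16), D = c(2, 6)))
rownames(tab) <- c("males", "females")

# All unordered pairs of groups (6 pairs for 4 groups)
group_pairs <- combn(colnames(tab), 2)

# Fisher's exact test on each 2x2 sub-table
p_raw <- apply(group_pairs, 2, function(g) fisher.test(tab[, g])$p.value)
names(p_raw) <- apply(group_pairs, 2, paste, collapse = " vs ")

# Bonferroni correction for the number of comparisons
p_adj <- p.adjust(p_raw, method = "bonferroni")
round(cbind(raw = p_raw, bonferroni = p_adj), 4)
```

Bonferroni is conservative; `p.adjust` also offers other methods such as `"holm"` if a less strict correction is preferred.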

Related Solutions

Solved – Fisher’s exact test in 3×2 contingency table

It sounds like you are asking a lot of different questions here.

My question is: how should I interpret the p value? I don't understand what is that referred to.

The null hypothesis for Fisher's Exact test is that the groups do not affect the outcome, i.e. that they are independent. Rejection of the null hypothesis indicates the outcome (a, b, or c) is dependent on group.

fisher.test(matrix(c(2, 12, 1, 5, 3, 1), 
            nrow=2, ncol=3, byrow=TRUE))

Fisher's Exact Test for Count Data

data:  dta
p-value = 0.05082
alternative hypothesis: two.sided

In this case your $p$ value is approximately 0.05082. I will let you decide whether to reject the null.
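
To make the null hypothesis more concrete, the counts expected under independence can be computed from the table margins and compared with the observed counts. The sketch below assumes the same 2×3 matrix as in the call above; the names `dta` and `expected` are only illustrative.

```r
# Observed counts: 2 groups (rows) by 3 outcomes (columns)
dta <- matrix(c(2, 12, 1, 5, 3, 1), nrow = 2, ncol = 3, byrow = TRUE)

# Expected count under independence: row total * column total / grand total
expected <- outer(rowSums(dta), colSums(dta)) / sum(dta)
round(expected, 3)

# Observed minus expected: large differences are what drive a small p-value
round(dta - expected, 3)
```

The expected table has the same margins as the observed one; the test asks how unusual the observed cell counts are given those margins.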

Having the p value, how can I say that one of the three forms is statistically significant more represented than the others (if true)?

This is a separate question and I'm not sure what you are trying to ask.

Solved – Fisher’s exact test

The massive 58 amid much lower frequencies signals that any test is just quantifying a major failure of independence. I did this in Stata. The command ret li (short for return list) obliges Stata to show results as exactly as it knows them, but both tests yield P-values that are 0.000 to 3 d.p. It is right to be a little cautious about low expected values (for row 1 here in particular) but the test results are overwhelming.

. tabi 0  2 \ 5 58 \ 4 3 \ 4 3 

            |          col
        row |         1          2 |     Total
 -----------+----------------------+----------
          1 |         0          2 |         2 
          2 |         5         58 |        63 
          3 |         4          3 |         7 
          4 |         4          3 |         7 
 -----------+----------------------+----------
      Total |        13         66 |        79 

      Pearson chi2(3) =  20.5779   Pr = 0.000

. ret li 

scalars:
              r(p) =  .0001288081813192
           r(chi2) =  20.57794057794058
              r(c) =  2
              r(r) =  4
              r(N) =  79

. tabi 0  2 \ 5 58 \ 4 3 \ 4 3 , exact

Enumerating sample-space combinations:
stage 4:  enumerations = 1
stage 3:  enumerations = 3
stage 2:  enumerations = 17
stage 1:  enumerations = 0

             |          col
         row |         1          2 |     Total
  -----------+----------------------+----------
           1 |         0          2 |         2 
           2 |         5         58 |        63 
           3 |         4          3 |         7 
           4 |         4          3 |         7 
  -----------+----------------------+----------
       Total |        13         66 |        79 

       Fisher's exact =                 0.000

. ret li 

scalars:
        r(p_exact) =  .0003124258226793
              r(c) =  2
              r(r) =  4
              r(N) =  79
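
The answer above used Stata. For readers working in R, as in the main question, a rough cross-check could look like the sketch below. It assumes base R's `chisq.test` and `fisher.test` are acceptable stand-ins for Stata's `tabi` with and without `exact`; the names `m`, `chi` and `fis` are made up for illustration.

```r
# Same 4x2 table as in the Stata session
m <- matrix(c(0, 2, 5, 58, 4, 3, 4, 3), ncol = 2, byrow = TRUE)

# Pearson chi-square; R warns about small expected counts, silenced here
chi <- suppressWarnings(chisq.test(m))
chi$statistic
chi$p.value

# The expected counts show why caution is warranted, especially for row 1
round(chi$expected, 2)

# Fisher's exact test for the full r x c table
fis <- fisher.test(m)
fis$p.value
```

Both p-values should come out far below conventional thresholds, in line with the Stata results.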
