Solved – fisher.test() and Chi-square bug in R

Tags: fishers-exact-test, r

I'm getting this error when conducting fisher.test() in R:

freqTable = 
    1  2  3  4  5  6  7
 1 14  0  2  0  0  0  0
 2  0  0  0  9  0  0  0
 3  0  6  0  0  0  0  0
 4  0  0  0  0  6  0  0
 5  0  0  0  0  0 10  0
 6  0  0 10  0  0  0  0
 7  2  4  9  1  0  0 30

> fisher.test(freqTable, workspace=2e+07,hybrid=TRUE)$p.value
Error in fisher.test(freqTable, workspace = 2e+07, hybrid = TRUE) : 
  FEXACT error 30.
Stack length exceeded in f3xact.
This problem should not occur.

Chi-square is also problematic:

chisq.test(freqTable)

Pearson's Chi-squared test

data:  freqTable
X-squared = 466.81, df = 36, p-value < 2.2e-16

Warning message:
In chisq.test(freqTable) : Chi-squared approximation may be incorrect
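chisq.test() raises this warning when any expected cell count is below 5. A minimal sketch to check that for this table, assuming it is entered as a 7 x 7 matrix (the name expected is only illustrative and not part of the original post):

```r
# The table from the question, entered row by row.
freqTable <- matrix(
  c(14, 0,  2, 0, 0,  0,  0,
     0, 0,  0, 9, 0,  0,  0,
     0, 6,  0, 0, 0,  0,  0,
     0, 0,  0, 0, 6,  0,  0,
     0, 0,  0, 0, 0, 10,  0,
     0, 0, 10, 0, 0,  0,  0,
     2, 4,  9, 1, 0,  0, 30),
  nrow = 7, byrow = TRUE)

# Expected counts under independence: (row total * column total) / grand total.
expected <- outer(rowSums(freqTable), colSums(freqTable)) / sum(freqTable)

# Number of cells whose expected count is below 5, and the overall range.
sum(expected < 5)
range(expected)
```

With only 103 observations spread over 49 cells, most expected counts are small, so the chi-squared approximation is doubtful here even though the statistic itself is large.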

Looks like there's an open bug report: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=1662

I'm wondering if anyone has found an appropriate workaround for these cases?

Thanks

Best Answer

I have now addressed this problem (and similar ones) in fisher.test() and fixed the bug (in "R-devel" only; see the bugs.r-project.org link above). Increasing 'workspace' now does help in fisher.test(), but this example takes a long time if you run the exact test (the default, hybrid=FALSE):

On a machine with enough memory (4 GB free is enough):

system.time( ## -> takes much more time: 1 h 47 min !!!
ftL <- fisher.test(freqTable, workspace = 6e8))
##     user   system  elapsed -- ada-16, 2017-09-07
## 6422.367    4.611 6438.832  == 1 h 47 min
ftL
##  Fisher's Exact Test for Count Data
##
## data:  freqTable
## p-value < 2.2e-16
## alternative hypothesis: two.sided
dput(ftL$p.value)
## 2.65517303157606e-49

I have not had the patience to also try the hybrid=TRUE case; it may take less than the 1 h 47 min of the exact case.

Statistically, the other answers are of course sufficient, and indeed, in many such cases, using simulate.p.value = TRUE is the more realistic choice.
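A minimal sketch of the Monte Carlo route, assuming the table from the question is entered as a 7 x 7 matrix named freqTable; the seed and B = 1e5 are arbitrary choices for illustration, not values from the original post:

```r
# The table from the question, entered row by row.
freqTable <- matrix(
  c(14, 0,  2, 0, 0,  0,  0,
     0, 0,  0, 9, 0,  0,  0,
     0, 6,  0, 0, 0,  0,  0,
     0, 0,  0, 0, 6,  0,  0,
     0, 0,  0, 0, 0, 10,  0,
     0, 0, 10, 0, 0,  0,  0,
     2, 4,  9, 1, 0,  0, 30),
  nrow = 7, byrow = TRUE)

set.seed(1)  # only to make the simulation reproducible

# Fisher's test with a simulated p-value: no FEXACT workspace is needed.
ftSim <- fisher.test(freqTable, simulate.p.value = TRUE, B = 1e5)
ftSim$p.value

# The same idea for the chi-squared statistic; this does not rely on
# the large-sample approximation, so the warning does not apply.
csSim <- chisq.test(freqTable, simulate.p.value = TRUE, B = 1e5)
csSim$p.value
```

A simulated p-value can never be smaller than 1/(B + 1), so for a table this extreme the result only says "below roughly 1e-5" rather than reproducing the exact 2.66e-49. For most practical purposes that is enough to reject independence.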