Solved – Is a Fisher’s exact test on a 7×9 contingency table feasible?

contingency-tables, fishers-exact-test, spss

I have a question about whether SPSS is capable of running a Fisher's exact test on large, sparse R x C contingency tables. I would like to test whether an association exists between my rows (diagnostic groups) and columns (laboratory tests) (see below).

Currently, I have a contingency table of 7 rows and 9 columns that includes the data of 164 patients. The rows are diagnostic groups (e.g. different neurological diseases grouped together in a 'neurological disease' diagnostic group), while the columns are the numbers of patients who have undergone a certain laboratory test. However, several diagnostic groups have zero patients who underwent a certain test, and only 11.11% of the cells contain more than 5 patients. As such, I cannot use the chi-square test. Many suggest combining rows and columns to get around this problem. However, I would like to avoid this here, since I am investigating whether certain tests are associated with certain diagnostic groups, and combining rows/columns would defeat the purpose of my research question.

Therefore, I am looking for other ways to test for an association. One option would be a Fisher's exact test for R x C tables, since it does not rely on the large-sample assumptions of the chi-square test. I already know how to run a Fisher's exact test in SPSS for R x C tables. However, is it feasible to run a Fisher's exact test on a table as large as 7×9, or should I use another statistical test (and if so, which do you recommend)?

Thanks in advance.

Best Answer

Transcribing the comments into an answer.

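There should be an argument (in R's fisher.test it is simulate.p.value) to calculate an approximate p-value by Monte Carlo simulation, intended specifically for tables larger than 2 x 2. This is not a different test: it is still Fisher's test, but the p-value is estimated from randomly sampled tables with the same margins instead of being computed by enumerating every possible table.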

Here is an extract from the R documentation for fisher.test:

"In the r x c case with r > 2 or c > 2, internal tables can get too large for the exact test in which case an error is signalled. Apart from increasing workspace sufficiently, which then may lead to very long running times, using simulate.p.value = TRUE may then often be sufficient and hence advisable."
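To make the Monte Carlo idea concrete, here is a minimal Python sketch of how such a simulated p-value can be obtained. It is not R's or SPSS's actual implementation; the function names (`fisher_statistic`, `monte_carlo_fisher`), the default number of simulations and the example table are all assumptions made for illustration. The sketch shuffles the column labels of the individual patients, which keeps both margins fixed, and counts how often a random table is as probable as, or less probable than, the observed one.

```python
import math
import random


def fisher_statistic(table):
    """Log-probability of a table given its margins, up to an additive constant.

    With fixed margins, P(table) is proportional to 1 / prod(n_ij!),
    so -sum(log(n_ij!)) ranks tables by their probability.
    """
    return -sum(math.lgamma(n + 1) for row in table for n in row)


def monte_carlo_fisher(table, n_sim=2000, seed=0):
    """Approximate Fisher exact p-value for an r x c table by simulation."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])

    # Expand the table into one (row label, column label) pair per patient.
    row_labels, col_labels = [], []
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            row_labels.extend([i] * n)
            col_labels.extend([j] * n)

    observed = fisher_statistic(table)
    hits = 0
    for _ in range(n_sim):
        # Shuffling the column labels preserves row and column totals.
        rng.shuffle(col_labels)
        sim = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(row_labels, col_labels):
            sim[i][j] += 1
        # Count simulated tables no more probable than the observed one
        # (small tolerance guards against floating-point ties).
        if fisher_statistic(sim) <= observed + 1e-7:
            hits += 1

    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)


# Hypothetical 3 x 4 table (made-up counts, not the data from the question).
example = [
    [4, 0, 1, 2],
    [0, 5, 0, 1],
    [1, 1, 6, 0],
]
print(monte_carlo_fisher(example))
```

Because the cost grows with the number of simulations rather than with the number of possible tables, the same approach remains practical for a sparse 7×9 table. The estimate varies slightly with the random seed, so a larger `n_sim` gives a more stable p-value.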