Given that software can do the Fisher's exact test calculation so easily nowadays, is there any circumstance where, theoretically or practically, the chi-squared test is actually preferable to Fisher's exact test?
Advantages of the Fisher's exact test include:
- it scales to contingency tables larger than 2×2 (i.e., any r × c table)
- it gives an exact p-value
- it does not require a minimum expected cell count to be valid
Best Answer
You can turn the question around. Since the ordinary Pearson $\chi^2$ test is almost always more accurate than Fisher's exact test and is much quicker to compute, why does anyone use Fisher's test?
Note that it is a fallacy that the expected cell frequencies have to exceed 5 for Pearson's $\chi^2$ to yield accurate $P$-values. The test is accurate as long as expected cell frequencies exceed 1.0 if a very simple $\frac{N-1}{N}$ correction is applied to the test statistic.
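As a rough illustration of that correction for a 2×2 table, here is a minimal sketch in plain Python. The function name `n_minus_1_chi2` and the example counts are made up for illustration. The 1-df chi-squared tail probability is computed through the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$, so no statistics library is assumed.

```python
import math

def n_minus_1_chi2(a, b, c, d):
    """'N-1' chi-squared test for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration; returns (statistic, p_value).
    """
    n = a + b + c + d
    r1, r2 = a + b, c + d          # row totals
    c1, c2 = a + c, b + d          # column totals
    # K. Pearson's statistic for a 2x2 table, with no continuity correction
    pearson = n * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)
    # Multiply by (N - 1) / N to get the 'N-1' version
    stat = pearson * (n - 1) / n
    # Upper tail of a chi-squared distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Made-up counts; the smallest expected cell count here is 10 * 9 / 20 = 4.5
stat, p_value = n_minus_1_chi2(3, 7, 8, 2)
print(f"N-1 chi-squared = {stat:.4f}, p = {p_value:.4f}")
```

The sketch does not check the "expected frequencies exceed 1" condition; a real implementation would want to verify it before trusting the result.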
From R-help, 2009:
...latest edition of Armitage's book recommends that continuity adjustments never be used for contingency table chi-square tests;
E. Pearson modification of Pearson chi-square test, differing from the original by a factor of (N-1)/N;
Cochran noted that the number 5 in "expected frequency less than 5" was arbitrary;
findings of published studies may be summarized as follows, for comparative trials:
Yates' chi-squared test has type I error rates less than the nominal, often less than half the nominal;
The Fisher-Irwin test has type I error rates less than the nominal;
K. Pearson's version of the chi-squared test has type I error rates closer to the nominal than Yates' chi-squared test and the Fisher-Irwin test, but in some situations gives type I errors appreciably larger than the nominal value;
The 'N-1' chi-squared test behaves like K. Pearson's 'N' version, but the tendency for higher than nominal values is reduced;
The two-sided Fisher-Irwin test using Irwin's rule is less conservative than the method doubling the one-sided probability;
The mid-P Fisher-Irwin test by doubling the one-sided probability performs better than standard versions of the Fisher-Irwin test, and the mid-P method by Irwin's rule performs better still in having actual type I errors closer to nominal levels;
strong support for the 'N-1' test provided expected frequencies exceed 1;
flaw in Fisher test which was based on Fisher's premise that marginal totals carry no useful information;
demonstration of their useful information in very small sample sizes;
Yates' continuity adjustment of N/2 is a large over-correction and is inappropriate;
counter arguments exist to the use of randomization tests in randomized trials;
calculations of worst cases;
overall recommendation: use the 'N-1' chi-square test when all expected frequencies are at least 1; otherwise use the Fisher-Irwin test using Irwin's rule for two-sided tests, taking tables from either tail as likely, or less, as that observed; see letter to the editor by Antonio Andres and author's reply in 27:1791-1796; 2008.
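For readers who want to see what "Irwin's rule" amounts to in practice, the following is a minimal sketch for a 2×2 table using only the Python standard library. The function name `fisher_irwin` and the example counts are hypothetical. The mid-P variant shown here simply counts half the probability of the observed table, which is one common convention and is an assumption on my part rather than something spelled out in the notes above.

```python
import math

def fisher_irwin(a, b, c, d, mid_p=False):
    """Two-sided Fisher-Irwin test for [[a, b], [c, d]] by Irwin's rule.

    Hypothetical helper: sums the probabilities of all tables with the
    same margins that are as likely as, or less likely than, the one
    observed. With mid_p=True the observed table counts only half
    (an assumed convention for the mid-P adjustment).
    """
    r1, r2, c1 = a + b, c + d, a + c
    lo, hi = max(0, c1 - r2), min(r1, c1)   # feasible top-left cell values
    # Unnormalised hypergeometric weights, kept as exact integers so that
    # ties between equally likely tables are detected reliably
    weights = {x: math.comb(r1, x) * math.comb(r2, c1 - x)
               for x in range(lo, hi + 1)}
    total = math.comb(r1 + r2, c1)
    observed = weights[a]
    tail = sum(w for w in weights.values() if w <= observed)
    if mid_p:
        return (tail - observed / 2) / total
    return tail / total

# Made-up counts for illustration
print("Irwin's rule p:", fisher_irwin(3, 7, 8, 2))
print("mid-P variant: ", fisher_irwin(3, 7, 8, 2, mid_p=True))
```

Working with integer weights instead of floating-point probabilities avoids the tolerance fiddling that is otherwise needed to decide whether a table in the opposite tail is "as likely" as the observed one.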
...first paper to truly quantify the conservativeness of Fisher's test;
"the test size of FET was less than 0.035 for nearly all sample sizes before 50 and did not approach 0.05 even for sample sizes over 100.";
conservativeness of "exact" methods;
see Stat in Med 28:173-179, 2009 for a criticism which was unanswered
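The conservativeness can be checked directly by enumeration. The sketch below is not the cited paper's setup: it assumes two groups of equal size `n` with a common success probability `p` (both my choices), and the names `fisher_two_sided` and `actual_size` are hypothetical. It adds up the probability of every pair of outcomes for which Fisher's test (two-sided, Irwin's rule) rejects at level `alpha`.

```python
import math

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value (Irwin's rule) for [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Exact integer hypergeometric weights for every table with these margins
    weights = {x: math.comb(r1, x) * math.comb(r2, c1 - x)
               for x in range(lo, hi + 1)}
    tail = sum(w for w in weights.values() if w <= weights[a])
    return tail / math.comb(r1 + r2, c1)

def actual_size(n, p, alpha=0.05):
    """Rejection probability under H0 when both groups are Binomial(n, p)."""
    pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    size = 0.0
    for x in range(n + 1):          # successes in group 1
        for y in range(n + 1):      # successes in group 2
            if fisher_two_sided(x, n - x, y, n - y) <= alpha:
                size += pmf[x] * pmf[y]
    return size

# Illustrative settings only; compare the printed values with alpha = 0.05
for n in (10, 20, 40):
    print(n, round(actual_size(n, 0.5), 4))
```

Because the test holds its level conditionally on every set of margins, the unconditional size computed this way cannot exceed `alpha`; the question the enumeration answers is how far below `alpha` it falls.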
...Fisher's exact test should never be used unless the mid-$P$ correction is applied;
value of unconditional tests;
see letter to the editor 30:890-891;2011