You can turn the question around. Since the ordinary Pearson $\chi^2$ test is almost always more accurate than Fisher's exact test and is much quicker to compute, why does anyone use Fisher's test?
Note that it is a fallacy that the expected cell frequencies have to exceed 5 for Pearson's $\chi^2$ to yield accurate $P$-values. The test is accurate as long as expected cell frequencies exceed 1.0 if a very simple $\frac{N-1}{N}$ correction is applied to the test statistic.
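The $\frac{N-1}{N}$ correction is just a rescaling of the usual statistic. A minimal pure-Python sketch for a $2\times 2$ table (the function name and interface are mine, not a standard API):

```python
import math

def pearson_chi2_2x2(a, b, c, d, n_minus_1=True):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]].

    With n_minus_1=True the statistic is rescaled by (N-1)/N, the
    'E. Pearson' correction recommended by Campbell (2007).
    """
    n = a + b + c + d
    # closed-form Pearson statistic for a 2x2 table
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    if n_minus_1:
        chi2 *= (n - 1) / n
    # upper-tail p-value for chi-square with 1 df: P(X > x) = erfc(sqrt(x/2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p
```

For the table $[[3,7],[6,4]]$ this gives a statistic of about 1.73 rather than the uncorrected 1.82, with a correspondingly slightly larger $P$-value.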
From R-help, 2009:
Campbell, I. Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations. Statistics in Medicine 2007; 26:3661-3675. (abstract)
...latest edition of Armitage's book recommends that continuity adjustments never be used for contingency table chi-square tests;
E. Pearson modification of Pearson chi-square test, differing from the original by a factor of (N-1)/N;
Cochran noted that the number 5 in "expected frequency less than 5" was arbitrary;
findings of published studies may be summarized as follows, for comparative trials:
Yates' chi-squared test has type I error rates less than the nominal, often less than half the nominal;
The Fisher-Irwin test has type I error rates less than the nominal;
K. Pearson's version of the chi-squared test has type I error rates closer to the nominal than Yates' chi-squared test and the Fisher-Irwin test, but in some situations gives type I errors appreciably larger than the nominal value;
The 'N-1' chi-squared test behaves like K. Pearson's 'N' version, but the tendency for higher than nominal values is reduced;
The two-sided Fisher-Irwin test using Irwin's rule is less conservative than the method doubling the one-sided probability;
The mid-P Fisher-Irwin test by doubling the one-sided probability performs better than standard versions of the Fisher-Irwin test, and the mid-P method by Irwin's rule performs better still in having actual type I errors closer to nominal levels;
strong support for the 'N-1' test provided expected frequencies exceed 1;
flaw in Fisher test which was based on Fisher's premise that marginal totals carry no useful information;
demonstration of their useful information in very small sample sizes;
Yates' continuity adjustment of N/2 is a large over-correction and is inappropriate;
counter arguments exist to the use of randomization tests in randomized trials;
calculations of worst cases;
overall recommendation: use the 'N-1' chi-square test when all expected frequencies are at least 1; otherwise use the Fisher-Irwin test using Irwin's rule for two-sided tests, taking tables from either tail as likely, or less, as that observed; see letter to the editor by Antonio Andres and author's reply in 27:1791-1796; 2008.
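The two-sided mid-P variant built from the "double the one-sided mid-P" rule mentioned above can be sketched in pure Python (the function name and structure are illustrative, not taken from Campbell's paper):

```python
from math import comb

def fisher_mid_p_two_sided(a, b, c, d):
    """Two-sided mid-P Fisher-Irwin test for the table [[a, b], [c, d]],
    using the 'double the smaller one-sided mid-P' rule. Illustrative sketch.
    """
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    denom = comb(n, c1)

    def pmf(k):
        # hypergeometric P(X = k) for the (1,1) cell, given fixed margins
        return comb(r1, k) * comb(r2, c1 - k) / denom

    lo, hi = max(0, c1 - r2), min(r1, c1)
    # each one-sided mid-P: half the probability of the observed table
    # plus the full probability of the tail beyond it
    right = sum(pmf(k) for k in range(a + 1, hi + 1)) + 0.5 * pmf(a)
    left = sum(pmf(k) for k in range(lo, a)) + 0.5 * pmf(a)
    return min(1.0, 2 * min(left, right))
```

Counting the observed table at half weight is exactly what distinguishes mid-P from the standard (conservative) exact test.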
Crans GG, Shuster JJ. How conservative is Fisher's exact test? A quantitative evaluation of the two-sample comparative binomial trial. Statistics in Medicine 2008; 27:3598-3611. (abstract)
...first paper to truly quantify the conservativeness of Fisher's test;
"the test size of FET was less than 0.035 for nearly all sample sizes before 50 and did not approach 0.05 even for sample sizes over 100.";
conservativeness of "exact" methods;
see Stat in Med 28:173-179, 2009 for a criticism which was unanswered
Lydersen S, Fagerland MW, Laake P. Recommended tests for association in $2\times 2$ tables. Statistics in Medicine 2009; 28:1159-1175. (abstract)
...Fisher's exact test should never be used unless the mid-$P$ correction is applied;
value of unconditional tests;
see letter to the editor 30:890-891;2011
Pearson's $\chi^2$ test is useful for a sample of $n$ observations cross-classified by two variables, say $A$ and $B$. It tests the null hypothesis that $A$ and $B$ are independent. So, for example, if you crossed two strains of D. melanogaster (fruit flies) with different mutations and observed the $F_2$ generation frequencies in $n$ progeny, the $\chi^2$ test assesses linkage of the two traits (i.e., are they on different chromosomes [the null] or on the same chromosome [i.e., linked, the alternative]).
McNemar's test is used for paired data -- that is, each observation represents a pair of values. For example, consider a set of $n$ lung cancer patients, each with a spouse. You record the smoking habits of the patients and their spouses and cross-classify. Pearson's test would appear to have $2n$ observations, but in this case you only have $n$; McNemar's test makes this correction. The hypothesis tested is similar: "Is cancer status related to smoking status?"
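A minimal sketch of the (uncorrected) McNemar statistic, assuming you have already tallied the two discordant-pair counts $b$ and $c$ (the function name is mine):

```python
import math

def mcnemar_chi2(b, c):
    """McNemar's chi-square for paired 2x2 data, where b and c are the
    discordant-pair counts (e.g. patient smokes / spouse doesn't, and
    vice versa). Uncorrected statistic; p-value from chi-square with 1 df.
    """
    chi2 = (b - c) ** 2 / (b + c)
    # upper-tail p-value for chi-square with 1 df: P(X > x) = erfc(sqrt(x/2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p
```

Note that only the discordant pairs enter the statistic; the concordant pairs carry no information about a difference between the paired proportions.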
I suppose that one could think of this as a "between subjects" vs "within subjects" difference, and there is no doubt that things are similar. I don't see them that way, but I'll confess to not having thought about it much.
In regard to your Question 2, the restriction is on expected cell counts, not observed cell counts. Observed counts are reality, while expected cell counts represent a model. You can think of the restrictions as helping to ensure a decent approximation under the null hypothesis. Reality can (and should) diverge from the model when necessary, but if the model is approximately correct, it would be bad to have a situation where discrepancies are inflated in small cells.
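The expected counts in question are just products of the margins divided by $N$, i.e. $E_{ij} = (\text{row}_i\ \text{total})(\text{col}_j\ \text{total})/N$. A tiny sketch (illustrative helper, not a library function):

```python
def expected_counts(table):
    """Expected cell counts under independence for an r x c table of counts:
    E_ij = (row i total) * (column j total) / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]
```

For the table `[[3, 7], [6, 4]]` the expected counts are `[[4.5, 5.5], [4.5, 5.5]]`, which is the "model" the observed counts are compared against.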
Finally, an exact test is precisely what it says it is. The distribution of the test statistic under the null hypothesis is known exactly. Pearson's $\chi^2$, McNemar's test, and the log-likelihood $\chi^2$ are all based on asymptotic approximations to the distribution of the test statistic under the null hypothesis. Fisher's test, by comparison, notes that conditional on the marginal totals, the count in any one cell (which determines the rest of the table) follows a hypergeometric distribution. This insight permits computation of an exact observed significance level ($p$-value) for any given number of observations in the $1,1$ cell.
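That computation for the one-sided case is just a hypergeometric tail sum at or beyond the observed $1,1$ cell (pure Python; the function is an illustrative sketch, and as written it sums the upper tail only):

```python
from math import comb

def fisher_one_sided_p(a, b, c, d):
    """One-sided Fisher exact p-value for the table [[a, b], [c, d]]:
    the hypergeometric probability, given the margins, that the (1,1)
    cell is at least as large as the observed a. Upper tail only."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    hi = min(r1, c1)  # largest value the (1,1) cell can take
    tail = sum(comb(r1, k) * comb(r2, c1 - k) for k in range(a, hi + 1))
    return tail / comb(n, c1)
```

Because everything is computed from exact binomial coefficients, no large-sample approximation is involved, which is what makes the test "exact".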
Fisher's exact test tests the same null hypothesis as Pearson's $\chi^2$ and can be used whenever Pearson's test is appropriate, as well as in situations where Pearson's approximation is believed to be unreliable. Pearson's test also makes use of the information in the marginal totals, and so is also conditional on those totals. Knowing the margins a priori (or even one margin) is unnecessary.
SAS can do Fisher's exact test for $r \times c$ tables: "Fisher's exact test was extended to general tables by Freeman and Halton (1951), and this test is also known as the Freeman-Halton test" (SAS help reference). It can take time to run, however. Wikipedia also mentions this (wiki fisher exact). If your variable has ordered categories, then consider the CMH option instead (CMH in PROC FREQ).