Solved – How does fisher.test calculate the confidence interval for the odds ratio in R

Tags: confidence interval, fishers-exact-test, odds-ratio, r

The fisher.test function in base R by default returns a confidence interval for the odds ratio in a 2×2 contingency table. For example:

> x <- c(100, 5, 70, 12)
> dim(x) <- c(2,2)
> fisher.test(x)

    Fisher's Exact Test for Count Data

data:  x
p-value = 0.02291
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  1.058526 12.904604
sample estimates:
odds ratio 
  3.406113 
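For reference, the same numbers can be pulled out of the returned object rather than read off the printout. This is a small sketch using the table above; `ft` and `sample_or` are names chosen here for illustration. As far as I understand the help page, the reported estimate is the conditional maximum likelihood estimate, which is why it differs slightly from the plain cross-product ratio.

```r
# Same 2x2 table as above, filled column by column
x <- matrix(c(100, 5, 70, 12), nrow = 2)

ft <- fisher.test(x)

ft$conf.int   # lower and upper limits; the attribute "conf.level" holds 0.95
ft$estimate   # odds ratio reported by fisher.test (conditional MLE)

# Plain sample odds ratio for comparison: 100 * 12 / (70 * 5), about 3.43
sample_or <- (x[1, 1] * x[2, 2]) / (x[1, 2] * x[2, 1])
sample_or
```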

The confidence interval of an odds ratio is an extremely useful thing to know, and I would like to refer to it in an article I am currently writing. My dataset has high enough n for a chi-square test, but the latter would only give me the test statistic and a p-value, which are harder to interpret than the confidence interval of an odds ratio. However, I cannot find any explanation of how the confidence interval is being calculated in this case, nor of what the theoretical precedent might be for calculating confidence intervals of odds ratios as part of a Fisher test (as opposed to a logistic regression).

Can anyone shed some light?

Best Answer

The R help page for fisher.test cites Fisher's 1962 note "Confidence limits for a cross-product ratio" in the Australian Journal of Statistics.

In it he notes, by example:

If the observations in a $2 \times 2$ table are distinctly out of proportion (and indeed in other cases also) we may wish to set limits to the true product ratio, e.g. the observed table

$$ \begin{array}{cc} 10 & 3 \\ 2 & 15 \end{array}$$

gives a crude ratio of 25. How small could the true ratio be in reasonable consistency with the data? If the expectation in the four classes were

$$ \begin{array}{cc} 10-x & 3+x \\ 2+x & 15-x \end{array}$$

the true ratio would be $\frac{(10-x)(15-x)}{(3+x)(2+x)}$, and $\chi^2$ for the observations would be:

$$\chi^2 = x^2 \left( \frac{1}{10-x} + \frac{1}{3+x} + \frac{1}{2+x} + \frac{1}{15-x} \right)$$

so if $x$ were 3.0, $$\chi^2 = 3^2 (0.59286) = 5.3357$$ with one degree of freedom.
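The arithmetic in this step is easy to check in R. The sketch below simply codes the two expressions from the quoted passage; the function names `true_ratio` and `chisq_shift` are made up here and are not part of any package.

```r
# Cross-product ratio implied by shifting the expectations by x
true_ratio <- function(x) (10 - x) * (15 - x) / ((3 + x) * (2 + x))

# Chi-square of the observed table against those shifted expectations
chisq_shift <- function(x) {
  x^2 * (1 / (10 - x) + 1 / (3 + x) + 1 / (2 + x) + 1 / (15 - x))
}

true_ratio(0)    # the crude ratio of the observed table, 25
chisq_shift(3)   # about 5.3357, as in the quoted text

# Upper-tail probability on one degree of freedom
pchisq(chisq_shift(3), df = 1, lower.tail = FALSE)
```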

The exact probability of such a small sample of 30 giving 10 or more in the first quadrant is the partial sum of a hypergeometric series, and not easy to calculate; for if $\xi$ stand for the theoretical product ratio, the frequencies of 0 to 12 in the quadrant will be proportional to the terms:

$$ 1,\ \frac{13 \times 12}{1\times 6}\xi,\ \frac{13\times 12 \times 12 \times 11}{1 \times 2 \times 6 \times 7}\xi^2,\ \ldots,\ \frac{13!\,12!\,5!}{r!\,(13-r)!\,(12-r)!\,(5+r)!}\xi^r,\ \ldots$$

It would not be too difficult, as in the exact test for disproportionality, to calculate the last three terms for any chosen value of $\xi$, but for the ratio of these to the whole we would require the sum of the entire series or $$F(-13, -12, 6, \xi)$$ which would be best obtained by calculating all the terms and summing them, a process too lengthy to be recommended.
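What was "too lengthy" by hand takes a few lines in R. The sketch below builds the thirteen terms of the series for a chosen $\xi$ and takes the share contributed by the last three (10, 11 and 12 in the first cell). It assumes each coefficient carries $r!$ in its denominator, which is what the explicitly written first terms imply; `fisher_terms` and `tail_prob` are illustrative names.

```r
# Terms of the series for first-cell counts r = 0..12, relative to the r = 0 term
fisher_terms <- function(xi) {
  r <- 0:12
  coef <- factorial(13) * factorial(12) * factorial(5) /
    (factorial(r) * factorial(13 - r) * factorial(12 - r) * factorial(5 + r))
  coef * xi^r
}

# Probability of 10 or more in the first cell when the true ratio is xi
tail_prob <- function(xi) {
  terms <- fisher_terms(xi)
  sum(terms[11:13]) / sum(terms)   # elements 11..13 correspond to r = 10, 11, 12
}

fisher_terms(1)[2:3]   # 13*12/6 = 26 and 13*12*12*11/(1*2*6*7)
tail_prob(1)           # with xi = 1 this is the ordinary hypergeometric tail
phyper(9, 13, 17, 12, lower.tail = FALSE)
```

With $\xi = 1$ the result should agree with `phyper`, which is a convenient check that the coefficients are right.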

Using Yates' adjustment, however, we can at once find: $$\chi^2_c = (2.5)^2 (0.59286) = 3.7054.$$

Further taking $x=3.1$ we have

$$ \chi^2_c = (2.6)^2(0.58717) = 3.9693$$

Interpolating for the tabular entry 3.841 it appears that $x=3.0501$ and the cross product ratio is 2.718.
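The interpolation can be replaced by a numerical root search. This is a rough sketch of Fisher's approximation only, not of what fisher.test does: it solves for the $x$ at which the Yates-corrected $\chi^2$ equals the tabular 3.841 (`qchisq(0.95, 1)`) and converts that $x$ to a cross-product ratio. The names `chisq_yates`, `x_hat` and `ratio_hat` are invented for the example.

```r
# Yates-corrected chi-square for a shift of x in the expectations
chisq_yates <- function(x) {
  (x - 0.5)^2 * (1 / (10 - x) + 1 / (3 + x) + 1 / (2 + x) + 1 / (15 - x))
}

target <- qchisq(0.95, df = 1)   # the tabular entry, about 3.841

# Solve chisq_yates(x) = target between the two trial values 3.0 and 3.1
x_hat <- uniroot(function(x) chisq_yates(x) - target,
                 interval = c(3.0, 3.1), tol = 1e-10)$root

# Cross-product ratio of the shifted expectations
ratio_hat <- (10 - x_hat) * (15 - x_hat) / ((3 + x_hat) * (2 + x_hat))

x_hat       # close to 3.05
ratio_hat   # close to 2.72
```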

So that it may be inferred from the data that the true cross-product ratio exceeds 2.718 unless a coincidence of one in forty has occurred. Similar limits can be set in both directions and at all limits of probability.
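With a computer the approximation is not needed: one can sum the exact series and solve directly for the ratio at which the "one in forty" tail probability is reached. My reading of the fisher.test source is that it does essentially this with `uniroot` on the noncentral hypergeometric distribution, but treat that as an assumption and check it by printing `stats::fisher.test`. The sketch below does the search for the lower limit of Fisher's table; `upper_tail` and `xi_lower` are names made up for the example.

```r
# Fisher's example table: rows (10, 3) and (2, 15), filled column by column
tab <- matrix(c(10, 2, 3, 15), nrow = 2)

# P(first cell >= obs | odds ratio xi) with both margins fixed.
# Weights are the central hypergeometric probabilities times xi^r.
upper_tail <- function(xi, obs = 10) {
  r <- 0:12
  logw <- dhyper(r, 13, 17, 12, log = TRUE) + r * log(xi)
  w <- exp(logw - max(logw))   # rescale before exponentiating, for stability
  sum(w[r >= obs]) / sum(w)
}

# Lower limit: the xi at which the upper tail equals 1/40 = 0.025
xi_lower <- uniroot(function(xi) upper_tail(xi) - 0.025,
                    interval = c(0.01, 25), tol = 1e-10)$root

xi_lower
fisher.test(tab)$conf.int[1]   # lower end of the default 95% interval
```

The two printed values should agree up to the root-finding tolerance. The upper limit follows the same way from the lower tail, the probability of 10 or fewer.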