Solved – prop.test returning significant p-value but a confidence interval including 0


I have been testing the difference in proportions between two groups (A: 10 successes and 22 failures; B: 10 successes and 70 failures). Interestingly, prop.test returns a p-value that is significant at the 5% level (0.039) but a 95% confidence interval for the difference that includes 0.

tab <- matrix(c(10,10,22,70), nrow = 2)
tab

     [,1] [,2]
[1,]   10   22
[2,]   10   70

prop.test(tab)


2-sample test for equality of proportions with continuity
correction

data:  tab
X-squared = 4.2746, df = 1, p-value = 0.03869
alternative hypothesis: two.sided
95 percent confidence interval:
-0.01056535  0.38556535
sample estimates:
prop 1 prop 2 
0.3125 0.1250 

I know prop.test applies Yates' continuity correction, but it applies the correction to both the confidence interval and the p-value. How can it be that the p-value and the confidence interval give different results?
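
For what it's worth, if I rerun the test without the correction, the two seem to agree (I am quoting approximate numbers from my session):

prop.test(tab, correct = FALSE)

This gives me a p-value of about 0.019 and a confidence interval of roughly (0.011, 0.364), which excludes 0. So the correction is clearly involved, but I still don't see why it pushes the two results in different directions.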

Best Answer

I ran into the same problem myself. Your question came up as I was googling for explanations. (What I'm saying is that I might not be the utmost authority on this.) With that caveat, here's what I found:

The comment above by @MichaelChernick seems to be right; the reason the confidence interval for your test includes zero is that it is only approximately a 95% confidence interval (its actual coverage is not exactly 95%). Google "misbehavior of binomial confidence intervals" (or something similar) for more.
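
To convince myself, I ran a quick simulation, treating the observed proportions as the truth and checking how often prop.test's 95% interval actually covers the true difference (this is just my own sanity check, not something from the linked discussions):

set.seed(1)
p1 <- 10/32; p2 <- 10/80                         # observed proportions, treated as true
true_diff <- p1 - p2
covered <- replicate(10000, {
  x <- c(rbinom(1, 32, p1), rbinom(1, 80, p2))   # simulate the two groups
  ci <- suppressWarnings(prop.test(x, c(32, 80))$conf.int)
  ci[1] <= true_diff && true_diff <= ci[2]
})
mean(covered)   # in my runs this is close to, but not exactly, 0.95

The point is just that the nominal 95% is an approximation at these sample sizes.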

I started looking around and found this applet, if anyone is interested in finding the 95% confidence interval for the difference of two proportions without messing around in R. At the bottom of the page containing the applet, they reference a paper by Newcombe that appears to be widely accepted as the best way to calculate CIs for a difference of proportions (specifically, his method #10).

I looked at the paper and couldn't really do the math. There are a few R packages that claim to implement this method, but the best one I found was the Epi package. I'll use your example to explain:

tab <- matrix(c(10,10,22,70), nrow = 2)
tab

     [,1] [,2]
[1,]   10   22
[2,]   10   70

Fisher's exact test will give you an exact p-value for your table. The catch is that the effect measure it works with is the odds ratio, so it won't give you a difference of proportions or a confidence interval for one. It is good for p-values, though, so use it to check the p-value:

fisher.test(tab)

Fisher's Exact Test for Count Data

data:  tab
p-value = 0.03
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 1.0 9.7
sample estimates:
odds ratio 
   3.1 

So we have an exact p-value of 0.03, which is significant at the 0.05 level. Now look at the twoby2() function from the Epi package:

library(Epi)
twoby2(tab)

2 by 2 table analysis: 
------------------------------------------------------ 
Outcome   : Col 1 
Comparing : Row 1 vs. Row 2 

       Col 1 Col 2    P(Col 1) 95% conf. interval
Row 1    10    22        0.31     0.177     0.49
Row 2    10    70        0.12     0.069     0.22

                                    95% conf. interval
             Relative Risk:  2.50     1.152     5.43
         Sample Odds Ratio:  3.18     1.172     8.64
Conditional MLE Odds Ratio:  3.14     1.028     9.70
    Probability difference:  0.19     0.027     0.37

         Exact P-value: 0.028 
    Asymptotic P-value: 0.023 
------------------------------------------------------

Note that the results given include the following:

  • Individual proportion estimates with their 95% CIs
  • The difference of proportions (0.19) with a 95% CI that does not include zero (again, according to the authors this function uses the method described in the Newcombe paper; see the sketch just after this list)
  • An exact p-value, which matches Fisher's exact p-value calculated earlier
  • An asymptotic p-value. In my particular problem this matched the one given by prop.test(), but it doesn't here; my guess is that twoby2()'s asymptotic p-value comes from a different statistic (it agrees with a Wald test on the log odds ratio, which gives about 0.023 for this table), while prop.test() reports a continuity-corrected chi-squared test (0.039)

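Since the probability difference line is the one that actually answers the question, here is my attempt at writing Newcombe's method #10 out in R, based on my reading of the paper (treat it as a sketch to check against twoby2(), not a reference implementation). As I understand it, the method computes a Wilson score interval for each proportion separately and then "squares and adds" the two margins around the observed difference:

# Wilson score interval for a single proportion (x successes out of n)
wilson <- function(x, n, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  p <- x / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(lower = centre - half, upper = centre + half)
}

# My reading of Newcombe's method #10 for the difference p1 - p2:
# square and add the margins of the two Wilson intervals
newcombe10 <- function(x1, n1, x2, n2, conf = 0.95) {
  p1 <- x1 / n1; p2 <- x2 / n2
  d  <- p1 - p2
  w1 <- wilson(x1, n1, conf)
  w2 <- wilson(x2, n2, conf)
  c(diff  = d,
    lower = d - sqrt((p1 - w1[["lower"]])^2 + (w2[["upper"]] - p2)^2),
    upper = d + sqrt((w1[["upper"]] - p1)^2 + (p2 - w2[["lower"]])^2))
}

newcombe10(10, 32, 10, 80)

For me this gives a difference of 0.1875 with an interval of roughly (0.027, 0.369), matching the "Probability difference" row of the twoby2() output above.
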
Hope this helps anyone else who may wind up here!
