R – Why Confidence Interval from R’s prop.test() Differs from Hand Calculation and SAS Results

Tags: approximation, binomial distribution, confidence interval, normal distribution, r

I'm wondering if anyone has insight into how prop.test() in R calculates its confidence intervals. Although it doesn't state it explicitly in its documentation, my understanding is that this function uses the normal approximation to the binomial. This assumption is based upon the following:

  1. The documentation states that the returned parameter component represents "the degrees of freedom of the approximate chi-squared distribution of the test statistic."
  2. It has the option to apply Yates's continuity correction
  3. I've seen this function used in many examples (albeit online, but at many sites with a .edu suffix if that means anything) where the normal is used to approximate the binomial

If prop.test() really does use the normal approximation to the binomial, I would think the CI it calculates would be a Wald-type interval, but, again, the documentation doesn't state this explicitly.

The issue is that the result from prop.test() does not agree with what I get by hand, which is the same as what SAS gives. Here's an example:

Let

x = Number of successes = 319

n = Fixed number of trials = 1100

$\alpha$ = 0.01, i.e. confidence level = 0.99

The Wald-type interval, by hand, is thus

$\hat p \mp z_{\alpha/2}\sqrt{\frac{\hat p(1 - \hat p)}{n}} = 0.29 \mp 2.575829(0.01368144) = (0.25476, 0.32524)$
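For reference, the hand calculation above can be reproduced in a few lines of R. This is only a sketch of the Wald formula as written; the variable names (p_hat, se_wald, wald_ci) are mine and not part of any package.

```r
x <- 319       # number of successes
n <- 1100      # number of trials
alpha <- 0.01  # 1 - confidence level

p_hat <- x / n                            # sample proportion, 0.29
z <- qnorm(1 - alpha / 2)                 # two-sided critical value, about 2.575829
se_wald <- sqrt(p_hat * (1 - p_hat) / n)  # standard error evaluated at p_hat

# Wald interval: estimate minus/plus z times the estimated standard error
wald_ci <- p_hat + c(-1, 1) * z * se_wald
print(round(wald_ci, 5))
## [1] 0.25476 0.32524
```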

I get the same result in SAS as shown here (upper portion of the output is the approximate CI, the lower is the exact CI based on the binomial):

[Image: SAS output showing the approximate and exact binomial confidence intervals]

However, when I run prop.test() in R (also without Yates's correction, to be consistent with my hand calculation and SAS), I get something slightly different:

[Image: R output of prop.test() without continuity correction]

The 99% CI from the output above is (0.2561, 0.3264).

Any thoughts on where this discrepancy is coming from? Doesn't seem attributable to rounding error, nor would I expect an R function to round enough during its calculations so as to affect the third decimal place anyway.

Best Answer

The method is not spelled out in the Details section of ?prop.test, but suitable references are given. Wilson's score method is used; see: Wilson EB (1927). "Probable Inference, the Law of Succession, and Statistical Inference." Journal of the American Statistical Association, 22, 209-212.

Newcombe (1998), also referenced on ?prop.test, found this interval to have much better coverage than the traditional Wald-type interval. See: Newcombe RG (1998). "Two-Sided Confidence Intervals for the Single Proportion: Comparison of Seven Methods." Statistics in Medicine, 17, 857-872. There the score interval appears as methods 3 and 4 (without and with continuity correction, respectively).
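The score method is still a normal approximation to the binomial; it differs from Wald only in where the standard error is evaluated (at the hypothesised value rather than at the estimate). A small sketch of that difference follows. The null value p0 = 0.3 is an arbitrary choice of mine for illustration, and the variable names are not from any package.

```r
x <- 319
n <- 1100
p_hat <- x / n
p0 <- 0.3  # hypothetical null value, chosen only for illustration

# Score statistic: standard error evaluated at the null value p0
z_score <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Wald statistic: standard error evaluated at the estimate p_hat
z_wald <- (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)

# The X-squared reported by prop.test() (no continuity correction)
# is the square of the score statistic, not of the Wald statistic
x_squared <- unname(prop.test(x, n, p = p0, correct = FALSE)$statistic)
print(all.equal(x_squared, z_score^2))
print(c(score = z_score^2, wald = z_wald^2))
```

This also accounts for the "approximate chi-squared distribution" wording in the documentation: the square of a standard normal variable has a chi-squared distribution with one degree of freedom.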

Thus, you can replicate the confidence interval

prop.test(319, 1100, conf.level = 0.99, correct = FALSE)$conf.int
## [1] 0.2561013 0.3264169
## attr(,"conf.level")
## [1] 0.99

with

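p <- 319/1100      # observed proportion
n <- 1100          # number of trials
z <- qnorm(0.995)  # two-sided critical value for 99% confidence
# Wilson score interval, closed form
(2 * n * p + z^2 + c(-1, 1) * z * sqrt(z^2 + 4 * n * p * (1 - p))) /
  (2 * (n + z^2))
## [1] 0.2561013 0.3264169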
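The closed form comes from inverting the score test: the interval consists of the values p0 whose squared score statistic does not exceed z^2. As a rough numerical check (a sketch, not how prop.test() is implemented), the two endpoints can be found with uniroot(). The helper score_gap and the bracketing bounds of 1e-8 are my own choices.

```r
x <- 319
n <- 1100
p_hat <- x / n
z <- qnorm(0.995)

# Zero exactly where the squared score statistic at p0 equals z^2
score_gap <- function(p0) (p_hat - p0)^2 / (p0 * (1 - p0) / n) - z^2

# One root lies below p_hat and one above; bracket each side separately
lower <- uniroot(score_gap, c(1e-8, p_hat), tol = 1e-12)$root
upper <- uniroot(score_gap, c(p_hat, 1 - 1e-8), tol = 1e-12)$root

print(c(lower, upper))
print(as.numeric(prop.test(x, n, conf.level = 0.99, correct = FALSE)$conf.int))
```

Both printed vectors should agree to within numerical tolerance.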

Of course, the "exact" binomial (Clopper & Pearson), discussed as method 5 in Newcombe (1998), is also available in binom.test:

binom.test(319, 1100, conf.level = 0.99)$conf.int
## [1] 0.2552831 0.3265614
## attr(,"conf.level")
## [1] 0.99
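The Clopper-Pearson limits can likewise be written directly as beta quantiles, which is the standard representation of this interval. A minimal sketch, assuming 0 < x < n so that neither limit needs special handling; the name cp_ci is mine.

```r
x <- 319
n <- 1100
alpha <- 0.01

# Clopper-Pearson limits as quantiles of beta distributions
cp_ci <- c(qbeta(alpha / 2, x, n - x + 1),
           qbeta(1 - alpha / 2, x + 1, n - x))

print(cp_ci)
print(as.numeric(binom.test(x, n, conf.level = 0.99)$conf.int))
```

In this example the three intervals (Wald, Wilson, Clopper-Pearson) differ only from the third decimal place on, which is to be expected with n = 1100 and a proportion far from 0 and 1.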