Wilson’s adjustment for sample proportion

proportion;sample

Could somebody provide a derivation for Wilson's adjustment for the CI of a sample proportion? The formula is:

$$\tilde{p} \approx \frac{x+2}{n+4}$$

where $x$ is the number of successes, and $n$ is the sample size.

Specifically, I want to know: where do the 2 and the 4 come from? Those numbers seem a bit arbitrary to me, and I've had no luck with Google.

Just trying to satisfy my curiosity here…

Best Answer

To my understanding, the Wilson estimate is the center of the Wilson score interval, which gives the estimate

$$\tilde{p}=\frac{\hat p + \frac{1}{2n} z^2}{1 + \frac{1}{n} z^2}=\frac{X+ \frac{1}{2} z^2}{n + z^2}\,.$$

It is also the center of the Agresti-Coull interval.
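As for where that center comes from, here is a sketch of the algebra, assuming the usual definition of the Wilson score interval as the set of $p$ satisfying $|\hat p - p| \le z\sqrt{p(1-p)/n}$, with $\hat p = X/n$ and $z$ the standard normal critical value (these definitions are not spelled out above, so treat them as assumptions). Squaring both sides and collecting powers of $p$ gives a quadratic inequality:

$$(\hat p - p)^2 \le \frac{z^2}{n}\,p(1-p) \quad\Longleftrightarrow\quad \left(1+\frac{z^2}{n}\right)p^2 - \left(2\hat p + \frac{z^2}{n}\right)p + \hat p^2 \le 0\,.$$

The interval endpoints are the two roots of this quadratic, and the midpoint of the roots of $ap^2 - bp + c = 0$ is $\frac{b}{2a}$, so

$$\tilde p = \frac{2\hat p + \frac{1}{n}z^2}{2\left(1+\frac{1}{n}z^2\right)} = \frac{\hat p + \frac{1}{2n}z^2}{1+\frac{1}{n}z^2}\,,$$

and multiplying numerator and denominator by $n$ gives $\frac{X + \frac{1}{2}z^2}{n + z^2}$.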

If you take $\alpha=0.05$, then $z = z_{\alpha/2} \approx 1.96$. Rounding 1.96 to 2 gives $z^2 = 4$ and hence $\frac{X+2}{n+4}$ ("add 2 to both the successes and the failures"). That version is specifically mentioned in Agresti and Coull's paper, but rounding 1.96 to 2 is so common that I'd be surprised if some people weren't using it ever since Wilson's paper appeared.
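To see how little the rounding matters, here is a minimal Python sketch comparing the exact interval center $\frac{X+\frac{1}{2}z^2}{n+z^2}$ with the rounded $\frac{X+2}{n+4}$. The function names `wilson_center` and `plus_four` and the counts `x = 7`, `n = 20` are made up for illustration; only the standard library is used.

```python
from statistics import NormalDist


def wilson_center(x, n, alpha=0.05):
    """Center of the Wilson score interval: (x + z^2/2) / (n + z^2)."""
    # Two-sided standard normal critical value, about 1.96 for alpha = 0.05.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (x + z**2 / 2) / (n + z**2)


def plus_four(x, n):
    """The rounded version (z taken as 2): add 2 successes and 2 failures."""
    return (x + 2) / (n + 4)


# Hypothetical data: 7 successes in 20 trials.
x, n = 7, 20
print("exact center:", wilson_center(x, n))
print("plus-four   :", plus_four(x, n))
print("raw x/n     :", x / n)
```

For these counts the two adjusted estimates differ by less than 0.001, while both are pulled noticeably from $x/n$ toward $\frac{1}{2}$.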


Wilson, E. B. (1927),
"Probable inference, the law of succession, and statistical inference,"
Journal of the American Statistical Association 22, 209–212.

Agresti, Alan; Coull, Brent A. (1998),
"Approximate is better than 'exact' for interval estimation of binomial proportions,"
The American Statistician 52: 119–126.
(Available from the first author's web pages.)

Both papers are quite relevant.
