Solved – How to get the confidence interval of a Bernoulli trial if $\hat{p} = 0$

bernoulli-distribution, confidence interval, small-sample

I know the standard formula for the Bernoulli CI is:

$$\hat{p}\pm z_{1-\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

If $\hat{p} = \frac{m}{n}$, how do I estimate the confidence interval when $n$ is small and $m = 0$? This case collapses the above equation to $0 \pm 0$, which implies that the confidence interval does not improve with larger $n$.

In my mind, the CI should start at [0,1] and the upper bound should decrease as $n$ increases, given that $m$ remains at 0.

Best Answer

The usual "CLT" confidence interval collapses to zero width because the CLT is a bad approximation when $p$ is very close to 0 or 1 and the number of samples is relatively low. When $p$ is exactly 0 or 1, your random variable is constant. When $p$ is merely very close to 0 or 1, you need a very large number of samples to distinguish $p$ from exactly 0 or 1.

There are a couple of approaches to get a better-behaved confidence interval. The easy way is to appeal to the Wilson score interval (with $z = z_{1-\alpha/2}$, as in the formula in the question):

$$\frac{1}{1 + \frac{1}{n} z^2} \left[ \hat{p} + \frac{1}{2n} z^2 \pm z \sqrt{ \frac{1}{n}\hat{p} \left(1 - \hat{p}\right) + \frac{1}{4n^2}z^2 } \right].$$
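As a minimal sketch of the formula above, the interval can be computed directly in base R. The helper name `wilson_ci` is made up for illustration and is not a library function. For $m = 0$ the lower bound is 0 and the upper bound reduces to $z^2/(n + z^2)$, so it shrinks as $n$ grows, which is the behaviour the question asks for:

```r
# Hypothetical helper (not from any package): Wilson score interval
# for m successes in n Bernoulli trials at confidence level conf.
wilson_ci <- function(m, n, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  p_hat <- m / n
  centre <- p_hat + z^2 / (2 * n)
  half <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2))
  denom <- 1 + z^2 / n
  # Clamp to [0, 1] to guard against floating-point round-off at the boundary
  c(lower = max(0, (centre - half) / denom),
    upper = min(1, (centre + half) / denom))
}

# Upper bound for m = 0 at increasing sample sizes
sapply(c(5, 10, 50, 100), function(n) wilson_ci(0, n)["upper"])
```

For $n = 10$ and $m = 0$ the upper bound is $z^2/(10 + z^2) \approx 0.28$. As far as I know, base R's `prop.test(m, n, correct = FALSE)` reports the same score-based interval, so it can serve as a cross-check.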

The second option is to compute the confidence interval numerically from the binomial distribution itself, as opposed to appealing to the normal distribution.
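One common binomial-based interval is the Clopper-Pearson ("exact") interval, which base R's `binom.test` returns. Treating it as the interval meant here is an assumption, since the answer does not name a specific method. The sketch below (with a hypothetical wrapper `exact_ci`) also prints the closed form that applies when $m = 0$, where the upper limit solves $(1 - p)^n = \alpha/2$:

```r
# Clopper-Pearson ("exact") interval from base R's binom.test.
# Assumption: this is the binomial-based interval the answer has in mind.
exact_ci <- function(m, n, conf = 0.95) {
  as.numeric(binom.test(m, n, conf.level = conf)$conf.int)
}

for (n in c(5, 10, 50, 100)) {
  ci <- exact_ci(0, n)
  # For m = 0 the upper limit solves (1 - p)^n = alpha / 2, with alpha = 0.05
  closed_form <- 1 - 0.025^(1 / n)
  cat(sprintf("n = %3d: [%.4f, %.4f]  closed-form upper = %.4f\n",
              n, ci[1], ci[2], closed_form))
}
```

The interval starts near $[0, 1]$ for very small $n$ and its upper bound decreases toward 0 as $n$ increases while $m$ stays at 0.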

Related Solutions

Confidence Interval – How Confidence Interval for Parameter p of a Bernoulli Trial Varies with p Value

The method you use, a normal approximation, is an archaism and should never be taught or even offered as an option in software. It has very poor coverage properties, particularly for small proportions as in your example.

There are many alternative approaches to calculating these intervals, with varying assumptions and coverage characteristics. Some are very ad hoc in design and so are hard to prefer for pedagogic purposes. My preference is the method of Wilson, sometimes called the Wilson score interval. It approximates a conditional interval and has excellent frequentist properties.
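The coverage claim can be checked exactly rather than by simulation: for a given $n$ and true $p$, sum the binomial probabilities of every outcome whose interval contains $p$. The sketch below does this for the normal-approximation interval. The function name `wald_coverage` and the values of $n$ and $p$ are chosen here for illustration and are not taken from the question:

```r
# Exact coverage probability of the normal-approximation (Wald) interval:
# sum the binomial probabilities of every outcome x whose interval contains p.
wald_coverage <- function(n, p, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  x <- 0:n
  p_hat <- x / n
  half <- z * sqrt(p_hat * (1 - p_hat) / n)
  covers <- (p_hat - half <= p) & (p <= p_hat + half)
  sum(dbinom(x, n, p)[covers])
}

# Illustrative values only: a small proportion with a small and a larger sample
wald_coverage(n = 20, p = 0.05)
wald_coverage(n = 200, p = 0.05)
```

With $n = 20$ and $p = 0.05$, the outcome $x = 0$ alone has probability $0.95^{20} \approx 0.36$ and yields the degenerate interval $[0, 0]$, so the coverage of the nominal 95% interval cannot exceed about 0.64.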

See this answer for a little more detail: Discrete functions: Confidence interval coverage?

See this question for a formal statement of the meaning of different types of CI for binomial proportions: Statement of result for binomial confidence intervals

This one for confidence interval coverage: Clarification on interpreting confidence intervals?

Solved – What does a confidence interval with a negative endpoint mean

When the procedure you have used to calculate a confidence interval gives an interval including impossible values, that is an indication of problems with the method. In your case, you have used a normal (central limit theorem-based) CI with so few observations that the approximation is invalid. You can check that easily in R by plotting the log-likelihood function for your case (the code is given at the end of this answer).

If the log-likelihood is (close to) quadratic, the normal approximation will be good. That is clearly not the case here!

As @Glen_b says in a comment, you need to read up on binomial confidence intervals, see for instance Wikipedia or Binomial confidence interval estimation - why is it not symmetric?.

R code used for the plot:

# Return the binomial log likelihood as a function of p,
# for x successes in n trials
make_loglik <- function(n, x) {
  function(p) dbinom(x, n, p, log = TRUE)
}

loglik <- make_loglik(10, 1)

plot(loglik, from = 0, to = 1, xlab = "p", col = "blue",
     main = "log likelihood function\nBernoulli, n=10, x=1")
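To make the "close to quadratic" check concrete, one option is to overlay the second-order Taylor expansion of the log likelihood around the maximum-likelihood estimate $\hat{p} = x/n$, where the second derivative equals $-n/(\hat{p}(1-\hat{p}))$. This is a sketch added for illustration, not part of the original answer. The name `quad_approx`, the plotting range, and the `ylim` values are choices made here:

```r
# Log likelihood for x successes in n trials (same idea as make_loglik above)
n <- 10
x <- 1
loglik <- function(p) dbinom(x, n, p, log = TRUE)

# Second-order Taylor expansion around the MLE p_hat = x / n.
# The second derivative of the log likelihood at p_hat is
# -n / (p_hat * (1 - p_hat)).
p_hat <- x / n
quad_approx <- function(p) {
  loglik(p_hat) - n / (2 * p_hat * (1 - p_hat)) * (p - p_hat)^2
}

curve(loglik, from = 0.001, to = 0.6, ylim = c(-8, 0),
      xlab = "p", ylab = "log likelihood", col = "blue")
curve(quad_approx, add = TRUE, col = "red", lty = 2)
legend("bottomright", legend = c("log likelihood", "quadratic approximation"),
       col = c("blue", "red"), lty = c(1, 2))
```

The two curves agree near $\hat{p} = 0.1$ but separate quickly: the true log likelihood is strongly asymmetric, while the parabola is symmetric and extends to negative $p$, which is where the impossible interval endpoint comes from.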
