What does a confidence interval with a negative endpoint mean?

Tags: bernoulli-distribution, confidence-interval, interpretation

I have 10 iid r.v. with Bernoulli distribution with $X_{i} = 1$ for a positive result. I'm given $\sum_{i=1}^{10} X_i= 1$ and need to find a two-sided 99% confidence interval for $\theta$.

So $\alpha = 0.01$ and $z_{\alpha/2} = 2.575$ based on the standard normal distribution. The variance is given by $\sigma^{2} = \hat{\theta}(1-\hat{\theta})$, with $\hat{\theta} = 1/10$ from the given information. The confidence interval is:
$$
\left(\hat{\theta} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \hat{\theta} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)
$$

so my confidence interval is $(-0.144, 0.344)$. Have I messed up a calculation somewhere? I'm not sure what this confidence interval tells me... perhaps that the sample size is too small to really get any useful information on $\theta$?
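For reference, the calculation above can be reproduced with a short R sketch. The variable names (`theta_hat`, `se`, `wald_ci`) are illustrative and not part of the original post, and `qnorm` is used in place of a table lookup for $z_{\alpha/2}$. It gives a lower endpoint below zero, matching the interval above up to rounding.

```r
# Normal-approximation (Wald) interval for a Bernoulli/binomial proportion.
# All names here are illustrative assumptions, not from the original post.
n <- 10        # number of Bernoulli trials
x <- 1         # observed number of successes
alpha <- 0.01  # 1 - confidence level

theta_hat <- x / n                            # point estimate of theta
se <- sqrt(theta_hat * (1 - theta_hat) / n)   # estimated standard error
z <- qnorm(1 - alpha / 2)                     # standard normal quantile

wald_ci <- theta_hat + c(-1, 1) * z * se
print(round(wald_ci, 3))
```

So the arithmetic itself is fine; the negative endpoint comes from the method, not from a slip in the calculation.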

Best Answer

When the procedure you have used to calculate a confidence interval gives an interval including impossible values, that is an indication of problems with the method. In your case, you have used a normal (central limit theorem-based) CI with so few observations that the approximation is invalid. You can test that easily in R, say:

[Figure: Bernoulli log-likelihood function]

We have plotted the log-likelihood function for your case. If this is (close to) quadratic, the normal approximation will be good. That is clearly not the case here!
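One way to make the "close to quadratic" check concrete is to overlay the second-order Taylor expansion of the log-likelihood around the maximum likelihood estimate. The following is a minimal sketch, assuming the same data ($n = 10$, $x = 1$); the names `info` and `quad` are hypothetical, and the plotting range and `ylim` are arbitrary choices made for readability.

```r
# Compare the binomial log-likelihood with its quadratic approximation at the MLE.
n <- 10
x <- 1
p_hat <- x / n  # maximum likelihood estimate

loglik <- function(p) dbinom(x, n, p, log = TRUE)

# Observed information at the MLE: minus the second derivative of the
# log-likelihood, which for the binomial equals n / (p_hat * (1 - p_hat)).
info <- n / (p_hat * (1 - p_hat))

# Second-order Taylor expansion; the first-derivative term vanishes at the MLE.
quad <- function(p) loglik(p_hat) - 0.5 * info * (p - p_hat)^2

curve(loglik, from = 0.001, to = 0.6, ylim = c(-8, 0),
      xlab = "p", ylab = "log likelihood", col = "blue")
curve(quad, add = TRUE, lty = 2, col = "red")
legend("bottomright", legend = c("log likelihood", "quadratic approximation"),
       col = c("blue", "red"), lty = c(1, 2))
```

The two curves agree at `p_hat` by construction but separate quickly on either side, and the true curve is visibly asymmetric, which is what makes the symmetric normal interval a poor fit here.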

As @Glen_b says in a comment, you need to read up on binomial confidence intervals; see for instance the Wikipedia article on the topic or the question "Binomial confidence interval estimation - why is it not symmetric?".
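As an illustration of such alternatives, base R can compute two common binomial intervals directly. This is a sketch rather than a recommendation of one method: `binom.test` returns the exact (Clopper-Pearson) interval, and `prop.test` with `correct = FALSE` returns the Wilson score interval. The names `exact_ci` and `wilson_ci` are illustrative.

```r
# Binomial confidence intervals that respect the [0, 1] range.
n <- 10
x <- 1

# Exact (Clopper-Pearson) interval
exact_ci <- binom.test(x, n, conf.level = 0.99)$conf.int

# Wilson score interval; correct = FALSE turns off the continuity correction.
# prop.test warns about the chi-squared approximation for small counts,
# which does not affect the interval, so the warning is suppressed here.
wilson_ci <- suppressWarnings(
  prop.test(x, n, conf.level = 0.99, correct = FALSE)$conf.int
)

print(exact_ci)
print(wilson_ci)
```

Both intervals stay inside $[0, 1]$ and are asymmetric around $\hat{\theta} = 0.1$, unlike the normal-approximation interval.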

R code used for the plot:

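# Return the binomial log-likelihood as a function of p, for fixed n and x
make_loglik <- function(n, x) {
  function(p) dbinom(x, n, p, log = TRUE)
}

# Log-likelihood for the data in the question: 1 success in 10 trials
loglik <- make_loglik(10, 1)

plot(loglik, from = 0, to = 1, xlab = "p", col = "blue",
     main = "log likelihood function\nBernoulli, n=10, x=1")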