Solved – If a credible interval has a flat prior, is a 95% confidence interval equal to a 95% credible interval?

bayesian, confidence interval, credible-interval, estimation, prior

I'm very new to Bayesian statistics, and this may be a silly question. Nevertheless:

Consider a credible interval computed with a uniform prior, for example on the interval from 0 to 1, where 0 to 1 represents the full range of possible values of an effect. In this case, would a 95% credible interval be equal to a 95% confidence interval?

Best Answer

Many frequentist confidence intervals (CIs) are based on the likelihood function. If the prior distribution is truly non-informative, then the Bayesian posterior carries essentially the same information as the likelihood function. Consequently, in practice, a Bayesian probability interval (or credible interval) may be very similar numerically to a frequentist confidence interval. [Of course, even if numerically similar, there are philosophical differences in interpretation between frequentist and Bayesian interval estimates.]

Here is a simple example, estimating binomial success probability $\theta.$ Suppose we have $n = 100$ observations (trials) with $X = 73$ successes.

Frequentist: The traditional Wald interval uses the point estimate $\hat \theta = X/n = 73/100 = 0.73.$ And the 95% CI is of the form $$\hat \theta \pm 1.96\sqrt{\frac{\hat \theta(1-\hat \theta)} {n}},$$ which computes to $(0.643,\,0.817).$

n = 100;  x = 73;  th.w = x/n;  pm = c(-1,1)
ci.w = th.w + pm*1.96*sqrt(th.w*(1-th.w)/n);  ci.w
[1] 0.6429839 0.8170161

This form of CI assumes that relevant binomial distributions can be approximated by normal ones and that the standard error $\sqrt{\theta(1-\theta)/n}$ is well approximated by $\sqrt{\hat\theta(1-\hat\theta)/n}.$ Particularly for small $n,$ these assumptions need not be true. [The cases where $X = 0$ or $X = n$ are especially problematic.]
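To make the boundary problem concrete, here is a minimal sketch that applies the Wald formula when $X = 0$ or $X = n.$ The helper name `wald.ci` and the choice $n = 20$ are mine, for illustration only.

```r
# Hypothetical helper (not from any package): Wald interval for a binomial proportion.
wald.ci = function(x, n, z = 1.96) {
  th = x/n                                  # point estimate
  th + c(-1, 1)*z*sqrt(th*(1 - th)/n)       # estimate -/+ z times estimated standard error
}
wald.ci(0, 20)     # estimated standard error is 0, so both endpoints equal 0
wald.ci(20, 20)    # same degeneracy at the other boundary: both endpoints equal 1
```

In both calls the estimated standard error is zero, so the 'interval' collapses to a single point regardless of how small $n$ is.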

The Agresti-Coull CI has been shown to have more accurate coverage probability. This interval 'adds two Successes and two Failures' as a trick to get a coverage probability nearer to 95%. It begins with the point estimate $\tilde \theta = (X+2)/\tilde n,$ where $\tilde n = n + 4.$ Then a 95% CI is of the form $$\tilde \theta \pm 1.96\sqrt{\frac{\tilde \theta(1-\tilde \theta)} {\tilde n}},$$ which computes to $(0.635,\, 0.807).$ For $n > 100$ and $0.3 < \tilde \theta < 0.7,$ the difference between these two styles of confidence intervals is nearly negligible.

n.a = n + 4;  th.a = (x + 2)/n.a
ci.a = th.a + pm*1.96*sqrt(th.a*(1-th.a)/n.a);  ci.a
[1] 0.6349681 0.8073396
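The claim about coverage probability can be checked directly. The sketch below computes the exact coverage of a nominal 95% interval at a fixed true $\theta$ by summing binomial probabilities over all outcomes whose interval contains $\theta.$ The function name `coverage` and the test case ($n = 25,$ $\theta = 0.1$) are my own choices, not part of the original answer.

```r
# Exact coverage probability at a given true theta:
# add up P(X = x) for every x whose interval contains theta.
coverage = function(theta, n, type = c("wald", "ac"), z = 1.96) {
  type = match.arg(type)
  x = 0:n
  if (type == "wald") {
    th = x/n;  m = n                        # Wald: raw proportion, sample size n
  } else {
    th = (x + 2)/(n + 4);  m = n + 4        # Agresti-Coull: add 2 successes, 2 failures
  }
  me = z*sqrt(th*(1 - th)/m)                # margin of error for each possible x
  covered = (th - me <= theta) & (theta <= th + me)
  sum(dbinom(x[covered], n, theta))
}
coverage(0.1, 25, "wald")
coverage(0.1, 25, "ac")
```

For this small-sample case the Wald coverage comes out at roughly 0.92, noticeably below the nominal 95%, while the Agresti-Coull coverage is roughly 0.97, which is closer to the target. Other values of $n$ and $\theta$ will give different gaps.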

Bayesian: One popular noninformative prior in this situation is $\mathsf{Beta}(1,1) \equiv \mathsf{Unif}(0,1).$ The likelihood function is proportional to $\theta^x(1-\theta)^{n-x}.$ Multiplying the kernels of the prior and likelihood we have the kernel of the posterior distribution $\mathsf{Beta}(x+1,\, n-x+1).$

Then a 95% Bayesian interval estimate uses quantiles 0.025 and 0.975 of the posterior distribution to get $(0.635, 0.807).$ When the prior distribution is 'flat' or 'noninformative' the numerical difference between the Bayesian probability interval and the Agresti-Coull confidence interval is slight.

qbeta(c(.025, .975), 74, 28)
[1] 0.6353758 0.8072313
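As a rough numerical check that 'flat prior times likelihood' really gives these quantiles, one can approximate the posterior on a fine grid and compare with `qbeta`. This is only a sketch; the grid spacing of 0.001 and the variable names are arbitrary choices of mine.

```r
n = 100;  x = 73
th = seq(0.0005, 0.9995, by = 0.001)       # midpoints of a fine grid on (0, 1)
post = 1 * th^x * (1 - th)^(n - x)         # flat prior (density 1) times likelihood kernel
cdf = cumsum(post)/sum(post)               # normalized grid approximation to the posterior CDF
grid.ci = c(th[which(cdf >= 0.025)[1]],    # first grid point where the CDF reaches 0.025
            th[which(cdf >= 0.975)[1]])    # first grid point where the CDF reaches 0.975
grid.ci
qbeta(c(.025, .975), x + 1, n - x + 1)     # exact quantiles of Beta(74, 28)
```

The two intervals should agree to within about the grid spacing, which confirms that no normalizing constants are needed to identify the posterior from its kernel.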

Notes: (a) In this situation, some Bayesians prefer the noninformative prior $\mathsf{Beta}(.5, .5).$ (b) For confidence levels other than 95%, the Agresti-Coull CI uses a slightly different point estimate. (c) For data other than binomial, there may be no available 'flat' prior, but one can choose a prior with a huge variance (small precision) that carries very little information. (d) For more discussion of Agresti-Coull CIs, graphs of coverage probabilities, and some references, perhaps also see this Q & A.
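Notes (a) and (b) can be sketched in a few lines. For (a), the posterior under $\mathsf{Beta}(.5, .5)$ is $\mathsf{Beta}(x+.5,\, n-x+.5).$ For (b), I am assuming the usual general form of the Agresti-Coull interval, which adds $z^2/2$ successes and $z^2/2$ failures, where $z$ is the normal quantile for the chosen level (so 'add two' is the 95% case, rounded). The helper name `ac.ci` is hypothetical.

```r
n = 100;  x = 73

# (a) Interval from the posterior under the prior Beta(.5, .5).
qbeta(c(.025, .975), x + .5, n - x + .5)

# (b) Agresti-Coull interval at a general confidence level.
#     ac.ci is an illustrative helper, not a function from a package.
ac.ci = function(x, n, level = 0.95) {
  z = qnorm(1 - (1 - level)/2)             # e.g. about 1.96 for level = 0.95
  n.t = n + z^2                            # adjusted sample size
  th.t = (x + z^2/2)/n.t                   # adjusted point estimate
  th.t + c(-1, 1)*z*sqrt(th.t*(1 - th.t)/n.t)
}
ac.ci(x, n, 0.95)
ac.ci(x, n, 0.90)
```

At the 95% level this is nearly the same as the 'add two' version because $1.96^2/2 \approx 1.92.$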
