The connection between credible regions and Bayesian hypothesis tests

bayesian, confidence interval, credible-interval, frequentist, hypothesis testing

In frequentist statistics, there is a close connection between confidence intervals and tests. Using inference about $\mu$ in the $\rm N(\mu,\sigma^2)$ distribution as an example, the $1-\alpha$ confidence interval
$$\bar{x}\pm t_{\alpha/2}(n-1)\cdot s/\sqrt{n}$$
contains all values of $\mu$ that aren't rejected by the $t$-test at the significance level $\alpha$.
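
To spell out the inversion for the two-sided test of $\mu=\mu_0$ (here $\mu_0$ is a hypothesised value, a symbol introduced only for this illustration), the $t$-test fails to reject exactly when
$$\left|\frac{\bar{x}-\mu_0}{s/\sqrt{n}}\right|\leq t_{\alpha/2}(n-1)
\quad\Longleftrightarrow\quad
\bar{x}-t_{\alpha/2}(n-1)\frac{s}{\sqrt{n}}\;\leq\;\mu_0\;\leq\;\bar{x}+t_{\alpha/2}(n-1)\frac{s}{\sqrt{n}},$$
that is, exactly when $\mu_0$ lies in the interval.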

Frequentist confidence intervals are in this sense inverted tests. (Incidentally, this means that we can interpret the $p$-value as the smallest value of $\alpha$ for which the null value of the parameter would no longer be included in the $1-\alpha$ confidence interval. I find that this can be a useful way to explain what $p$-values really are to people who know a bit of statistics.)

Reading about the decision-theoretic foundation of Bayesian credible regions, I started to wonder whether there is a similar connection/equivalence between credible regions and Bayesian tests.

  • Is there a general connection?
  • If there is no general connection, are there examples where there is a connection?
  • If there is no general connection, how can we see this?

Best Answer

I managed to come up with an example where a connection exists. It seems to depend heavily on my choice of loss function and the use of composite hypotheses though.

I start with a general example, which is then followed by a simple special case involving the normal distribution.

General example

For an unknown parameter $\theta$, let $\Theta$ be the parameter space and consider the hypothesis $\theta\in\Theta_0$ versus the alternative $\theta\in\Theta_1=\Theta\setminus\Theta_0$.

Let $\varphi$ be a test function, using the notation in Xi'an's The Bayesian Choice (which is sort of backwards to what I at least am used to), so that we reject $\Theta_0$ if $\varphi=0$ and accept $\Theta_0$ if $\varphi=1$. Consider the loss function
$$ L(\theta,\varphi) = \begin{cases} 0, & \mbox{if } \varphi=\mathbb{I}_{\Theta_0}(\theta) \\ a_0, & \mbox{if } \theta\in\Theta_0 \mbox{ and }\varphi=0\\ a_1, & \mbox{if } \theta\in\Theta_1 \mbox{ and }\varphi=1. \end{cases} $$
The Bayes test is then
$$\varphi^\pi(x)=1\quad \rm if\quad P(\theta\in\Theta_0|x)\geq a_1(a_0+a_1)^{-1}.$$
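
One way to see where this threshold comes from is to compare posterior expected losses. This is only a sketch, and the shorthand $p_0=\mathrm{P}(\theta\in\Theta_0|x)$ is introduced here for convenience:
$$\begin{aligned}
\mathbb{E}\left[L(\theta,1)\,|\,x\right] &= a_1\,(1-p_0) && \text{(accepting is wrong only if } \theta\in\Theta_1\text{)},\\
\mathbb{E}\left[L(\theta,0)\,|\,x\right] &= a_0\,p_0 && \text{(rejecting is wrong only if } \theta\in\Theta_0\text{)},\\
a_1\,(1-p_0)\leq a_0\,p_0 \quad&\Longleftrightarrow\quad p_0\geq \frac{a_1}{a_0+a_1}.
\end{aligned}$$
Accepting $\Theta_0$ therefore has the smaller posterior expected loss exactly when the posterior probability of $\Theta_0$ reaches $a_1(a_0+a_1)^{-1}$.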

Take $a_0=\alpha\leq 0.5$ and $a_1=1-\alpha$. The null hypothesis $\Theta_0$ is accepted if $\rm P(\theta\in\Theta_0|x)\geq 1-\alpha$.
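
As a rough illustration of this decision rule, the following Python sketch compares the two posterior expected losses directly. The function name `accept_null` and the argument `p0` (standing for $\rm P(\theta\in\Theta_0|x)$) are hypothetical names chosen for this example, not anything from The Bayesian Choice.

```python
def accept_null(p0: float, a0: float, a1: float) -> bool:
    """Return True if accepting Theta_0 minimises posterior expected loss.

    p0 is the posterior probability of Theta_0 (an input assumed to be
    computed elsewhere); a0 and a1 are the two losses from the loss function.
    """
    loss_if_accept = a1 * (1.0 - p0)  # we lose a1 when theta is in Theta_1
    loss_if_reject = a0 * p0          # we lose a0 when theta is in Theta_0
    return loss_if_accept <= loss_if_reject


alpha = 0.05
for p0 in (0.90, 0.96, 0.99):
    # With a0 = alpha and a1 = 1 - alpha the threshold on p0 is 1 - alpha.
    print(p0, accept_null(p0, a0=alpha, a1=1.0 - alpha))
```

With $\alpha=0.05$ the null is accepted for the posterior probabilities 0.96 and 0.99 but not for 0.90. Values exactly on the boundary $1-\alpha$ are left out of the loop because floating-point rounding can tip the comparison either way.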

Now, a credible region $\Theta_c$ is a region such that $\rm P(\Theta_c|x)\geq 1-\alpha$. If $\Theta_0$ is such that $\rm P(\theta\in\Theta_0|x)\geq 1-\alpha$, then $\rm P(\Theta_0\cap\Theta_c|x)\geq P(\Theta_0|x)+P(\Theta_c|x)-1\geq 1-2\alpha$. Thus, for $\alpha<0.5$, $\Theta_c$ can be a credible region only if $\rm P(\Theta_0\cap\Theta_c|x)>0$.

We accept the null hypothesis if and only if each $(1-\alpha)$-credible region contains a non-null subset of $\Theta_0$.

A simpler special case

To better illustrate what kind of test we have in the above example, consider the following special case.

Let $x\sim \rm N(\theta,1)$ with $\theta\sim \rm N(0,1)$. Set $\Theta=\mathbb{R}$, $\Theta_0=(-\infty,0]$ and $\Theta_1=(0,\infty)$, so that we wish to test whether $\theta\leq 0$.

Standard calculations give $$\rm P(\theta\leq 0|x)=\Phi(-x/\sqrt{2}),$$ where $\Phi(\cdot)$ is the standard normal cdf.
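
For completeness, the "standard calculations" can be sketched as follows, by completing the square in $\theta$:
$$\begin{aligned}
\pi(\theta|x) &\propto \exp\left\{-\tfrac{1}{2}(x-\theta)^2\right\}\exp\left\{-\tfrac{1}{2}\theta^2\right\}
\propto \exp\left\{-\left(\theta-\tfrac{x}{2}\right)^2\right\},\\
\theta\,|\,x &\sim \mathrm{N}\left(\tfrac{x}{2},\tfrac{1}{2}\right),\\
\mathrm{P}(\theta\leq 0|x) &= \Phi\left(\frac{0-x/2}{\sqrt{1/2}}\right)=\Phi\left(-\frac{x}{\sqrt{2}}\right).
\end{aligned}$$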

Let $z_{1-\alpha}$ be such that $\Phi(z_{1-\alpha})=1-\alpha$. Then $\Theta_0$ is accepted when $-x/\sqrt{2}\geq z_{1-\alpha}$.

Since $z_{\alpha}=-z_{1-\alpha}$, this is equivalent to accepting when $x\leq \sqrt{2}z_{\alpha}.$ For $\alpha=0.05$ we have $z_{0.05}\approx -1.645$, so $\Theta_0$ is rejected when $x>-2.33$.

If instead we use the prior $\theta\sim \rm N(\nu,1)$, $\Theta_0$ is rejected when $x>-2.33-\nu$.
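
The cutoffs above can be checked numerically. The sketch below uses only Python's standard library; the function names `posterior_prob_null` and `rejection_cutoff` are hypothetical, and the code assumes the same model as in the text, $x\sim \rm N(\theta,1)$ with prior $\theta\sim \rm N(\nu,1)$, for which the posterior is $\rm N((x+\nu)/2,\,1/2)$.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the quantile z_alpha


def posterior_prob_null(x: float, nu: float = 0.0) -> float:
    """P(theta <= 0 | x) when x ~ N(theta, 1) and theta ~ N(nu, 1)."""
    # Conjugate update: posterior mean (x + nu) / 2, posterior variance 1/2.
    posterior = NormalDist(mu=(x + nu) / 2.0, sigma=sqrt(0.5))
    return posterior.cdf(0.0)


def rejection_cutoff(alpha: float, nu: float = 0.0) -> float:
    """Theta_0 is rejected when x exceeds this value."""
    z_alpha = STD_NORMAL.inv_cdf(alpha)  # lower alpha-quantile, negative for alpha < 0.5
    return sqrt(2.0) * z_alpha - nu


cutoff = rejection_cutoff(0.05)
print(round(cutoff, 2))                       # -2.33
print(round(posterior_prob_null(cutoff), 4))  # 0.95
```

At the cutoff the posterior probability of $\Theta_0$ equals $1-\alpha$, which is the acceptance boundary; calling `rejection_cutoff(0.05, nu=1.0)` shifts the cutoff by $-\nu$, in line with the $\rm N(\nu,1)$ prior case.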

Comments

The above loss function, where we think that falsely accepting the null hypothesis is worse than falsely rejecting it, may at first glance seem like a slightly artificial one. It can however be of considerable use in situations where "false negatives" can be costly, for instance when screening for dangerous contagious diseases or terrorists.

The condition that all credible regions must contain a part of $\Theta_0$ is actually a bit stronger than what I was hoping for: in the frequentist case the correspondence is between a single test and a single $1-\alpha$ confidence interval and not between a single test and all $1-\alpha$ intervals.
