Solved – Expected value of posterior vs. success probability

bayes, classification, maximum-likelihood, posterior, probability

Context

Suppose I have two models, $H_1$ and $H_2$, for which I know the prior probabilities $p(H_1)$ and $p(H_2)$.
Furthermore, I know the class-conditional distributions $p(x|H_1)$ and $p(x|H_2)$ of a random variable $X \in \mathbb{R}^n$. I get to observe a realization of $X$, call it $x_0$.

The posterior distribution of the models given the observation I just made can be obtained via Bayes' formula:

$$p(H_i|x_0) = \frac{p(x_0|H_i)p(H_i)}{p(x_0)}$$

And the Bayes-optimal decision rule is the (very intuitive) MAP rule:

$$\hat{H} = \operatorname*{arg\,max}_{H_i} p(H_i|x_0)$$
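To make the setup concrete, here is a minimal sketch of the posterior computation and the MAP rule. All numbers are illustrative assumptions (two 1-D unit-variance Gaussian class-conditionals and priors $0.6/0.4$), not part of the question:

```python
import math

# Assumed toy setup: p(x|H_1) = N(0, 1), p(x|H_2) = N(2, 1),
# priors p(H_1) = 0.6, p(H_2) = 0.4.
PRIOR = {1: 0.6, 2: 0.4}
MEAN = {1: 0.0, 2: 2.0}

def likelihood(i, x):
    """Unit-variance Gaussian density p(x|H_i)."""
    return math.exp(-0.5 * (x - MEAN[i]) ** 2) / math.sqrt(2 * math.pi)

def posterior(i, x):
    """p(H_i|x) via Bayes' formula; the evidence p(x) normalizes."""
    evidence = sum(likelihood(j, x) * PRIOR[j] for j in (1, 2))
    return likelihood(i, x) * PRIOR[i] / evidence

def map_decision(x):
    """MAP rule: pick the hypothesis with the larger posterior."""
    return max((1, 2), key=lambda i: posterior(i, x))

print(map_decision(-0.5))  # -> 1
print(map_decision(2.5))   # -> 2
```

The `max` over posteriors is exactly the $\operatorname{arg\,max}$ above; the evidence $p(x)$ cancels in the comparison, so one could equally compare $p(x|H_i)p(H_i)$ directly.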

Now, let's say I'm interested in how certain I am when I'm deciding for $H_1$. The probability that I'm correctly identifying $H_1$ when indeed $X$ is generated from model $H_1$ is:

$$P_{suc,H_1} = \int_{R_1}p(x|H_1)dx,$$

where $R_1$ is the decision region associated with $H_1$, i.e.

$$R_1 = \{x \in \mathbb{R}^n : \operatorname*{arg\,max}_{H_i} p(H_i|x) = H_1\}$$

Problem

Now that I've introduced the terminology & notation, let me come to the actual question 😉

In my case, $R_1$ has a complicated shape (it's a pretty much arbitrary convex polytope), and I'm just not able to perform the integration in order to obtain $P_{suc,H_1}$. I'm looking for alternative ways to get this quantity.

My intuition is as follows: the probability of success is related (equal?) to the expected value of the posterior under the hypothesis $H_1$. Mathematically:

$$P_{suc, H_1} \longleftrightarrow \mathbb{E}_{p(x|H_1)}[p(H_1|x)]$$

or:

$$\int_{R_1}1\cdot p(x|H_1)\,dx \longleftrightarrow \int_{\mathbb{R}^n}p(H_1|x)p(x|H_1)\,dx$$

Actually, I've already managed to prove that the two are not equal: in the left integral, replace $1$ with $p(H_1|x) + p(H_2|x)$; in the right integral, split the domain as $\mathbb{R}^n = R_1 \cup R_2$; some terms cancel out and you can analyze what's left over.
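The inequality is easy to confirm numerically. The following sketch evaluates both integrals by a crude Riemann sum in an assumed 1-D setup ($p(x|H_1)=\mathcal{N}(0,1)$, $p(x|H_2)=\mathcal{N}(2,1)$, equal priors), which is not part of the original question:

```python
import math

# Assumed 1-D setup: p(x|H_1) = N(0,1), p(x|H_2) = N(2,1), equal priors.
def pdf(x, mu):
    """Unit-variance Gaussian density."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

def post1(x):
    """p(H_1|x) with p(H_1) = p(H_2) = 1/2."""
    return pdf(x, 0.0) / (pdf(x, 0.0) + pdf(x, 2.0))

dx = 1e-3
grid = [-10.0 + k * dx for k in range(int(20.0 / dx))]
# Left integral: p(x|H_1) over R_1 = {x : p(H_1|x) > p(H_2|x)}
p_suc = sum(pdf(x, 0.0) * dx for x in grid if post1(x) > 0.5)
# Right integral: E_{p(x|H_1)}[p(H_1|x)] over all of R
exp_post = sum(post1(x) * pdf(x, 0.0) * dx for x in grid)
print(round(p_suc, 3), round(exp_post, 3))  # two different numbers
```

In this setup $R_1 = \{x < 1\}$, so the left integral is $\Phi(1) \approx 0.841$, while the expectation of the posterior comes out noticeably smaller.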

Questions

I guess I really have two questions:

  1. What's the interpretation of the quantity $\int_{\mathbb{R}^n}p(H_1|x)p(x|H_1)dx$? Intuitively, it should relate to $P_{suc,H_1}$, but I'm not sure how / why.
  2. Is there any other way I could compute the error probability by integrations over some simpler regions (even if the integrand becomes more complicated)?

Best Answer

I hope I've understood your question. First, the two quantities you suspected might be equal are indeed not equal in general (as you yourself deduced). To see this more explicitly (again, as you have pointed out),

\begin{align} P_{suc,H_1} &= \int_{R_1} p(x|H_1) \text{d}x\\ &= \int_{R_1} 1 \cdot p(x|H_1) \text{d}x + \int_{R_1^c} 0 \cdot p(x|H_1) \text{d}x \end{align}

and for comparison, the other quantity is

\begin{align} \mathbb{E}_{x \sim H_1}\left[ p(H_1 |x) \right] &= \int p(H_1|x)p(x|H_1) \text{d}x\\ &= \int_{R_1} p(H_1|x)p(x|H_1) \text{d}x + \int_{R_1^c} p(H_1|x)p(x|H_1) \text{d}x \end{align}

To answer your first question, note that \begin{align} P_{suc,H_1} &= \int_{R_1} p(x|H_1) \text{d}x = \int \mathbb{1}_{\{ x \in R_1 \}} p(x|H_1) \text{d}x = \mathbb{E}_{x \sim H_1} \left[\mathbb{1}_{\{ x \in R_1 \}} \right]. \end{align} In other words, if you were to draw $x$ according to $H_1$, $P_{suc,H_1}$ is the expectation of the random variable $\mathbb{1}_{\{ x \in R_1 \}}$. This is different from $\mathbb{E}_{x \sim H_1}\left[ p(H_1 |x) \right]$, which asks a similar question but replaces $\mathbb{1}_{\{ x \in R_1 \}}$ with $p(H_1|x)$. The first random variable is an on-off quantity, while the latter is typically a smooth function.

To answer your second question, if you're willing to accept an approximate answer for $P_{suc,H_1} = \mathbb{E}_{x \sim H_1} \left[\mathbb{1}_{\{ x \in R_1 \}} \right]$, try randomly sampling $x \sim H_1$, computing $\mathbb{1}_{\{ x \in R_1 \}}$ for each sample, and taking the sample mean.
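The suggested Monte Carlo estimator can be sketched as follows, again in an assumed 1-D Gaussian setup ($p(x|H_1)=\mathcal{N}(0,1)$, $p(x|H_2)=\mathcal{N}(2,1)$, equal priors) chosen only for illustration:

```python
import math
import random

# Assumed setup: p(x|H_1) = N(0,1), p(x|H_2) = N(2,1), equal priors.
def pdf(x, mu):
    """Unit-variance Gaussian density."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

def in_R1(x):
    """MAP decision with equal priors: is x assigned to H_1?"""
    return pdf(x, 0.0) > pdf(x, 2.0)

random.seed(0)
n = 100_000
hits = sum(in_R1(random.gauss(0.0, 1.0)) for _ in range(n))  # x ~ p(x|H_1)
p_suc_hat = hits / n
print(round(p_suc_hat, 3))  # close to the exact value Phi(1) ~ 0.841
```

Note that even when $R_1$ is a complicated convex polytope, testing membership is just checking a set of linear inequalities $Ax \le b$, so the estimator sidesteps the high-dimensional integration entirely.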
