[Math] Exercise 2.8 in MacKay’s Information Theory, Inference and Learning Algorithms



Could someone walk me through a solution to Ex 2.8?

2.7: Bill tosses a bent coin $N$ times, obtaining a sequence of heads and tails. We assume that the coin has a probability $f_H$ of coming up heads; we do not know $f_H$. If $n_H$ heads have occurred in $N$ tosses, what is the probability distribution of $f_H$? …What is the probability that the $N+1$th outcome will be a head, given $n_H$ heads in $N$ tosses?

2.8: Assuming a uniform prior on $f_H$, $P(f_H)=1$, solve the problem posed in 2.7. Sketch the distribution of $f_H$ and compute the probability that the $N+1$th outcome will be a head, for
(A) $N=3$ and $n_H$=0;
(B) $N=3$ and $n_H=2$;
(C) $N=10$ and $n_H=3$;
(D) $N=300$ and $n_H=29$.

{tip about the beta integral}

Where I am stuck is the switch to continuous probabilities and using integrals rather than sums. I had no problem with 2.4; 2.5 took some doing but was fine. The example in 2.6 made sense as I walked through it.

In working on 2.8, I can write down that posterior = (likelihood × prior) / evidence, and I know that I am trying to solve for the posterior (to find the distribution of $f_H$). So my equation will look something like

$$P(f_H |\,n_H, N) = {P(n_H|\,f_H, N) P(f_H) \over P(n_H|\,N)}$$.

The first factor in the numerator, the likelihood, should just be the binomial probability

$${N \choose n_H} f_H^{n_H} (1-f_H)^{N-n_H}.$$

Based on the statement in the question, I assume that $P(f_H) = 1$ and can ignore it.

The denominator is the marginal probability of $n_H$. I believe this should be an integral, something like $\int_0^1 P(f_H)\, P(n_H \mid f_H, N)\, df_H$, but I am not sure that this is correct, nor how to evaluate it, even with the hints.

I did notice that 2.7 reappears as a worked example later in the book, with additional assumptions, but I need help here too.

Thank you in advance

[Not technically homework as I'm not doing this as part of a course, but it's close enough to tag it]

Best Answer

The question is essentially solved in §3.2. Use the equation $$\text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$ and work it through.

Assume a uniform prior: $P(f_H) = 1$.

The evidence $P(n_H \mid N)$ is the normalizing constant: $\int_0^1 df_H \, f_H^{n_H} (1-f_H)^{N-n_H}$.

Based on the hint about the beta integral, $\int_0^1 df_H \, f_H^{n_H} (1-f_H)^{N-n_H} = \frac{n_H!\,(N-n_H)!}{(N+1)!}$.
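
If you want to convince yourself of that identity, here is a minimal numeric check, assuming SciPy is available (the function names below are just for illustration); only the small cases are integrated directly, since for large $N$ the closed form is the practical route.

```python
# Sanity check: direct numerical integration vs. the factorial formula
# n_H! (N - n_H)! / (N + 1)!  (a sketch; assumes SciPy is installed).
from math import factorial
from scipy.integrate import quad

def beta_integral_numeric(n_H, N):
    """Integrate f^n_H (1 - f)^(N - n_H) over [0, 1] numerically."""
    value, _ = quad(lambda f: f**n_H * (1 - f)**(N - n_H), 0.0, 1.0)
    return value

def beta_integral_closed_form(n_H, N):
    """Closed form n_H! (N - n_H)! / (N + 1)! from the beta integral."""
    return factorial(n_H) * factorial(N - n_H) / factorial(N + 1)

for N, n_H in [(3, 0), (3, 2), (10, 3)]:
    print(N, n_H, beta_integral_numeric(n_H, N), beta_integral_closed_form(n_H, N))
```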

The likelihood is $P(n_H \mid f_H, N) = f_H^{n_H} (1-f_H)^{N-n_H}$. (The binomial coefficient ${N \choose n_H}$ would appear in both the likelihood and the evidence, so it cancels in the posterior and can be dropped.)

So the posterior is just the likelihood divided by the evidence, with the uniform prior contributing a factor of 1:

$$P(f_H \mid n_H, N) = \frac{f_H^{n_H} (1-f_H)^{N-n_H}}{\frac{n_H!\,(N-n_H)!}{(N+1)!}} = \frac{(N+1)!}{n_H!\,(N-n_H)!}\, f_H^{n_H} (1-f_H)^{N-n_H}$$
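
For the "sketch the distribution of $f_H$" part, here is a minimal plotting sketch, assuming NumPy and Matplotlib are available; it just evaluates the normalized posterior above on a grid for the four $(N, n_H)$ cases.

```python
# Sketch of the posterior P(f_H | n_H, N) for the four cases in the exercise.
# A sketch only; assumes NumPy and Matplotlib are installed.
from math import factorial
import numpy as np
import matplotlib.pyplot as plt

f = np.linspace(0.0, 1.0, 500)
cases = [(3, 0), (3, 2), (10, 3), (300, 29)]  # (N, n_H)

for N, n_H in cases:
    # Normalizing constant (N + 1)! / (n_H! (N - n_H)!) from the beta integral.
    norm = factorial(N + 1) / (factorial(n_H) * factorial(N - n_H))
    posterior = norm * f**n_H * (1 - f)**(N - n_H)
    plt.plot(f, posterior, label=f"N={N}, n_H={n_H}")

plt.xlabel("$f_H$")
plt.ylabel("posterior density")
plt.legend()
plt.show()
```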

To find the probability that the $(N+1)$th toss is a head, marginalize over $f_H$. By the sum rule, $$P(h \mid n_H, N) = \int_0^1 df_H \, P(h \mid f_H)\, P(f_H \mid n_H, N)$$

$P(h \mid f_H) = f_H$, so we have $$\frac{(N+1)!}{n_H!\,(N-n_H)!} \int_0^1 df_H\, f_H^{n_H+1} (1-f_H)^{N-n_H}$$

Using the beta integral again, this becomes $$\frac{(n_H+1)! (N-n_H)!}{(N+2)!} \ \times \ \frac{(N+1)!}{n_H! (N-n_H)!} = \frac {n_H +1}{N+2}$$
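
This is Laplace's rule of succession. For concreteness, plugging the four cases from the exercise into $\frac{n_H+1}{N+2}$ is then just arithmetic:

$$\text{(A)}\ \frac{0+1}{3+2} = \frac{1}{5} = 0.2, \qquad \text{(B)}\ \frac{2+1}{3+2} = \frac{3}{5} = 0.6,$$
$$\text{(C)}\ \frac{3+1}{10+2} = \frac{1}{3} \approx 0.33, \qquad \text{(D)}\ \frac{29+1}{300+2} = \frac{30}{302} \approx 0.099.$$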
