Solved – How to combine two beta-binomial distributions

bayesian, beta-binomial distribution, mixture-distribution

Say I have the following situation. I have two weighted coins:

Coin 1: In the past I've seen this coin flipped 10 times, 8 of which came up heads. So I can model the probability of $n$ heads out of $N$ coin tosses with a beta-binomial distribution with beta parameters $(8, 2)$.

Coin 2: This one I've seen flipped 15 times, 4 of which came up heads. So again this could be a beta-binomial with beta parameters (4, 11).

Now say I draw $N$ times, at random and with replacement, from a bag of 5 coins, 3 of type 1 and 2 of type 2: each time I pick a coin, flip it, and put it back. How do I model the probability of getting $n$ heads out of $N$ tosses?

At first I naively thought it would be $(3/5)P(n\mid\text{coin 1})+(2/5)P(n\mid\text{coin 2})$, but that would be the case if only one coin were chosen and flipped $N$ times, not if a new coin is chosen before each flip.

I guess what I need is a binomial model with a prior that takes into account the uncertainty about which coin is being flipped, some weighted combination of the two beta distributions. How does one go about this, and is it computationally tractable?

Best Answer

The probability of drawing the first coin follows a Bernoulli distribution with probability $\pi = 3/5$ in a single trial. Since you make $N = 1000$ tosses, you expect to see the first coin $N\pi$ times and the second one $N(1-\pi)$ times. The probability of tossing a head with either coin follows a Bernoulli distribution, with parameters $p_1$ and $p_2$ respectively. Drawing a coin and throwing a head are independent events, so we expect

$$ N\pi p_1 + N(1-\pi)p_2 $$

heads in total.

This can be checked using a simple simulation:

# Pick a coin for each toss, then flip it: coin 1 (heads with prob. p1)
# is picked with probability pi, otherwise coin 2 (heads with prob. p2).
simfun <- function(N, pi, p1, p2) {
  sum(rbinom(N, 1, ifelse(runif(N) < pi, p1, p2)))
}

set.seed(123)
N  <- 1000
pi <- 3/5     # proportion of type-1 coins in the bag
p1 <- 8/10    # point estimate for coin 1
p2 <- 4/15    # point estimate for coin 2

that returns

> summary(replicate(1e5, simfun(1000, pi, p1, p2)))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  516.0   576.0   587.0   586.7   597.0   659.0
> N*pi*p1 + N*(1-pi)*p2
[1] 586.6667
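The simulation above treats $p_1$ and $p_2$ as known point estimates. To keep the beta-binomial flavour of the question, one could also draw the coin probabilities from their Beta distributions on every replicate; a minimal sketch, assuming the Beta$(8, 2)$ and Beta$(4, 11)$ parameters stated in the question:

```r
# Sketch: propagate the Beta uncertainty about p1 and p2 instead of
# fixing them at point estimates (Beta parameters from the question).
simfun_beta <- function(N, pi, a1, b1, a2, b2) {
  p1 <- rbeta(1, a1, b1)  # coin 1 heads probability ~ Beta(8, 2)
  p2 <- rbeta(1, a2, b2)  # coin 2 heads probability ~ Beta(4, 11)
  sum(rbinom(N, 1, ifelse(runif(N) < pi, p1, p2)))
}

set.seed(123)
heads <- replicate(1e4, simfun_beta(1000, 3/5, 8, 2, 4, 11))
mean(heads)   # close to N * (pi * 8/10 + (1 - pi) * 4/15)
sd(heads)     # noticeably wider spread than with fixed p1, p2
```

The mean stays near the same expected count, but the spread is larger because each replicate now carries the uncertainty about the coins' true probabilities.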

Formally, what you have is a mixture of two Bernoulli distributions $f_1$, $f_2$ with parameters $p_1$ and $p_2$ and mixing proportion $\pi$:

$$ f(x; \pi, p_1, p_2) = \pi f_1(x; p_1) + (1-\pi) f_2(x; p_2) $$

Unfortunately, if the coins are shuffled and all you observe are heads and tails, the individual coins are indistinguishable (a single throw shows you only a head or a tail and tells you nothing about which coin produced it). In total you observe $k$ heads in $N$ trials, and this also does not let you identify the coins used in the trials: it could have been a single coin that comes up heads with overall probability $\alpha = \pi p_1 + (1-\pi)p_2$, or any number of distinct coins, so such a model is unidentifiable.
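This unidentifiability can be checked numerically: a single coin with $\alpha = \pi p_1 + (1-\pi)p_2$ produces the same distribution of head counts as the per-toss mixture. A self-contained sketch:

```r
# Sketch: head counts from the per-toss mixture and from a single coin
# with the pooled probability alpha are distributed identically.
set.seed(42)
N <- 1000; pi <- 3/5; p1 <- 8/10; p2 <- 4/15
alpha <- pi * p1 + (1 - pi) * p2

mix    <- replicate(1e4, sum(rbinom(N, 1, ifelse(runif(N) < pi, p1, p2))))
single <- replicate(1e4, rbinom(1, N, alpha))

c(mean(mix), mean(single))   # both close to N * alpha = 586.67
c(sd(mix), sd(single))       # both close to sqrt(N * alpha * (1 - alpha))
```

This also answers the original question directly: with a fresh coin drawn before every flip, $n$ is simply Binomial$(N, \alpha)$.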

If your experiment were different, say you picked a coin at random and then tossed it a number of times, then the outcomes would tell you something about the coin you picked. In that case you could use a finite mixture model and estimate it easily using e.g. the EM algorithm.
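A minimal sketch of such an EM fit, under the assumed setup (not from the original answer) that each randomly picked coin is tossed $m$ times, so each draw contributes one head count out of $m$:

```r
# Sketch: EM for a two-component binomial mixture. Each of K draws picks
# a coin with probability pi, then tosses it m times; we observe only
# the K head counts. Setup and starting values are illustrative.
set.seed(1)
K <- 500; m <- 20
pi_true <- 3/5; p1_true <- 8/10; p2_true <- 4/15
z <- rbinom(K, 1, pi_true)                       # latent coin labels
x <- rbinom(K, m, ifelse(z == 1, p1_true, p2_true))

pi_hat <- 0.5; p1_hat <- 0.6; p2_hat <- 0.4      # crude starting values
for (it in 1:200) {
  # E-step: posterior responsibility of component 1 for each draw
  w1 <- pi_hat * dbinom(x, m, p1_hat)
  w2 <- (1 - pi_hat) * dbinom(x, m, p2_hat)
  r  <- w1 / (w1 + w2)
  # M-step: responsibility-weighted parameter updates
  pi_hat <- mean(r)
  p1_hat <- sum(r * x) / (m * sum(r))
  p2_hat <- sum((1 - r) * x) / (m * sum(1 - r))
}
round(c(pi_hat, p1_hat, p2_hat), 3)   # roughly (0.6, 0.8, 0.27)
```

Because each draw now yields several tosses of the same coin, the head counts carry information about which coin was picked, and the mixture becomes identifiable.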