Multi-armed Bandit
This is a particular case of a multi-armed bandit problem. I say a particular case because generally we don't know any of the probabilities of heads (here we know one of the coins has probability 0.5).
The issue you raise is known as the exploration vs. exploitation dilemma: do you explore the other options, or do you stick with what you think is best? If you knew all the probabilities, the optimal solution would be immediate: simply choose the coin with the highest probability of winning. The problem, as you have alluded to, is that we are unsure what the true probabilities are.
There is lots of literature on the subject, and there are many deterministic algorithms, but since you tagged this Bayesian, I'd like to tell you about my personal favourite solution: the Bayesian Bandit!
The Bayesian Bandit Solution
The Bayesian approach to this problem is very natural. We are interested in answering "What is the probability that coin X is the better of the two?".
A priori, assuming we have observed no coin flips yet, we have no idea what the probability of heads for coin B might be; denote this unknown $p_B$. So we should assign a uniform prior distribution to this unknown probability. By contrast, our prior (and posterior) for coin A is trivially concentrated entirely at 1/2.
As you have stated, we observe 1 head and 2 tails from coin B, so we need to update our posterior distribution. Assuming a uniform prior and Bernoulli coin flips, our posterior is a $\mathsf{Beta}(1 + 1, 1 + 2)$. We can now compare the posterior distributions of A and B.
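The conjugate update above is a one-liner: add the observed heads and tails to the prior's parameters. A minimal sketch using SciPy (the prior $\mathsf{Beta}(1, 1)$ is the uniform distribution):

```python
from scipy.stats import beta

# Conjugate update: a uniform Beta(1, 1) prior plus 1 head and 2 tails
# from coin B gives a Beta(1 + 1, 1 + 2) = Beta(2, 3) posterior.
heads, tails = 1, 2
posterior_B = beta(1 + heads, 1 + tails)

print(posterior_B.mean())  # 0.4, versus exactly 0.5 for coin A
```

The posterior mean of 0.4 sits below coin A's known 0.5, but the posterior is wide: three flips is very little evidence.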
Finding an approximately optimal strategy
Now that we have the posteriors, what do we do? We are interested in answering "What is the probability that coin B is the better of the two?" (Remember, from our Bayesian perspective, although there is a definite answer to which coin is better, we can only speak in probabilities):
$$w_B = P( p_B > 0.5 )$$
The approximately optimal solution is to choose B with probability $w_B$ and A with probability $1 - w_B$. This scheme maximizes our expected gains. $w_B$ can be computed numerically, since we know the posterior distribution, but an interesting alternative is the following:
1. Sample $p_B$ from the posterior of coin B.
2. If $p_B > 0.5$, choose coin B; else choose coin A.
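Both routes to $w_B$ can be sketched in a few lines: the exact value is one minus the posterior CDF at 1/2, and the two-step sampling scheme above recovers it in Monte Carlo fashion.

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(0)

# Exact: w_B = P(p_B > 0.5) under the Beta(2, 3) posterior,
# via the survival function (1 - CDF).
w_B_exact = beta(2, 3).sf(0.5)

# Monte Carlo version of the two-step scheme: sample p_B from the
# posterior and choose coin B whenever the draw exceeds 1/2.
draws = rng.beta(2, 3, size=1_000_000)
w_B_mc = np.mean(draws > 0.5)

print(w_B_exact)  # 0.3125
print(w_B_mc)     # close to 0.3125
```

So after one head and two tails, we would still pick coin B roughly 31% of the time, which is exactly the exploration the dilemma calls for.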
This scheme is also self-updating. When we observe the outcome of choosing coin B, we update our posterior with this new information and select again. This way, if coin B is really bad we will choose it less, and if coin B is in fact really good, we will choose it more often. Of course, we are Bayesians, hence we can never be absolutely sure coin B is better. Choosing probabilistically like this is the most natural solution to the exploration-exploitation dilemma.
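The self-updating loop can be simulated end to end. The sketch below assumes a hypothetical true heads probability of 0.6 for coin B (any value would do); coin A's probability is known, so only B's posterior is updated.

```python
import numpy as np

rng = np.random.default_rng(1)
p_true_B = 0.6            # hypothetical true heads probability of coin B
a_B, b_B = 1, 1           # uniform Beta(1, 1) prior on p_B

counts = {"A": 0, "B": 0}
for _ in range(5000):
    # Sample p_B from the current posterior; coin A is known to be 1/2.
    if rng.beta(a_B, b_B) > 0.5:
        counts["B"] += 1
        head = rng.random() < p_true_B   # flip coin B
        a_B += head                      # update posterior with outcome
        b_B += 1 - head
    else:
        counts["A"] += 1

print(counts)  # coin B should dominate as evidence accumulates
```

Because B's true probability exceeds 1/2, its posterior mass above 0.5 grows as flips accumulate, and the loop chooses B increasingly often while never fully abandoning exploration.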
This is a particular example of Thompson Sampling. More information, and cool applications to online advertising, can be found in Google's research paper and Yahoo's research paper. I love this stuff!
As already noted by @whuber in a comment on the answer by @BruceET, this is not really a Bayesian scenario, since you don't seem to mention any data (nor any likelihood).
From what you are saying, you know that $p \sim \mathsf{Beta}(a, b)$ and that $p \ge 1/2$, which translates to knowing that $p$ is distributed according to a beta distribution with parameters $a, b$, left-truncated at $1/2$.
Same with the Dirichlet distribution: your knowledge that $p_1+p_2\geq p_3+p_4$ is a constraint on the distribution, not an "update" of the prior. Moreover, notice that this constraint may lead to a situation that is not possible under the Dirichlet distribution, so the statements may in fact be contradictory. The statement is, in effect, that $p_1, p_2, p_3, p_4$ are distributed according to a Dirichlet-like distribution subject to the constraint.
So...
- If you are saying that for $p$ you assume a truncated beta distribution as a prior, and want to use it together with some likelihood function and data, it is no longer conjugate to the binomial distribution, so you would need to use Markov chain Monte Carlo for estimation. Truncated distributions can be defined in any probabilistic programming framework, e.g. Stan, PyMC3, JAGS, etc.
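To make the loss of conjugacy concrete, here is a minimal random-walk Metropolis sketch in plain NumPy/SciPy for the truncated-beta prior with a binomial likelihood. The hyperparameters $(a, b) = (2, 2)$ and the data (7 heads in 10 flips) are hypothetical placeholders; in practice a framework like Stan or PyMC3 would handle this for you.

```python
import numpy as np
from scipy.stats import beta, binom

rng = np.random.default_rng(2)

a, b = 2.0, 2.0   # hypothetical prior hyperparameters
k, n = 7, 10      # hypothetical data: 7 heads in 10 flips

def log_post(p):
    # Beta(a, b) prior truncated to [1/2, 1], times a binomial likelihood
    # (the truncation's normalizing constant cancels in the MH ratio).
    if not 0.5 <= p <= 1.0:
        return -np.inf
    return beta.logpdf(p, a, b) + binom.logpmf(k, n, p)

samples = []
p = 0.7                                  # start inside the support
for _ in range(20_000):
    prop = p + rng.normal(0, 0.05)       # random-walk proposal
    if np.log(rng.random()) < log_post(prop) - log_post(p):
        p = prop                          # accept
    samples.append(p)

posterior = np.array(samples[5_000:])    # drop burn-in
print(posterior.mean())
```

Every retained sample respects the $p \ge 1/2$ constraint because proposals outside the truncated support are rejected outright.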
- Same as above applies to the "Dirichlet"-like distribution, but since this is a custom distribution, it would be much more complicated (I have no easy solution for you).
- If you are saying that the facts you mentioned are the only information that you have and will have, and given this information you want to learn something about the distribution (e.g. expected value, quantiles), then this is a typical case for standard Monte Carlo simulation. For the truncated beta, you could simply use inverse transform sampling, which is a simple and efficient way of sampling. For the "Dirichlet"-like distribution it would, again, be more complicated, but there are many possible approaches, ranging from simple accept-reject sampling to more sophisticated solutions.
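Inverse transform sampling for the truncated beta is short enough to sketch directly: draw a uniform variate between the CDF at the truncation point and 1, then push it through the inverse CDF. The hyperparameters $(a, b) = (2, 2)$ below are hypothetical.

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(3)
a, b = 2.0, 2.0                      # hypothetical hyperparameters

# Inverse transform sampling from Beta(a, b) truncated to [1/2, 1]:
# draw u uniformly between F(1/2) and F(1) = 1, then invert the CDF.
F_lo = beta.cdf(0.5, a, b)
u = rng.uniform(F_lo, 1.0, size=100_000)
samples = beta.ppf(u, a, b)

print(samples.min() >= 0.5)   # True: every draw respects the constraint
```

Expectations and quantiles of the truncated distribution then come straight from the samples, e.g. `samples.mean()` or `np.quantile(samples, 0.9)`.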
Best Answer
What you're referring to in the first part of the question is the beta-binomial model, where the binomial distribution is assumed as the likelihood and the beta as a prior; hence, by conjugacy, the posterior is also a beta distribution.
Your problem description in the second part describes a different scenario because it is multivariate. If you know that $p_1 > p_2$, this means that the parameters are dependent and you are talking about some multivariate distribution for the parameters (vs two univariate beta distributions). In such a case, you cannot use two (independent) beta-binomial models. The constraint can be imposed by choosing a multivariate prior for the parameters. For such a model you won't have a closed-form solution, so you would need to use MCMC or some other kind of approximate inference.
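As one simple instance of such approximate inference, the constrained *prior* on $(p_1, p_2)$ with $p_1 > p_2$ can be sampled by accept-reject: draw from two independent beta distributions (hypothetical hyperparameters below) and keep only the pairs satisfying the constraint. Conditioning on data would still require MCMC, as noted above.

```python
import numpy as np

rng = np.random.default_rng(4)

# Accept-reject sketch of a joint prior on (p1, p2) subject to p1 > p2:
# sample independent Beta(2, 2) draws and discard violating pairs.
p1 = rng.beta(2, 2, size=200_000)
p2 = rng.beta(2, 2, size=200_000)
keep = p1 > p2
p1, p2 = p1[keep], p2[keep]

print(np.all(p1 > p2))    # True by construction
```

Note that after conditioning on `p1 > p2` the two parameters are no longer independent, which is exactly why two separate beta-binomial models cannot represent this prior.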