Solved – Arithmetic for updating likelihoods using Bayes' theorem

bayesian

This may be an elementary question, which may be why I have not been able to find it on Stack Exchange or MathOverflow, but I am having trouble with the arithmetic involved in updating likelihoods using Bayes' theorem for a problem I am working on.

Background:

I am attempting to assign likelihood forecasts to future events that have few or no precedents. Unlike most of the literature and texts on Bayes, which use previously known distributions to assign likelihoods to future events with similar parameters, my situation is founded on expert opinion only, with few or no reasonable distributions to reference.

Example:

GM announced they are developing a new car but didn't say when it would be released. The Production Manager for KIA needs to know when GM will be ready to release it, so that KIA can release its own new car around the same time.

KIA knows that the new car needs the following components in order to be ready for release: (1) engine, (2) transmission, (3) body, (4) wheels and suspension.
KIA's experienced engineers state that for a new project like this they are 90% confident that it can be completed in two years. KIA also found out that GM tested the new transmission in another SUV and it worked as designed, with a 95% success rate. The same engineers stated that, given this transmission test, a car can be completed within that time frame 70% of the time.

The way I have it, at this point KIA can start the Bayesian calculation with the initial sample as below:

   A = GM will release the new car in two years
   B1 = GM will successfully test a new transmission
   P(A) = Prior Probability that GM will release the new car in two years
   P(B1) = Probability that GM will successfully test a new transmission
   P(B1|A) = Likelihood that given a successful transmission test, the car will be released within 2 years

Assigning values as follows

   P(A) = .9
   P(B1) = .95
   P(B1|A) = .7

$$
P(A|B_1) = \frac {P(A)P(B_1|A)}{P(A)P(B_1|A)+P(\bar{A})P(B_1|\bar{A})}
$$

$$
P(A|B_1) = \frac{.9 \times .7}{(.9 \times .7)+(.1 \times .3)} = .9545
$$
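
For reference, here is the same arithmetic as a quick Python check (the .3 is the value I assumed for $P(B_1|\bar{A})$, i.e. $1 - .7$; Bayes' theorem itself does not supply it):

    # Two-hypothesis Bayes update, reproducing the calculation above.
    p_a = 0.9               # prior P(A): car released in two years
    p_b1_given_a = 0.7      # P(B1|A), as assigned above
    p_b1_given_not_a = 0.3  # assumed value for P(B1|not A)

    numerator = p_a * p_b1_given_a
    posterior = numerator / (numerator + (1 - p_a) * p_b1_given_not_a)
    print(round(posterior, 4))  # 0.9545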

Shortly after the KIA statistics department gave this update, GM announced that they had tested their new engine and it had a 98% success rate across all its tests. The KIA engineers said that, typically, if there is a successful engine test, there is an 80% likelihood that a car will be completed on time, but that they did not know what the likelihood of on-time completion was given both an engine and a transmission test.

Here are the values for our second piece of evidence. It should be noted that the two pieces of evidence are independent in this case, but they are not in all cases: for example, the body must go on after the suspension.

   P(B2) = .98
   P(B2|A) = .8

So here is where I am having trouble: arithmetically integrating the posterior $P(A|B_1)$ into the calculation for $P(A|B_1,B_2)$, given that priors should stay constant. As I mentioned, some events within $\{B_1, \ldots, B_n\}$ are independent, while others are conditional.

I have seen the Wikipedia entry which describes the three-event Bayes extension:

$$
P(A|B_1,B_2) = \frac {P(B_2|A,B_1)P(B_1|A)P(A)}{P(B_2|B_1)P(B_1)}
$$

However, what about a fourth and fifth extension?

Most of the books and online resources I have do not show the steps for updating priors in any fashion that I can follow. It could be that I am too far removed from my undergraduate calculus days to interpret them, but my fear is that I need significant experience in set theory and graduate-level mathematics in order to do what would appear to be a simple calculation. This exchange is the closest I could find, and even it does not step through the process. The fact that, after a week of searching, I have not found a basic tutorial on the mechanics of updating Bayes' theorem beyond the first implementation (not, mind you, on what Bayes' theorem is and how it works; there are more than enough of those) makes me think it is not a trivial calculation. Is there a straightforward way to do this updating without graduate-level mathematics?

Note: I am aware of the irony related to the inherent difficulty of the "updating problem" with respect to Bayes, as Yudkowsky has gone on about it for some time. I was assuming, perhaps incorrectly, that those working on it were referring to much more complex iterations, but I am aware that I could be running into that very issue.

Best Answer

I'll start by answering your question about updating events with the "fourth and fifth extensions." As you suspected, the arithmetic is indeed quite simple.

First, recall how Bayes' theorem is derived from the definition of conditional probability:

$$
P(A|B) = \frac{P(A \cap B)}{P(B)}
$$

By factoring the numerator as $P(A \cap B) = P(B|A)P(A)$, we get the more familiar form:

$$
P(A|B) = \frac{P(B|A)P(A)}{P(B)}
$$

Now consider the case where we don't have just one event $B$, but two or more events $B_1, B_2, \ldots$ For that, we can derive the three-event Bayes extension you cite using the chain rule of probability, which is (from Wikipedia):

$$
P(A_1, \ldots, A_n) = \prod_{k=1}^{n} P(A_k \mid A_1, \ldots, A_{k-1})
$$

For $B_1$ and $B_2$, we start with the definition of conditional probability:

$$
P(A|B_1,B_2) = \frac{P(A \cap B_1 \cap B_2)}{P(B_1 \cap B_2)}
$$

And use the chain rule on both the numerator and the denominator:

$$
P(A|B_1,B_2) = \frac{P(B_2|A,B_1)P(B_1|A)P(A)}{P(B_2|B_1)P(B_1)}
$$

And just like that, we've rederived the equation you cite from Wikipedia. Let's try adding another event:

$$
P(A|B_1,B_2,B_3) = \frac{P(A \cap B_1 \cap B_2 \cap B_3)}{P(B_1 \cap B_2 \cap B_3)}
$$

$$
P(A|B_1,B_2,B_3) = \frac{P(B_3|A,B_1,B_2)\,P(B_2|A,B_1)\,P(B_1|A)\,P(A)}{P(B_3|B_1,B_2)\,P(B_2|B_1)\,P(B_1)}
$$

Adding a fifth event is equally simple (an exercise for the reader). But you'll surely notice a pattern, namely that the answer to the three-event version is held within the answer to the four-event version, so that we can rewrite this as:

$$
P(A|B_1,B_2,B_3) = \frac{P(B_3|A,B_1,B_2)}{P(B_3|B_1,B_2)} \cdot \frac{P(B_2|A,B_1)\,P(B_1|A)\,P(A)}{P(B_2|B_1)\,P(B_1)}
$$

$$
P(A|B_1,B_2,B_3) = \frac{P(B_3|A,B_1,B_2)}{P(B_3|B_1,B_2)} \, P(A|B_1,B_2)
$$

Or more generally, the rule for updating the posterior after the nth piece of evidence:

$$
P(A|B_1,\ldots,B_n) = \frac{P(B_n|A,B_1,\ldots,B_{n-1})}{P(B_n|B_1,\ldots,B_{n-1})} \, P(A|B_1,\ldots,B_{n-1})
$$

That fraction is what you're interested in. Now, what you're pointing out is that this might not be easy to calculate, not because of any arithmetic difficulty, but because of dependencies among the $B$'s. If we say each $B$ is independently distributed, updating becomes very simple:

$$
P(A|B_1,\ldots,B_n) = \frac{P(B_n|A)}{P(B_n)} \, P(A|B_1,\ldots,B_{n-1})
$$

(In fact, you'll notice that this is a simple application of Bayes' theorem!) The complexity of that fraction depends on which of the previous pieces of evidence your new piece of evidence depends on. The importance of conditional dependence between your variables and your pieces of evidence is precisely why Bayesian networks were developed (in fact, the above describes the factorization of a Bayesian network).
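
One way to code up that sequential update is the odds form of Bayes' theorem (a rearrangement of the fraction above). Here is a minimal sketch in Python, assuming the $B_i$ are conditionally independent given both $A$ and $\bar{A}$, and that you can supply $P(B_i|A)$ and $P(B_i|\bar{A})$ for each piece of evidence; the numbers below are illustrative placeholders, not values endorsed by the word problem:

    def bayes_update(prior, p_b_given_a, p_b_given_not_a):
        """One Bayesian update in odds form; returns P(A|B)."""
        prior_odds = prior / (1.0 - prior)
        posterior_odds = prior_odds * (p_b_given_a / p_b_given_not_a)
        return posterior_odds / (1.0 + posterior_odds)

    # Sequential updating: each posterior becomes the next prior.
    # The likelihood pairs below are placeholders for illustration.
    p = 0.9
    for p_b_given_a, p_b_given_not_a in [(0.7, 0.3), (0.8, 0.2)]:
        p = bayes_update(p, p_b_given_a, p_b_given_not_a)
    print(round(p, 4))  # 0.9882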

Now, let's talk about your example. First, your interpretation of the word problem has an issue. Your interpretations of 70% and 80% are, respectively,

P(B1|A) = .7
P(B2|A) = .8

But (per your definitions) $A$ means the car will be completed on time, $B_1$ means GM tests the transmission successfully, and $B_2$ means there is a successful engine test, which means you're getting them backwards; they should be

P(A|B1) = .7
P(A|B2) = .8

Now, however, the word problem doesn't really make sense. Here are the three problems:

1) They're effectively giving you what you're looking for: saying "given this transmission test a car can be completed within that time frame 70% of the time," and then asking "what is the probability a car will be completed in that time?"

2) The evidence pushes you in the opposite direction from what common sense would expect. The probability was 90% before you knew about the transmission; how can learning of a successful test lower it to 70%?

3) There is a difference between a "95% success rate" and a 95% chance that a test was successful. Success rate can mean a lot of things (for example, the proportion of the time a part doesn't break), which makes it an engineering question about the quality of the part, not a subjective assessment of "how sure are we the test succeeded?" As an illustrative example, imagine we were talking about a critical piece of a rocket ship, which needs at least a 99.999% chance of working during a flight. Saying "the piece breaks 20% of the time" does not mean there is an 80% chance the test succeeded, and thus an 80% chance you can launch the rocket next week. Perhaps the part will take 20 years to develop and fix; there is no way of knowing based on the information you're given.

For these reasons, the problem is very poorly worded. But, as I indicated above, the arithmetic involved in updating based on multiple events is quite straightforward. In that sense, I hope I answered your question.
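
To convince yourself that the recursive update really does match conditioning on all the evidence at once, here is a quick numerical check in Python over an arbitrary, made-up joint distribution on $(A, B_1, B_2)$ (the weights are placeholders I chose for illustration):

    from itertools import product

    # Made-up joint distribution over (A, B1, B2); weights are arbitrary.
    weights = [3, 1, 2, 2, 1, 3, 2, 2]
    outcomes = list(product([True, False], repeat=3))
    joint = {o: w / sum(weights) for o, w in zip(outcomes, weights)}

    def prob(pred):
        """Probability of the event defined by pred under the joint."""
        return sum(p for o, p in joint.items() if pred(*o))

    # Direct: P(A | B1, B2)
    direct = (prob(lambda a, b1, b2: a and b1 and b2)
              / prob(lambda a, b1, b2: b1 and b2))

    # Recursive: P(A|B1) * P(B2|A,B1) / P(B2|B1)
    p_a_given_b1 = prob(lambda a, b1, b2: a and b1) / prob(lambda a, b1, b2: b1)
    p_b2_given_ab1 = (prob(lambda a, b1, b2: a and b1 and b2)
                      / prob(lambda a, b1, b2: a and b1))
    p_b2_given_b1 = (prob(lambda a, b1, b2: b1 and b2)
                     / prob(lambda a, b1, b2: b1))
    recursive = p_a_given_b1 * p_b2_given_ab1 / p_b2_given_b1

    print(abs(direct - recursive) < 1e-12)  # True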

ETA: Based on your comments, I'd say you should rework the question from the ground up. You should certainly get rid of the idea of the 95%/98% "success rate," which in this context is an engineering question and not a Bayesian-statistics one. Secondly, an estimate like "we are 70% confident, given that this part works, that the car will be ready in two years" is a posterior probability, not a piece of evidence; you can't use it to update what you already have.

In the situation you are describing, you need all four parts to work by the deadline. Thus, the smartest thing to do would be simply to ask, "What is the probability each part will be working in two years?" Then you take the product of those probabilities (assuming independence), and you have the probability that the entire car will be ready in two years.
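
As a sketch of that calculation (the per-component probabilities below are made-up placeholders, and the product is only valid under the independence assumption just stated):

    # Hypothetical probabilities that each component is ready in two years.
    components = {
        "engine": 0.95,
        "transmission": 0.90,
        "body": 0.85,
        "wheels_and_suspension": 0.97,
    }

    p_all_ready = 1.0
    for part, p_ready in components.items():
        p_all_ready *= p_ready  # valid only if the components are independent

    print(round(p_all_ready, 4))  # 0.7049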

Stepping back, it sounds like you are actually trying to combine multiple subjective predictions into one. In that case, my recommendation would be to fire your engineers. Why? Because they are telling you that they are 90% confident the car will be ready in two years, but then, after learning of a successful transmission test, they downgrade their estimate to 70%. If that's the talent we're working with, no Bayesian statistics is going to help us :-)

More seriously: perhaps if you were more specific about the type of problem (which is probably something like combining $P(A|B_1)$ and $P(A|B_2)$), I could give you some more advice.
