[Math] Is this proof of the Monty Hall problem correct

monty-hall solution-verification

The Monty Hall problem can be stated as follows:

Suppose you're on a game show, and you're given the choice of three
doors: Behind one door is a car; behind the others, goats. You pick a
door, say No. 1, and the host, who knows what's behind the doors,
opens another door, say No. 3, which has a goat. He then says to you,
"Do you want to pick door No. 2?" Is it to your advantage to switch
your choice?

Paul Erdös said once

Es sei „hoffnungslos“ für jemanden, der sich in Entscheidungsbäumen
und mit dem Satz von Bayes nicht auskenne, die Lösung zu verstehen

which can be translated to

It is "hopeless" for anyone who is not familiar with decision trees
and Bayes' theorem to understand the solution

I am surprised by this statement, because I think it's possible for anyone to understand the solution without knowing Bayes' theorem. I would state my proof as follows:

You pick one of three doors at random. The probability that you hit a door with a goat is $\frac{2}{3}$. Now the host opens a different door with a goat and asks you if you want to switch. If you're in front of a door with a goat, then you win if you switch; otherwise you lose. The question "Do you want to switch?" is thus equivalent to asking "Do you think you hit a door with a goat?". Thus you win with probability $\frac{2}{3} \approx 66.7\%$ if you switch.
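The argument above can be checked empirically with a short simulation. This is only a sketch: the function names `play` and `win_rate` are made up for illustration, and it assumes the host picks uniformly at random whenever two goat doors are available to him.

```python
import random

def play(switch, rng):
    """Play one round; return True if the player ends up with the car."""
    doors = [1, 2, 3]
    car = rng.choice(doors)    # car is placed uniformly at random
    pick = rng.choice(doors)   # player's first choice
    # Host opens a goat door that is not the player's pick
    # (assumption: uniform choice when two such doors exist).
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won over many independent trials."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("switch:", win_rate(True))
print("stay:  ", win_rate(False))
```

The switching rate should come out near 2/3 and the staying rate near 1/3. Note that this measures the overall win rate of the strategy, not the probability conditional on a specific door having been opened.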

Now that I have read the statement from Erdös, I am unsure about the above proof. Is anything about the above argument incorrect? Did I implicitly use Bayes' theorem? Or was Erdös simply wrong in saying that you can't understand the goat problem without Bayes' theorem?

Best Answer

Lovely quote. Where did you get it? Because it is absolutely correct. (Note: At first, Erdös famously thought switching couldn't matter, until he was shown a simulation that proved it did. Apparently, from your quote, he then considered why he was wrong. While it seems trivial, very few people do this, which is why I want to see where you got that quote.)

Yes, something is not correct in your proof. It is not a proof. It is a way to extend intuition from one case to another without resorting to actual logic. Sometimes you can get the right answer that way, but it is serendipity when you do.

For example, it is the same kind of reasoning that leads people to say "Since there are two doors left, and it was equally likely that the car was placed behind either, each now has a 50% chance." Unless your solution can show why that reasoning is inferior - and not just that you got a different answer - it is not a proof. "Probability paradoxes" like this persist because such non-proofs are considered acceptable whenever people believe they give the right answer.

What Erdös means in that quote, is that in order to prove the result, you need to compare the probabilities that the host would open Door #3 when (A) the car is behind Door #1, (B) the car is behind Door #2, or (C) the car is behind Door #3. Two of these are obvious: case (B) is 100%, and case (C) is 0%.

The point Erdös was making, is that the answer is determined entirely by how the host decides whether to open Door #2, or Door #3, in case (A). We were not told, explicitly, what that probability is, so we can only assume it is 50% (since the host could also have opened Door #2).

By Bayes' Theorem (which is why it is important), the proven probability for each case is the probability that the host opens Door #3 in that case, divided by the sum of those three probabilities (which is 150%); the equal prior of 1/3 for each case cancels. That is, (A) (50%)/(150%)=1/3, (B) (100%)/(150%)=2/3, and (C) (0%)/(150%)=0.
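Written out in full, the calculation might look as follows. The symbols are notation introduced here for illustration: $C_i$ means "the car is behind Door #$i$" and $H_3$ means "the host opens Door #3". The sketch assumes a prior of $\frac{1}{3}$ for each door and the 50% guess for case (A).

$$P(C_1 \mid H_3) = \frac{P(H_3 \mid C_1)\,P(C_1)}{\sum_{i=1}^{3} P(H_3 \mid C_i)\,P(C_i)} = \frac{\frac{1}{2}\cdot\frac{1}{3}}{\frac{1}{2}\cdot\frac{1}{3} + 1\cdot\frac{1}{3} + 0\cdot\frac{1}{3}} = \frac{1}{3}$$

$$P(C_2 \mid H_3) = \frac{1\cdot\frac{1}{3}}{\frac{1}{2}\cdot\frac{1}{3} + 1\cdot\frac{1}{3} + 0\cdot\frac{1}{3}} = \frac{2}{3}, \qquad P(C_3 \mid H_3) = 0$$

Because every term carries the same factor $\frac{1}{3}$, it cancels, which is why dividing 50%, 100%, and 0% by their sum of 150% gives the same result.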

But what if we know that the host makes that decision in a biased way? Say he opens Door #3 in case (A) with probability 75%. Then the answers are (A) (75%)/(175%)=3/7, (B) (100%)/(175%)=4/7, and (C) 0. Or - and this is more easily demonstrated - what if he always opens Door #3 if it has a goat? Then the probability in case (A) is 100%, and the answers are (A) 1/2, (B) 1/2, and (C) 0. This is what Erdös implicitly assumed before he was shown the simulation, and the reason it is wrong is that it doesn't take the host's decision tree into account.
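The dependence on the host's bias can be made concrete with exact fractions. The following is a minimal sketch, not a canonical implementation: the function name `posterior_given_door3` and the parameter `q` (the probability that the host opens Door #3 when the car is behind Door #1) are names chosen here for illustration, and a uniform prior of 1/3 per door is assumed.

```python
from fractions import Fraction

def posterior_given_door3(q):
    """Posterior car location, given the player picked Door #1
    and the host opened Door #3.

    q: probability that the host opens Door #3 in case (A),
       i.e. when the car is behind Door #1 (assumed parameter).
    """
    prior = Fraction(1, 3)  # assumption: car placed uniformly at random
    # Probability that the host opens Door #3 in each case:
    # (A) car behind #1: q, (B) car behind #2: 1, (C) car behind #3: 0.
    likelihood = {1: Fraction(q), 2: Fraction(1), 3: Fraction(0)}
    total = sum(likelihood[d] * prior for d in likelihood)
    return {d: likelihood[d] * prior / total for d in likelihood}

for q in (Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    post = posterior_given_door3(q)
    print(q, post[1], post[2], post[3])
# prints:
# 1/2 1/3 2/3 0
# 3/4 3/7 4/7 0
# 1 1/2 1/2 0
```

The three rows match the unbiased host, the 75% host, and the host who always opens Door #3 when it has a goat.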
