[Math] The black/white hat problem. Is it 50-50?

puzzle statistics

I was asking my friends a riddle about identifying hats. Each person has to correctly identify the colour of their own hat, which was put on their head randomly.
There is no defined number of either colour. So they could be all white or all black or any combination in between.

They gave an answer that gets 50% right but I fired back that getting 50% right is what you would expect, on average, for straight guesses.
They claimed that this would depend on the colours of the hats that are on the people's heads. In other words, if everyone was wearing black then the 50% rule would not apply.
This just doesn't "feel" right to me.

Who is correct?

Edit:

This is the puzzle I asked.
You have 100 people standing one behind the other such that the last person can see all the people in front of him/her and so on.
So the last one sees 99, the next sees 98, etc.
They each have a hat put on their head which is black or white. They have no idea how many of each exist.

Assuming they plan a strategy in advance, how many can get their hat right?
They said that the best way is for the back person to say the colour of the hat on person 99. Person 99 can then say his own colour. Then 98 will say the colour of the one in front, etc.
This way I am guaranteed at least 50 right, and maybe more if two consecutive people have the same colour.
My claim was that 50% guaranteed is the same as random (ignoring the extra lucky ones if there are consecutive hats of the same colour). Their counter-claim was that the 50% random guess would only be right if there were exactly 50 of each colour.
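The pairing strategy described above can be checked with a short sketch. The function name `pairing_strategy` and the list layout (index 0 is the person at the back) are assumptions made for illustration; the logic follows the description: the rear member of each pair names the hat directly in front, and the front member repeats it.

```python
import random

def pairing_strategy(hats):
    """Count correct answers under the pairing strategy.

    hats[0] is the person at the back of the line, hats[-1] the person
    at the front. People are paired off from the back.
    """
    correct = 0
    for i in range(0, len(hats) - 1, 2):
        announced = hats[i + 1]          # rear person names the hat in front
        correct += announced == hats[i]  # rear person is right only by luck
        correct += 1                     # front person repeats it: always right
    return correct

random.seed(0)
results = [
    pairing_strategy([random.choice("BW") for _ in range(100)])
    for _ in range(1000)
]
worst_case = pairing_strategy(list("BW" * 50))  # alternating: no lucky matches
all_black = pairing_strategy(list("B" * 100))   # every rear person is lucky
print(min(results), worst_case, all_black)
```

The count never drops below 50 whatever the hats are, which is the sense in which the 50 is guaranteed rather than merely expected.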

Best Answer

Skip forward to below the "====" for the answer that you're probably looking for. Read the first part to find out why I put it after the "====".

To make sense of this, you need to think about probability spaces. And to do that right, you need more information about the meaning of the words in your question.

Case 1: There's a distribution $d$ from which the hats for each person are drawn independently randomly. The players guess black/white uniformly randomly. In this case, the expected number of correct guesses is 50 out of 100.

Case 2: There's a distribution as before, again with independent drawing of hat colors, but the players get to look at others' hats before guessing; they then guess black with a probability proportional to the number of black hats they see (out of the total 99 hats they see). (Roughly: if 95 others have black hats, and 4 have white hats, you guess "black" 95 out of 99 times, perhaps by rolling a die to generate your guess.) The expected number of correct guesses in this case is always at least 50, but can be far greater. If the distribution $d$ is highly skewed, this strategy wins big. Note that the players are still "guessing randomly" here ... just not uniformly randomly.
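To see how much Case 2 can gain, here is a minimal sketch that computes the expectation exactly, assuming each hat is black independently with probability $p$ (the parameter `p` and the function name are illustrative, not part of the original problem). Since a player's own hat is independent of the 99 hats they see, the per-player success probability works out to $p^2 + (1-p)^2$, which is never below $1/2$.

```python
from math import comb

def expected_correct_proportional(p, n=100):
    """Expected correct guesses when hats are i.i.d. black with probability p
    and each player guesses black with probability (blacks seen) / (n - 1)."""
    others = n - 1
    per_player = 0.0
    for seen_black in range(others + 1):
        # Probability that exactly `seen_black` of the other hats are black.
        prob_seen = (
            comb(others, seen_black)
            * p**seen_black
            * (1 - p) ** (others - seen_black)
        )
        guess_black = seen_black / others
        # The player's own hat is independent of the hats they see.
        per_player += prob_seen * (p * guess_black + (1 - p) * (1 - guess_black))
    return n * per_player

for p in (0.5, 0.7, 0.95):
    print(p, round(expected_correct_proportional(p), 2))
```

With a fair distribution the strategy does no better than uniform guessing; the gain appears only as $p$ moves away from $1/2$.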

Case 3: The hat-placer is an adversary, and has thought about strategies you might employ. The hat-placer carefully chooses a number $k$ of black hats and $100-k$ white hats, and then distributes these randomly among the players by picking uniformly randomly a permutation of the numbers 1...100. (Note that this still meets the condition "that was put on their head randomly"). The players guess uniformly randomly from "black" or "white", without observing the others' hats. The expected number of correct guesses is again 50.

Case 4: Same adversarial setup as in case 3, but the players use the 'Bayesian' approach of case 2. In this case, the adversary will presumably optimize, which turns out to mean setting $k = 50$, and the expected number of correct guesses is again roughly 50 (exactly $4900/99 \approx 49.5$, slightly under 50 because no player sees their own hat).
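The adversary's optimization in Case 4 can be worked out exactly. This is a sketch under the stated setup, with a hypothetical helper `expected_correct_fixed_k`: with $k$ black hats in total, a black-hatted player sees $k-1$ black hats among 99, and a white-hatted player sees $k$.

```python
from fractions import Fraction

def expected_correct_fixed_k(k, n=100):
    """Expected correct guesses with exactly k black hats among n players,
    each guessing black with probability (blacks seen) / (n - 1)."""
    others = n - 1
    # Each of the k black-hatted players sees k - 1 black hats and is
    # correct when guessing "black", with probability (k - 1) / others.
    black_players = k * Fraction(k - 1, others)
    # Each of the n - k white-hatted players sees k black hats and is
    # correct when guessing "white", with probability (others - k) / others.
    white_players = (n - k) * Fraction(others - k, others)
    return black_players + white_players

worst_k = min(range(101), key=expected_correct_fixed_k)
worst_value = expected_correct_fixed_k(worst_k)
print(worst_k, worst_value, float(worst_value))
```

The minimum is at $k = 50$, where the value is $4900/99 \approx 49.5$: 50 up to the small correction from each player not counting their own hat. At the extremes $k = 0$ and $k = 100$ every player is right.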

====

Anyhow, case 2 makes the point that saying what distribution is being used in each step of randomness in the problem is critical to assessing expected values. Just saying "randomly" doesn't guarantee uniform randomness. And "straight guesses" doesn't actually mean much of anything to me, although I'm guessing that to you it means "uniformly randomly chosen from 'black' and 'white'."

Let me ramble on a little further still, and formulate the problem a little differently.

You have a fixed but unknown list of 100 bits, $b_1, \ldots, b_{100}$, each either a $0$ or a $1$.

You generate another list of 100 bits, $c_i$, $i = 1, \ldots, 100$, chosen independently identically distributed from the uniform distribution on the set $\{0, 1\}$.

You ask "What is the expected number of $i$ for which $b_i = c_i$?"

The answer in this case is $50$, and does not depend on the initial bit sequence $b$. The proof is straightforward: the probability space consists of all possible $c$-sequences; there are $2^{100}$ of these, each equally probable.

If we look at the $i$th digit of each of these sequences, in half of them $c_i$ is zero; in the other half, $c_i = 1$. Hence the probability that $c_i$ equals $b_i$ is exactly $1/2$, and the expected value of the indicator of the event $c_i = b_i$ is $1/2$. By linearity of expectation, the expected number of matching bits is the sum of the expected number of matching first-bits, matching second-bits, and so on, hence is $100$ times $1/2$, or $50$.
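The claim that the answer does not depend on $b$ can be checked by brute force for a small case. The sketch below uses $n = 8$ as a stand-in for 100 (enumerating $2^{100}$ sequences is not feasible); the helper name `expected_matches` is an assumption for illustration.

```python
from fractions import Fraction
from itertools import product

def expected_matches(b):
    """Exact expected number of positions where b_i == c_i, with c drawn
    uniformly from all bit sequences of the same length as b."""
    n = len(b)
    total = sum(
        sum(bi == ci for bi, ci in zip(b, c))
        for c in product((0, 1), repeat=n)
    )
    return Fraction(total, 2**n)

n = 8  # small stand-in for 100
values = {expected_matches(b) for b in product((0, 1), repeat=n)}
print(values)
```

Every one of the $2^8$ fixed sequences $b$ gives the same expectation, $n/2$, including the all-zeros sequence that corresponds to "everyone is wearing black".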
