[Math] Most efficient strategy for guessing outcome of (fair) dice roll

diceinformation theoryprobability

I'm starting to learn about information theory and I'm a bit stuck on this one. here's what I have so far:

1 possible strategy is to simply ask 'did outcome 1 occur?' if yes then we have our answer, if no we ask again 'did outcome 2 occur' etc. for a dice, the maximum number of questions with this strategy would be 5, since if for 1 – 5 the answer is no, that must mean that the outcome 6 must be positive. Based on my calculation, the average number of questions using this strategy is 2.5 (sum from 1 to 5 of $Q/6$, where Q is number of questions and 1/6 is the probabilty of rolling any of the faces on the dice).

Another strategy I thought of would be to split the probabilties – i.e. is the outcome even? if yes, we could ask 'is it greater than 3?' and then we either have the outcome (6) or we ask 'is it 4?' and from here we have a definitive answer. Likewise for the case of the first answer being no i.e. the number is odd.

I struggled to calculate the average number of questions for this. My logic was we must ask at least 2 questions for the answer to be fully determined. So, we must ask 1 question and then the outcome of the second question is subject to probabilty. Hence the average number of questions is:
$1+(2/6)+(3/6)=1.83333…$

Is this right? Is my logic correct? Are there any other strategies worth looking at?

I'm really enjoying imformation theory and am really keen to learn more more more!

Best Answer

Each yes-no question asked can provide at most one bit of information. A fair dice roll has six equally probable outcomes, so $\log_26=2.585$ bits of entropy. Any strategy to correctly guess the dice roll cannot use fewer than this many questions on average.

A strategy that halves the search space with each new question as evenly as possible will take two questions to correctly guess for two rolls (say 1, 6) and three questions for the other four rolls, thus averaging $\frac83=2.667$ questions. This is optimal, since if three rolls only required two questions the average number of questions would be 2.5 – below the theoretical lower bound.

The strategy that asks whether the roll is 1, then 2, then 3 and so on requires 5 questions if the roll was 5 or 6, thus requiring $\frac{1+2+3+4+5+5}6=\frac{10}3=3.333$ questions on average; it is sub-optimal.