Number Guessing Game

game theoryprobability

I’m thinking of a number between 1 and 40 (inclusive). You get six
attempts to guess my number. After every guess, I’ll tell you either
“higher”, “lower”, or “correct.” If I ever respond “higher” three
times in a row or “lower” three times in a row, the game ends and you
automatically lose. If you correctly guess my number on one of your 6
guesses, you win.

Your first guess is 20, and I respond “lower” (meaning that my number
is lower than 20).

a. What should your second guess be in order to maximize your chances
of winning?

b. If I say higher after your second guess, what should
your third guess be?

c. If I say lower after your second guess, what should your third guess be?

d. Say we are playing Guess My Number where instead of my number being between 1 and 40, it is between 1 and
N (inclusive), and instead of 6 guesses, you get 8 guesses. What is
the largest value of N such that you have a strategy that is
guaranteed to win?

This seems like we should utilize binary search to some degree. My thoughts are, instead of guessing at the 50th percentile, we can guess at the 66th or 33rd percentile to make over/underestimates to reduce the probability of the same response 3 times in a row. So maybe my second guess would be 6 instead of 10 and third guess would be 14 if it asked for higher. Is this intuition reasonable?

Best Answer

The winning result sequences need to be mapped in lexicographical order (with “lower” $\lt$ “correct” $\lt$ “higher”) to the possible numbers. To solve a smaller example first: If you have four guesses and no two consecutive guesses may be the same, the winning result sequences are, in order: LC, LHLC, LHC, C, HLC, HLHC, HC. So you can differentiate between the numbers from $1$ to $7$, and you can read off the required guesses from the sequence by always guessing the number that has C in the current position: The first guess is $4$, and then for “lower” you need to guess $1$, then $3$ and then $2$ (and symmetrically opposite for “higher”).

In part a, the numbers $1$ to $19$ remain, and the winning sequences (without the inital “lower”) LC, LHLLC, LHLC, LHLHC, LHC, LHHLC, LHHC, C, HLLC, HLLHC, HLC, HLHLC, HLHC, HLHHC, HC, HHLLC, HHLC, HHLHC, HHC remain, in that order. So we have exactly the right number of sequences left to differentiate between the numbers left, and reading off the required guesses yields the decision tree linked to in Daniel Mathias’s comment under the question. This also yields the answers to parts b and c.

For part d, it would be rather cumbersome to list all the sequences by hand, so we can ask more generally: How many sequences of at most $n$ letters in {L, C, H} are there that contain a single C at the end and don’t contain three consecutive Ls or Hs? We can just drop the final C and equivalently ask for the number of strings with at most $n-1$ letters in {L, H} that don’t contain three consecutive Ls or Hs.

Let $a_n$ denote the number of such strings, and $b_n$ the number of them that don’t end in L. Then $a_n=2b_n-1$ (since the empty string ends in neither letter), and $b_n=b_{n-2}+b_{n-1}+1$ (since an admissible string not ending in L is either the empty string or can be produced by appending H or HH to a string not ending in H). Thus $b_n+1$ satisfies the recurrence of the Fibonacci numbers, and since $b_1=1$ and $b_2=2$, it follows that $b_n=F_{n+2}-1$ and thus $a_n=2F_{n+2}-3$.

So we have $a_6=2F_8-3=39$, in agreement with the fact that you were able to distinguish $39$ numbers with $6$ guesses (if the answer to your initial guess of $20$ had been “higher”, you would only have been able to distinguish $19$ of the $20$ remaining numbers from $21$ to $40$); and for $n=8$, we find that with $8$ guesses you could distinguish between $a_8=2F_{10}-3=107$ numbers.

While each unrestricted question allows you to distinguish twice as many numbers, each of these restricted questions only increases the number of distinguishable numbers by about a factor of $\phi=\frac{\sqrt5+1}2\approx1.618$, the golden ratio, so instead of one bit per unrestricted question you only gain about $\log_2\phi\approx0.694$ bits of information per restricted question.

Related Question