Probability – Average Number of Guesses to Guess a Number Between 1 and 1000?

probability, statistics

There is a game where you get to pick a number between 1 and 1000 and then the system will tell you if you guessed too high or too low. Using a binary search you can find the answer in a maximum of ten guesses (2^10 = 1024), but how would I go about calculating the average number of guesses required to guess the given number, assuming the user uses this binary chop method?

Best Answer

Let us take the more ideal case of guessing integers between $1$ and $2^{10}-1=1111111111_2$, writing in base $2$ from now on without explicitly writing the subscript $2$ each time. Then the algorithm for the guesses will be like this (a short Python sketch of the procedure follows the list):

  1. The initial guess is $1\overline 0$, where $\overline 0$ means "fill up with zeros until we have a ten-digit number".
  2. Write the $n$-th guess as $g_n=x_n1\overline 0$, where $x_n$ is a binary string of length $(n-1)$.
  3. If the game responds "higher", the next guess is $g_{n+1}=x_n11\overline 0$.
  4. If the game responds "lower", the next guess is $g_{n+1}=x_n01\overline 0$.
  5. If the game responds "correct", we are done after $n$ guesses.
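
For concreteness, here is a minimal sketch of this procedure in Python (the function name guess_count and the tally at the end are my own additions, not part of the game):

from collections import Counter

def guess_count(x, bits=10):
    # number of guesses the strategy above needs to find x,
    # for x in 1 .. 2**bits - 1
    prefix = ""                                # the binary string x_n built so far
    for n in range(1, bits + 1):
        # n-th guess: prefix, then a 1, padded with zeros to `bits` digits
        g = int(prefix + "1" + "0" * (bits - n), 2)
        if g == x:
            return n                           # "correct"
        prefix += "1" if x > g else "0"        # "higher" appends 1, "lower" appends 0

# tally how many numbers need exactly n guesses; 2**(n-1) numbers should need n
print(Counter(guess_count(x) for x in range(1, 2**10)))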

So we see that the game terminates whenever there are only zeros remaining to the right of $x_n1$. Then note that there are $2^{n-1}$ binary strings $x_n$ of length $(n-1)$, so exactly $2^{n-1}$ numbers are found after exactly $n$ guesses. For this setup the expected number of guesses will be, returning to writing in base $10$: $$ \frac{1}{2^{10}-1}\sum_{n=1}^{10}n\cdot 2^{n-1}=\frac{9217}{1023}\approx9.00978 $$

So the average case here is very close to the worst case of $10$ guesses. This could seem odd at first, but odd is also the clue here ... Half of the time it takes $10$ guesses to reveal that the randomly picked number $x$ is in fact odd. The numbers found within the first $9$ guesses all end in a zero, which in base $2$ means "divisible by $2$". One quarter of the time $x$ is divisible by $2$ but not by $4$, forcing the algorithm to make $9$ guesses before success. And so on.
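
The sum can be checked exactly, for instance with Python's fractions module (a quick verification, not part of the original argument):

from fractions import Fraction

# E[guesses] = (1 / (2**10 - 1)) * sum over n of n * 2**(n - 1)
E = Fraction(sum(n * 2 ** (n - 1) for n in range(1, 11)), 2 ** 10 - 1)
print(E, float(E))    # 9217/1023 9.009775...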


I wrote the following code in Python:

L = [1, 1000]   # sorted boundaries: the two endpoints plus every guess made so far
G = {}          # G[v] = number of guesses needed to arrive at v
for n in range(1, 11):
    i = 0
    while i < len(L) - 1:
        # guess the rounded average of the two enclosing boundaries
        guess = int(0.5 * (L[i] + L[i + 1] + 1))
        if guess not in L:
            L = L[:i + 1] + [guess] + L[i + 1:]   # record the guess as a new boundary
            i += 1                                # step past the value just inserted
            G[guess] = n                          # this number is found on guess n
        i += 1
S = sum(G.values())
# 1 and 1000 are boundaries from the start and never enter G;
# each of them takes 10 guesses, hence the two extra 10s
print(10 + S + 10)

which responded with $8987$. So the average for the $1$ to $1000$ case, assuming that we always guess the rounded average of the limits of the interval we know $x$ belongs to, must be $8987/1000=8.987$ guesses.
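
As an independent check, here is a minimal sketch that simulates that strategy directly for every possible target (the helper name guesses is my own):

def guesses(x, lo=1, hi=1000):
    # count guesses to find x, always guessing the rounded average
    # of the current interval's limits
    n = 0
    while True:
        n += 1
        g = (lo + hi + 1) // 2    # same rounding as int(0.5 * (lo + hi + 1))
        if g == x:
            return n
        if g < x:
            lo = g + 1
        else:
            hi = g - 1

total = sum(guesses(x) for x in range(1, 1001))
print(total, total / 1000)    # should print: 8987 8.987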
