[Math] Ulam’s problem – guessing a chosen number in a set

algorithmsdiscrete mathematicsrecurrence-relationsrecursive algorithms

I tried to solve the following problem, which I found in the book "Discrete Mathematics and Its Applications", by Kenneth Rosen (Problem 28 of the section 7.3 of the 6th Edition):

Suppose someone picks a number $x$ from a set of $n$
numbers. A second person tries to guess the number
by successively selecting subsets of the $n$ numbers and
asking the first person whether $x$ is in each set. The
first person answers either “yes” or “no.” When the first
person answers each query truthfully, we can find $x$
using $\log n$ queries by successively splitting the sets
used in each query in half. Ulam’s problem, proposed by
Stanislaw Ulam in 1976, asks for the number of queries
required to find $x$, supposing that the first person is allowed
to lie exactly once.

a) Show that by asking each question twice, given a number
$x$ and a set with $n$ elements, and asking one more
question when we find the lie, Ulam’s problem can be
solved using $2 \log n + 1$ queries.

b) Show that by dividing the initial set of $n$ elements into
four parts, each with $n/4$ elements, $1/4$ of the elements
can be eliminated using two queries. [Hint: Use two
queries, where each of the queries asks whether the
element is in the union of two of the subsets with $n/4$
elements and where one of the subsets of $n/4$ elements
is used in both queries.]

c) Show from part (b) that if $f (n)$ equals the number
of queries used to solve Ulam’s problem using the
method from part (b) and $n$ is divisible by 4, then
$f (n) = f(3n/4) + 2$.

d) Solve the recurrence relation in part (c) for $f (n)$.

e) Is the naive way to solve Ulam’s problem by asking
each question twice or the divide-and-conquer
method based on part (b) more efficient?

(Note: here, $\log n$ stands for the base-2 logarithm of $n$.)

I will post the whole solution as an answer to my own question, because I want to know if the solution is completely correct and consistent. Thank you in advance.

Best Answer

My solution to this problem:

a) Show that by asking each question twice, given a number $x$ and a set with $n$ elements, and asking one more question when we find the lie, Ulam’s problem can be solved using $2 \log n + 1$ queries.

I will use the simplifying assumption that $n$ is a power of two; that is, $n=2^k$ for some nonnegative integer $k$.

I will successively split the sets in half, and I will ask if $x$ is in each half. Because I split each set in half, for $n=2^k$ a total of $\log n$ splits will be made in the whole process. I know that the first person is allowed to lie exactly once. Then, if I make the same question twice for the same half, I know that the first person lied if he gives different answers to both questions. In this case, if I make one more question, I will know for sure in which half $x$ is. But I will only need to make this extra question at most once in the whole process, because there can be only one lie at most. Thus, in the worst case, I will make $2\log n + 1$ questions (two for each subset and one extra question when I find a lie).

b) Show that by dividing the initial set of $n$ elements into four parts, each with $n/4$ elements, $1/4$ of the elements can be eliminated using two queries. [Hint: Use two queries, where each of the queries asks whether the element is in the union of two of the subsets with $n/4$ elements and where one of the subsets of $n/4$ elements is used in both queries.]

Following the hint, suppose that I divide the initial sets into four subsets with $n/4$ elements: $A$, $B$, $C$ and $D$. Without loss of generality, I make two questions: (1) whether the element $x$ is in $A\cup B$ and (2) whether the element $x$ is in $A\cup C$. There are four possibilities for the answer to these questions:

If yes is answered to both questions, then $x$ must be in $A\cup B$ or $A\cup C$ (even if only one answer is truthful). Thus, I can be certain that $x$ is not in $D$. Therefore, $D$ can be eliminated using two queries.
If no is answered to both questions, then $x$ is not in $A\cup B$ or not in $A\cup C$. Thus, it is certainly not in $A$, so $A$ can be eliminated.
If yes is answered to question (1) and no is answered to question (2), with at most one lie, several possibilities arise. Analyzing the possibilities below, we conclude that $C$ can always be eliminated in this case:
1. If both answers are truthful, then $x$ must not be in $A\cup C$; thus, $A$ or $C$ could be eliminated;
2. If the first answer is a lie, then $x$ is not in $A\cup B$ and not in $A\cup C$; therefore, $A$, $B$ and $C$ could be eliminated;
3. If the second answer is a lie, then $x$ is in $A$; therefore, $B$, $C$ or $D$ could be eliminated.
If no is answered to question (1) and yes is answered to question (2), by an analogous reasoning, $B$ can always be eliminated in this case.

Therefore, it is always possible to eliminate one subset of $n/4$ elements by using two queries.

c) Show from part (b) that if $f (n)$ equals the number of queries used to solve Ulam’s problem using the method from part (b) and $n$ is divisible by 4, then $f (n) = f(3n/4) + 2$.

By part (b), if $n$ is divisible by four, I can solve this problem by asking 2 questions to eliminate $n/4$ elements of the initial set, and then ask $f(3n/4)$ questions for the rest of the $3n/4$ elements. Therefore:

$$f (n) = 2 + f(3n/4)$$

d) Solve the recurrence relation in part (c) for $f (n)$.

(Note: I edited out a previous solution that was incorrect, as pointed out in the comments.)

Here, we are assuming that $n$ is a power of ~~four~~ 4/3. Successively applying the recurrence relation:

$$\begin{align*} f(n)&=f(3n/4)+2\\ &=(f((3/4)^2n)+2)+2\\ &=f((3/4)^3n)+3\times 2\\ &=\cdots\\ &=f(1)+2k \end{align*}$$

$k$ is the number of times that $n$ is multiplied by $3/4$ (and therefore divided by $4/3$) until we reach the base case, $f(1)$. Therefore, $k=\log_{4/3} n$. Thus:

$$f(n)=f(1)+2\log_{4/3}n=2\log_{4/3}n$$

Here, $f(1)=0$, because, if the list has only one element, I don't need to ask any question.

e) Is the naive way to solve Ulam’s problem by asking each question twice or the divide-and-conquer method based on part (b) more efficient?

The naive way, given in item (a), uses $2 \log n + 1$ queries. The divide-and-conquer method uses $2\log_{4/3}n$ queries. Therefore, it seems that the first method is more efficient, because it reduces the original problem to half, whereas the second method reduces it to only $3/4$ of the original problem.

Best Answer

Related Solutions

Algorithms – Guess the Number Despite False Answer

Number of disjoint non-empty subsets [pigeonhole]

Related Question