Prove that this greedy algorithm is optimal or find a counterexample

algorithmsdecision treespythonrecursive algorithmstrees

We play a game where there are arbitrarily many squares in a line and they are numbered $0$, $1$, $2$, etc. We start at square $n >= 0$ and the objective is to get to square $0$ in as few jumps as possible, i.e. find a minimal path from square $n$ to square $0$, following the rules below.

First, let

  • $A = \frac{2}{3}(n-1)$
  • $B = \frac{2}{3}(n-2) + 1$
  • $C = 2n + 1$
  • $D = 2n + 2$

The rules are:

  1. If $n \equiv 0 \pmod 3$, then we can jump to any of: $C$ or $D$
  2. If $n \equiv 1 \pmod 3$, then we can jump to any of: $A$ or $C$ or $D$
  3. If $n \equiv 2 \pmod 3$, then we can jump to any of: $B$ or $C$ or $D$.

So, we always have the option to jump to square $C = 2n + 1$ or to square $D = 2n + 2$, but those get us further from square $0$. Additionally, sometimes we have to the option to jump to an earlier square ($A < n$ or $B < n$) and those get us closer to square $0$.

This suggests a greedy algorithm:

  1. If $n \equiv 1 \pmod 3$, then we jump to square $A = \frac{2}{3}(n-1)$
  2. If $n \equiv 2 \pmod 3$, then we jump to square $B = \frac{2}{3}(n-2) + 1$
  3. Otherwise, we BFS (breadth first search) try square $C = 2n + 1$ and $D = 2n + 2$

I have verified this greedy algorithm is optimal for starting $n < 500$, i.e. it finds a minimal path from square $n$ to square $0$ for starting $n < 500$.

I would like to proof that this greedy algorithm is optimal in general, i.e. for any starting $n$, or find a counterexample – a specific starting $n$ where this algorithm doesn't produce a minimal path.

See bottom for a python implementation of the algorithm.

In the tree diagrams below:

  • the green edges represent going from $n$ to $\frac 2 3 (n-1)$
  • the blue edges represent going from $n$ to $\frac 2 3 (n-2) + 1$
  • the orange edges represent going form $n$ to $2n+1$
  • the red edges represent going form $n$ to $2n+2$

Tree diagram for when we're at a square $n = 3k+1$ for some integer $k > 0$ s.t. $n \equiv 1 \pmod 3$:

enter image description here

Tree diagram for when we're at a square $n = 3k+2$ for some integer $k > 0$ s.t. $n \equiv 2 \pmod 3$:

enter image description here

What I want to proof is that

  • when $n \equiv 1 \pmod 3$, taking the green edge to $\frac 2 3 (n-1)$ is no worse than taking the orange or red edge
  • when $n \equiv 2 \pmod 3$, taking the blue edge to $\frac 2 3 (n-2) + 1$ is no worse than taking the orange or red edge

Implemented in python, this greedy algorithm looks like this:

from collections import deque

def getPathFromParents(n, parents):
    path = [0]
    i = 0
    while (i := parents[i]) is not None:
        path.append(i)
    return path

def probablyMinPathToSquare_greedy(n): # Assumes n >= 0
    # Greedy BFS search from square n to square 0

    # If
    # Case 0. If n % 3 == 0, then we can jump to: 2n+1 or 2n+2
    # Case 1. If n % 3 == 1, then we can jump to: (2/3) (n-1) or 2n+1 or 2n+2
    # Case 2. If n % 3 == 2, then we can jump to: (2/3) (n-2) + 1 or 2n+1 or 2n+2
    orig_n = n
    parents = {}
    q = deque([(n, None)])
    while True:
        n, parent = q.popleft()
        if n not in parents:
            parents[n] = parent
            if n == 0:
                break
            if n % 3 == 1:
                prev = 2*(n-1) // 3
                q.append((prev, n))
            elif n % 3 == 2:
                prev = (2*n - 1) // 3
                q.append((prev, n))
            else: # only jump to a later square if it is not possible to jump to an earlier square
                A = 2*n + 1
                B = A + 1
                q.append((A, n))
                q.append((B, n))
    
    return getPathFromParents(orig_n, parents)

By contrast, a non-greedy, exhaustive BFS search looks like this:

from collections import deque

def getPathFromParents(n, parents):
    path = [0]
    i = 0
    while (i := parents[i]) is not None:
        path.append(i)
    return path

def minPathToSquare_exhastive(n): # Assumes n >= 0
    # Exhaustive BFS search from square n to square 0
    # paths are searched in increasing length, so that the first path that is found is guaranteed to be minimal.

    # If
    # Case 0. If n % 3 == 0, then we can jump to: 2n+1 or 2n+2
    # Case 1. If n % 3 == 1, then we can jump to: (2/3) (n-1) or 2n+1 or 2n+2
    # Case 2. If n % 3 == 2, then we can jump to: (2/3) (n-2) + 1 or 2n+1 or 2n+2
    orig_n = n
    parents = {}
    q = deque([(n, None)])
    while True:
        n, parent = q.popleft()
        if n not in parents:
            parents[n] = parent
            if n == 0:
                break
            if n % 3 == 1:
                prev = 2*(n-1) // 3
                q.append((prev, n))
            elif n % 3 == 2:
                prev = (2*n - 1) // 3
                q.append((prev, n))
            A = 2*n + 1
            B = A + 1
            q.append((A, n))
            q.append((B, n))
    
    return getPathFromParents(orig_n, parents)

but this exhaustive search causes repl process died unexpectedly: signal: killed when I run it for starting $n = 54$.


Best Answer

To begin with, note that for all $x$ (formally and ignoring any game-specific restrictions on the argument) $$ \left. \begin{align} A(D(x))&=D(A(x))&B(C(x))&=C(B(x))\\ A(C(C(x)))&=D(C(A(x)))&A(C(D(x)))&=D(C(B(x)))\\ B(D(C(x)))&=C(D(A(x)))&B(D(D(x)))&=C(D(B(x))). \end{align}\quad\right\}\quad(\star) $$ For $n\in\Bbb N_0$, let $g(n)$ be the shortest length of a greedy path fro $n$ to $0$ and $f(n)$ the shortest length of any path from $n$ to $0$. Let $$ P(m)\quad\equiv\quad \forall n\in\Bbb N_0\colon (f(n)=m\to g(n)= m). $$ We show $$ \forall m\in\Bbb N\colon P(m)$$ by induction on $m$. The base case is trivial.

Suppose $P(k)$ is true for all $k<m$. To show $P(m)$, consider $n\in\Bbb N_0$ with $f(n)=m$ and let $$\tag1 n=a_0\to a_1\color{green}{\to} a_2\color{green}{\to}\cdots\color{green}{\to} a_m=0$$ be a path witnessing this. As trivially $g(n)\ge f(n)$, it suffices to show $g(n)\le m$. By $P(m-1)$, we may assume wlog. that $a_1\color{green}{\to}\cdots\color{green}{\to} a_m$ is greedy (as indicated by the suggestive green arrows in $(1)$). Then also $a_2\color{green}{\to}\cdots \color{green}{\to} a_m$ is greedy. We convert $(1)$ into a path of same length such that the first step $a_0\to a_1$ is allowed in a greedy path.

Case 1: $n\equiv 0\pmod 3$. Then we are done.

Case 2: $n\equiv 1\pmod 3$. Write $n=3k+1$. Then $a_1$ is one of $A(n)=2k$, $C(n)=6k+3$, $D(n)=6k+4$.

Case 2.1: $a_1=A(n)$. Then we are done.

Case 2.2: $a_1=D(n)$. Then $a_1\equiv 1\pmod 3$ and by greediness of the tail, $a_2=A(a_1)=A(D(n))=D(A(n))$. We obtain a path $$ n\color{green}{\stackrel A\to} A(n)\stackrel D\to a_2\color{green}{\to} \cdots \color{green}{\to} a_m=0.$$

Case 2.3: $a_1=C(n)$. Then $a_1\equiv 0\pmod 3$, so $a_2$ is one of $C(a_1)=12k+7$ and $D(a_1)=12k+8$.

Case 2.3.1: $a_2=C(a_1)$. Then $a_2\equiv 1\pmod 3$ and by greediness of the tail, $a_3=A(a_2)=A(C(C(n)))=D(C(A(n)))$. We obtain a path $$ n\color{green}{\stackrel A\to} A(n)\stackrel C\to C(A(n))\stackrel D\to a_3\color{green}{\to} \cdots \color{green}{\to} a_m.$$

Case 2.3.2: $a_2=D(a_1)$. Then $a_2\equiv 2\pmod 3$ and by greediness of the tail, $a_3=B(a_2)=B(D(C(n)))=C(D(A(n)))$. We obtain a path $$ n\color{green}{\stackrel A\to} A(n)\stackrel D\to D(A(n))\stackrel C\to a_3\color{green}{\to} \cdots \color{green}{\to} a_m.$$


Case 3: $n\equiv 2\pmod 3$. The argument is the same as in case 2 and with corresponding subcases, but we swap $A\leftrightarrow B$ and $C\leftrightarrow D$ and use the right column of $(\star)$ instead of the left one.

Case 3: $n\equiv 2\pmod 3$. Write $n=3k+2$. Then $a_1$ is one of $B(n)=2k+1$, $C(n)=6k+5$, $D(n)=6k+6$.

Case 3.1: $a_1=B(n)$. Then we are done.

Case 3.2: $a_1=C(n)$. Then $a_1\equiv 2\pmod 3$ and by greediness of the tail, $a_2=B(a_1)=B(C(n))=C(B(n))$. We obtain a path $$ n\color{green}{\stackrel B\to} B(n)\stackrel C\to a_2\color{green}{\to} \cdots \color{green}{\to} a_m=0.$$

Case 3.3: $a_1=D(n)$. Then $a_1\equiv 0\pmod 3$, so $a_2$ is one of $C(a_1)=12k+13$ and $D(a_1)=12k+14$.

Case 3.3.1: $a_2=C(a_1)$. Then $a_2\equiv 1\pmod 3$ and by greediness of the tail, $a_3=A(a_2)=A(C(D(n)))=D(C(B(n)))$. We obtain a path $$ n\color{green}{\stackrel B\to} B(n)\stackrel C\to C(B(n))\stackrel D\to a_3\color{green}{\to} \cdots \color{green}{\to} a_m.$$

Case 3.3.2: $a_2=D(a_1)$. Then $a_2\equiv 2\pmod 3$ and by greediness of the tail, $a_3=B(a_2)=B(D(D(n)))=C(D(B(n)))$. We obtain a path $$ n\color{green}{\stackrel B\to} B(n)\stackrel D\to D(B(n))\stackrel C\to a_3\color{green}{\to} \cdots \color{green}{\to} a_m.$$


So we may assume that the step $a_0\color{green}{\to} a_1$ in $(1)$ is greedy. As witnessed by the tail of $(1)$, we have $f(a_1)\le m-1$. By induction hypothesis, $g(a_1)\le m-1$. But then $a_0\color{green}{\to} a_1$ followed by a greedy path of length $\le m-1$ is a greedy path for $a_0$ of length $\le m$, as was to be shown.