[Math] How to a turing machine solve the element distinctness problem

turing-machines

I am reading an example in Sipser's famous book on the theory of computation. In this example, Sipser creates a turing machine M to solve the element distinctness problem. M is given a list of strings over $\Sigma = \{0, 1\}$ separated by #s. M must only accept if all the separated strings are different.

The book lists the following high-level implementation of such a machine.

  1. Place a mark on the top of the leftmost tape symbol. If that symbol was blank (_), accept. If that symbol was a #, then continue with the next stage. Otherwise, reject.
  2. Scan right to the next # and then place a mark over it. If no # is encountered before a blank symbol, accept since only one string is listed.
  3. By zig-zagging, compare the 2 strings to the right of the marked #s. Reject if they are equal.
  4. Move the rightmost of the two marks to the next # to the right. If no # symbol is encountered before a blank symbol, move the leftmost mark to the next # to its right and the rightmost mark to the # after that. This time, if no # is available for the rightmost mark, all the strings have been compared, so accept.

I am not sure how to picture this implementation. Why are we putting marks above #? Aren't we just comparing strings? Also, what is "zig-zagging"?

Best Answer

The basic idea is this. You have an input tape with $\#w_1\#w_2\#\ldots\#w_n$, where the $w_k$ are strings over $\{0,1\}$. You set a left marker, indicated in red, at the first $\#$:

$$\color{red}{\#}w_1\#w_2\#\ldots\#w_n$$

Then you scan to the right until you reach a $\#$ and set a right marker, indicated in blue:

$$\color{red}{\#}w_1\color{blue}{\#}w_2\#\ldots\#w_n$$

Now read the first character of $w_1$, mark it as read, and scan forward to compare it with the first character of $w_2$; if they’re equal, mark the first character of $w_2$ as read and scan back to find the second character of $w_1$. (You can find it, because you marked the first character as read.) Mark it read and scan forward to the second character of $w_2$; if they’re equal, mark the second character of $w_2$ read and scan back to the third character of $w_1$. Continue in this fashion until one of two things happens. If every character of $w_1$ matches the corresponding character of $w_2$, and $w_2$ has no extra characters, reject the string. Otherwise, go back and erase the reading marks that you made in $w_1$ and $w_2$ and move the right marker to the next $\#$:

$$\color{red}{\#}w_1\#w_2\color{blue}{\#}\ldots\#w_n$$

Repeat; this time you’ll be comparing $w_1$ with $w_3$. If you get a match, you’ll reject the input, and if not, you’ll advance the blue marker again. If $w_1$ is different from $w_2,\dots,w_n$, you’ll eventually reach a blank instead of a $\#$ when you try to advance the blue marker; at that point you go back and advance the red marker and re-initialize the blue marker to the next $\#$ to the right:

$$\#w_1\color{red}{\#}w_2\color{blue}{\#}w_3\#\ldots\#w_n$$

The next full cycle compares $w_2$ successively with $w_3,w_4,\dots,w_n$, rejecting the input if a match is found. If no match is found, you advance the red marker another step, reset the blue marker, and repeat.

Related Question