[Math] Is a Turing machine on an arbitrary (finite) alphabet equivalent to one on {0, 1}

computabilitycomputer sciencelogicturing-machines

Brief context:

I'm trying to understand why a Universal Turing Machine exists, on a tape with alphabet $\{0, 1\}$. I think I can see that a $3$-tape Turing machine can represent a Universal Turing Machine, using tape $1$ for the input corresponding to the Turing machine to simulate, tape $2$ corresponding to a buffer containing the current state, and tape $3$ being the actual computation on the input.

So this reduces to understanding why any function computable by a multi-tape Turing machine is also computable by a single-tape Turing machine.

Now, by adding in extra possible symbols to my alphabet, I can see how I could simulate a multi-tape Turing machine with a single-tape Turing machine with a larger alphabet (something like: Add in a delimiter '#' to separate the tapes, and symbols $\dot{0}, \dot{1}, \dot{B}$ to represent the current position on each tape). However, it's not clear to me how it would be possible to do this on a tape where we're only allowed the alphabet $\{0, 1\}$.

My question is:

Given a Turing machine $T$ on some (finite) alphabet $\Sigma \supseteq \{0, 1\}$ that calculates a partial function $\mathbb{N} \to \mathbb{N}$ (using the binary representation), does there exist a Turing machine $T'$ just on the alphabet $\{0, 1\}$ that calculates the same partial function?

(This then answers the last part of my thought process above.)

Best Answer

It is certainly possible for a Turing machine with alphabet $\{0,1\}$ to simulate a Turing machine with any finite alphabet. The idea is that if the larger alphabet has size less than $2^k$ then you can divide the tape into "chunks" of size $k$ and use each of these chunks to encode a single character from the larger alphabet. This requires a larger number of states in the new machine, because recognizing a single symbol from the larger alphabet requires a sequence of $k$ states in the new machine.

This is slightly analogous to using UTF-32 to store strings of arbitrary Unicode characters in a programming language that only supports 8-bit characters.