Constructing Aperiodic Substitution Sequences with Complexity Functions

co.combinatoricsinfinite-sequencesinteger-sequencessequences-and-series

The Fibonacci word is a binary sequence defined as follows.
We use a substitution rule $0\to 01$, $1\to 0$. Then, starting with the binary string $0$, apply the substitution rules successively. So we get $S_0=0$, $S_1=01$, $S_2=010$, $S_3=01001,\ldots$ and ultimately we get the following aperiodic sequence in $\{0,1\}^{\mathbb Z_+}$, known as the Fibonacci word:
$$
0100101001001\ldots
$$

For $S\in \{0,1\}^{\mathbb Z_+}$ and $n\in\mathbb Z_+$, let us define the complexity function $\sigma_n(S)$ as the number of distinct subwords of length $n$ in $S$. It is well-known that if $S$ is the Fibonacci word defined above then $\sigma_n(S)=n+1$. Here is my question:

Given a real number $x>1$. Construct a binary substitution rule $0\to A$, $1\to B$ (here $A$ and $B$ are finite binary strings) where, starting with $0$ and applying the substitution rule successively we get an infinite aperiodic sequence $S$ in $\{0,1\}^{\mathbb Z_+}$ so that $$\lim_{n\to\infty} \frac{\sigma_n(S)}{n}= x$$

Note that $A$ and $B$ must be chosen so that if $S_k$ is the binary string we get starting with $0$ and applying the substitution rule $k$ times, then $S_k$ is a prefix of $S_\ell$ for every $\ell>k$. That way the sequence $S_k$ stabilizes to an infinite binary sequence $S$ as $k$ tends to infinity.

Best Answer

There are a few things to say here! This is impossible for multiple reasons.

Firstly, there are only countably many substitutions, so there's no hope of achieving every possible $x$ as the "slope" of the complexity function.

More importantly, it's actually not possible for ANY subshift $S$ to satisfy $\sigma_n(S)/n \rightarrow x$ for $x \notin \mathbb{N}$. This is a theorem due to Heinis for the case $x \in (1,2)$ and Cassaigne in the general case.

I should note that this theorem is for two-sided subshifts, and in general I'd need to think about whether it's possible to do more with one-sided subshifts (I suspect not though). But for your question this distinction is moot; substitution sequences are recurrent and so the two-sided subshift generated (the so-called natural extension) has the same language. In other words, for substitutions, the only $x$ that can be achieved in your way are integers.

Another very cool result is due to Pansiot, who showed that for substitutions, the word complexity is "near" (i.e. bounded above and below by constant multiples) either $n$, $n^2$, $n \log n$, or $n \log \log n$.