[Math] Non-regular languages fulfilling the Pumping Lemma

automata-theoryco.combinatoricslo.logic

Some non-regular languages don't yield to the Pumping Lemma ($L_1=a^nb^mc^m$ should work). But now consider the set of non-regular languages L only over the alphabet {a}. (Like $L_2=a^{n^2}$ or whatnot). The Pumping Lemma applied to some L now can be expressed as a statement about arbitrarily long arithmetic sequences in L. The example thus immediately yields, $L_2$ "grows" too fast. Please give a simple example $L_3$ that does not yield to the Pumping Lemma (and needs Myhill-Nerode or such).
Does already the complement of $L_2$ work?

Best Answer

Initial observation. Let me start by explaining that the answer to your final question is that no, the complement of $L_2$ does not have the pumping property. To see this, suppose that it had a pumping number $p$, so that any string of length at least $p$ in the complement of $L_2$ could be pumped. That is, if $w$ has length at least $p$ and is in the language, then $w=xyz$, where $y$ is nontrivial with length at most $p$, and $xy^kz$ is also in the language for any $k$. Since all these words are just powers of the single generator $a$, we may think just about the exponents. In terms of the exponents, what the pumping property is saying here is that for any nonsquare natural number $m\geq p$, we may write $m=x+y+z$, where $1\leq y\leq p$, such that $x+ky+z$ is not a square for any $k$. Since there are only finitely many possible $y$ that arise, let $q$ be a common multiple of the $y$'s that arise in this way. Thus, if the complement of $L_2$ had the pumping property, we would get that any nonsquare number $m\geq p$ would have that $m+kq$ is also a nonsquare for any $k$.

But this is false, since we can let $m=q^2+1$, which is not a square, but $m+2q=q^2+1+2q=(q+1)^2$, which is a square. So the complement of $L_2$ does not have the pumping property.

General answer. The same idea, of finding a uniform pumping string, leads to the following theorem, which seems to be the answer to the general question you are asking. Let us say officially that a language $L$ has the pumping property, if there is a number $p$, such that for any string $w$ of length at least $p$, we can write $w=xyz$, where $y$ is nontrivial and $xy$ has length at most $p$, such that $w\in L\iff xy^kz\in L$ for any $k$.

Theorem. Suppose that $L$ is a language in an alphabet with a single letter $a$. Then the following are equivalent:

  1. $L$ is regular.
  2. $L$ has the pumping property.
  3. $L$ is eventually periodic. That is, $L=\{a^k\mid k\in A\}$, where $A\subset\mathbb{N}$ is an eventually periodic set of natural numbers.

Proof. The implication ($1\to 2$) holds irrespective of the single-letter assumption.

($2\to 3$) Suppose that $L$ has the pumping property. So there is a pumping number $p$, such that any sufficiently string $w$ of length at least $p$ can be expressed $w=xyz$, where $y$ is nontrivial of length at most $p$, such that $w\in L$ just in case $xy^kz\in L$ for all $k$. Since $y$ must be a power of $a$, and there are only finitely many possibilities, we may find a common multiple of these $y$, to find a single uniform nontrivial $y$, such that for every sufficiently long string $w$, we have $w\in L\iff wy^k\in L$ for all $k$. (I am also using that the language is commutative.)

This implies that the language $L$ is eventually periodic. That is, the exponents that arise for words in $L$ is an eventually periodic set of natural numbers.

($3\to 1$) This is easy, since one may design a finite-state automata consisting of a single long chain, looping back on itself once with the right period. QED

Related Question