[Math] Stronger Pumping Lemma for Context Free Languages

automatacomputer sciencecontext-free-grammarformal-languagespumping-lemma

Hi Math Stack Exchange,

Taking a class in automata theory, and having real trouble proving the following strong automata theorem for context free languages (from Sipser, Problem 2.37):

If L is a context-free language, there exists a number k such that any string $s ∈ L$, if $|s| ≥ k$, then we can write $s = uvxyz$ where

  1. for each $i ≥ 0$, $uv^ixy^iz \in L$
  2. $v \neq \varepsilon$ and $y \neq \varepsilon$
  3. $|vxy| ≤ k.$

I tried modifying the original proof of the he pumping lemma, attempting to exploit the structure of parse trees. First, put the grammar in Chompsky normal form, so the parse tree is a binary tree. If the word is big enough, there is a sequence

$$ V_0, V_1, \dots, V_n $$

Such that $V_i$ is the parent of $V_{i+1}$ in the parse tree. If $n$ is made big enough, then some variable repeats itself. Provided that we take left and right derivations, we obtain nonempty $v$ and $y$ to pump. I am left with the case where we only take left derivations, or only take right derivations. I assume there's some trick at this point, but I've given it a few hours thought and can't come up with anything. Any hints?

Best Answer

Found the trick to finish the proof. We just need to increasing the size of the pump. Let $v$ be the number of variables in a Chompsky normal form grammar. If $2^v$ is the ordinary pumping length. Consider a new length $2^{3v}$. In the parse tree of a long string (of minimal length) of length greater than $2^{3v}$, there must be a path of height $3v$. Thus some variable must occur 3 times. Then use the minimality of the parse tree to conclude that the stepwise retiteration can be pumped.

Related Question