[Math] Find a CFG for even string and at least one 1’s in the left part

automatacontext-free-grammarformal-languages

I am preparing an automata exam and cannot figure out the CFG for the given language.

Find a context-free grammar for L = {w should be even and at least one 1's in the left part of the string}

I haven't any idea about how I can solve this question because I haven't seen this kind of CFG problem.

Apart from the direct answer, If someone can explain the way how we need to approach these kinds of CFG questions.

Best Answer

Here's how I think about this problem.

  1. If you just wanted to make a CFG for "even length strings", you could do this:

    $$S \mapsto 0S0 | 0S1 | 1S0 | 1S1 | \epsilon$$

    This builds the string "from the middle": at every step of the derivation, you can choose to add a new character to the left and right of the string and recurse.

  2. When deriving a string using that CFG, the characters on the left of $S$ end up in the first half of the string.

    So $1S0$ and $1S1$ are the rules that put 1s in the first half of the string.

  3. This suggests a modification of our original CFG. A string will have a 1 in the first half if we use the rule $S\mapsto 1S0$ or $S\mapsto 1S1$ at some point during the derivation, otherwise it won't. Let's introduce a new symbol to keep track of whether we've used such a rule or not.

    In our new CFG, $S$ will represent "A string where we haven't seen a 1 in the first half so far" and $T$ will represent "A string where we have seen a 1 in the first half". Both symbols will use similar rules that force the strings to be even length.

    $$S \mapsto 0S0 | 0 S1 | 1\color{red} T0 | 1\color{red} T1$$ $$\color{red} T \mapsto 0\color{red} T0 | 0\color{red} T1 | 1\color{red} T0 | 1\color{red} T1 | \epsilon$$

    This gives us the desired behavior. Note that the derivation keeps looping through $S$ until finally a one is added on the left. Then the derivation goes to $T$, where it can stop $(T\mapsto \epsilon)$ or loop even further.


So the tricks I use are:

  1. Observe that the language has two parts: even length strings and having a 1 in the first half.

  2. Simplify the problem by trying to make a CFG for just one part. Maybe you can adapt it to solve the entire problem.

  3. Because "even length strings" seems more fundamental, try that part first.

  4. A grammar for even-length strings can be made in a couple of ways. There's the infix way, which we saw above. There's also the prefix way: $S\mapsto 00S | 01S | 11S | 10S | \epsilon$.

    The infix way is useful for most common tasks, such as making palindromes. The prefix way is useful for some other cases, but in general I'd try the infix way first.

  5. The infix approach to even-length strings is convenient because it exposes a way to get at the other property (having a one in the first half). You can note that some productions put a one in the first half and some don't.

  6. You can use symbols like $S$ and $T$ to build up the pieces of a complicated structure.

    As a simple example, imagine making a language for $1^a0^b1^a$, which is a string that starts and ends with the same number of ones, with any number of zeroes in the middle. To make a grammar for this language, you can have one symbol $S$ which manages the "matching ones" component, and another symbol $T$ which manages the "zeroes" component:

    $$S \mapsto 1S1 | T$$ $$T \mapsto 0T | \epsilon $$

Related Question