Prove that $L=\big\{\langle G\rangle \mid G\text{ is a CFG over }\Sigma=\{0,1\}\text{ and }1^* \cap L(G)\ne\varnothing\big\}$ is decidable.

context-free-grammardecidabilityformal-languages

How to prove that $L = \big\{\langle G\rangle \mid G \text { is a CFG over } \Sigma = \{0,1\} \text { and } 1^* \cap L(G) \ne \varnothing\big\}$ is decidable? I know I am supposed to prove that it is decidable or not $ L $ contains some string of the language $1^*$ and I know that CFLs are not closed under intersection and that $E_{TM}$ is not decidable but I am having trouble constructing the proof knowing this. If anyone could point me in the right direction, I would appreciate it. Thank you.

Best Answer

Hint. The intersection of a context-free language with a regular language is context-free.

Related Solutions

Context-Free Grammar – CFG for Language

Simple solution

$$S\to E|E1S\\ E \to 0E1E|1E0E|\varepsilon$$

It's clear that anything generated by $E$ will verify $|u|_0 = |u|_1$ and therefore that anything generated by $S$ will verify $|u|_0 \le |u|_1$.

Claim A: $u\in \mathcal L(E)\iff |u|_0 = |u|_1$

Proof A: By induction on $|u|$.

Since $u\in \mathcal L(E)\iff \overline{u}\in \mathcal L(E)$ (where $\overline{u}$ is $u$ with $0$ and $1$ swapped), we can assume WLOG that $u=1v$. Take $w_1$ the greatest prefix of $v$ so that $|w_1|_0=|w_1|_1$. It exists because $\varepsilon$ is one such prefix. Write $v=w_1w_2$ where $w_2$ is just whatever remains. Since we took $w_1$ the greatest with the property, any prefix of $w_2$ has more $0$s than $1s$. In particular, $w_2$ starts with a $0$. So $w_2=0h$. $w_1$ and $h$ both have as many $0$s as $1$s so by the induction hypothesis, $E\to^*w_1$ and $E\to^*h$. So $E\to0E1E\to^*0w_11h=u$ and we're done.

Claim B: $u\in\mathcal L(S)\iff |u|_0\le |u|_1$

Proof B: Left to the reader. One direction come directly from claim A and the other is by induction on $|u|_1-|u|_0$ (and by taking the greatest prefix for which $|u|_0= |u|_1$ to find the part that comes from $E$).

$\Sigma:= \{0,1\}$

If you want to have more $1$s that $0$s, you just have to make sure that whenever you add a $1$ you add a $0$.

This is done with $I$ that means "add one more $1$ than $0$s" (except that it only generates words for which this is false for all their prefixes). Similarly $O$ means "add one more $0$ than $1$s". $U$ therefore means "bounce upwards and come back to where you were" and $D$ "bounce downwards and come back to where you were". $E$ means "bounce upwards or downwards and come back to where you were as many times as you want". $T$ means "(Go one step up or bounce upwards and come back to where you were) as many times as you like". With this interpretation, it feels plausible that $S$ generates what we want because either you take $E$ and it means that you'll have equality, or you take $E1T$ and you'll get strict inequality because once you're done with $E$, you go up once and then $T$ doesn't let you go down so you'll be strictly above the $|v|_1-|v|_0=0$ line.

Here is an example of how $|v|_1-|v|_0$ can evolve with the prefix $v\le u$.

111000000110100111111000111001110                                                              

                               /\                                       
  /\                /\    /\  /                                              
 /  \              /  \  /  \/                                               
/____\____________/____\/_______________________________       
      \          /                                                        
       \  /\/\  /                                                         
        \/    \/  
1I----0I----------1O----11O--111O
U-----D-----------U-----1U---11U-
E-----------------------1T-------

I'll remove $U$ and $D$ from the grammar and will prove it for this one but it should be clear that those two grammars are equivalent.

Claim 1: Define $H(u)$:$|u|_0 > |u|_1$ and for any strict prefix $v<u$, $|u|_0 \le |u|_1$. Then for any $u\in\Sigma^*$,$u\in \mathcal L(O) \iff H(u)$. Note that $H(u)$ implies $H'(u)$:"$|u|_0=|u|_1+1$" and that the last letter of $u$ is $0$.

Proof 1:

"$\Rightarrow$": Let $P(n)$: "If $O\to^m u\in \Sigma^*$ for some $m\le n$, then H(u)". We prove $P(n)$ by induction on $n\ge 1$.
- If $O\to^1 u\in \Sigma^*$, then the production must be $O\to 0$ so that $u=0$ which verifies the properties.
- If $0\to^{n+1} u \in \Sigma^*$ for $n>0$, then there is $v\in (\Sigma\sqcup \Gamma)^*$ so that $I\to v \to^n u$. The first production can only be $O\to 1OO$ so that $v=1OO$. By the fundamental lemma, $u$ can be written as $u=u_1u_2u_3$ so that $1\to^{n_1} u_1$, $O\to^{n_2} u_2$, $O\to^{n_3} u_3$, $n=n_1+n_2+n_3$. Then $H(u)$ follows from $H(u_2)$ and $H(u_3)$ (which are given by the induction hypothesis because $\forall i, n_i \le n$) and $u=0u_2u_3$.
"$\Leftarrow$": Let $P(n)$: "For any $u\in \Sigma^*$ so that $H(u)$, $u\in \mathcal L(O)$". We prove $P(n)$ by induction on $n$.
- $P(0)$ is vacuously true.
- Suppose $P(n)$. Let $u\in \Sigma^*$ so that $|u|=n+1$ and $H(u)$. Then $u=av$ for some $a\in \Sigma$ and $v\in\Sigma^*$.
- If $a=0$, then $a<u$ would lead to $1=|0|_0\le |0|_1=0$. So we have $0=a=u$ the derivation $O\to u$ works.
- If $a=1$, then we take $w_1$ to be the shortest prefix of $v$ so that $H(w_1)$ and we take $w_2$ so that $u=1w_1w_2$ holds. Since $H(u)$ and $H(w_1)$, we also have $H'(u)$ and $H'(w_1)$. This gives us $H'(w_2)$ and so $|w_2|_0>|w_2|_1$. Take $p<w_2$. Then $1w_1p<u$ so that $|w_1|_1+1+|p|_0=|w_1|_0+|p|_0=|1w_1p|_0\le |1w_1p|_1=1+|w_1|_1+|p|_1$ and we indeed have $|p|_0\le |p|_1$.

Claim 2: Define $\overline{H}(u)$:$|u|_1 > |u|_0$ and for any strict prefix $v<u$, $|u|_1 \le |u|_0$. Then for any $u\in\Sigma^*$,$u\in \mathcal L(I) \iff \overline{H}(u)$. Note that $\overline{H}(u)$ implies $\overline{H}'(u)$:"$|u|_1=|u|_0+1$" and that the last letter of $u$ is $1$.

Proof 2: Similar to proof 1.

Claim 3: $u\in \mathcal L(T)\iff$ for any prefix $v\le u$, $|v|_1 \ge |v|_0$.

Proof 3:

"$\Rightarrow$": Follows from claim 1 by induction on the size of the derivation.
"$\Leftarrow$": Take $u\in\Sigma^*$ so that for any prefix $v\le u$, $|v|_1 \ge |v|_0$. First note that the first letter of $u$ must be $1$ (if $u$ is not $\varepsilon$). We will prove it by induction on $|u|$.
- If $u=\varepsilon$, we have $T\to u$.
- If it exists, take $v$ to be the smallest prefix of $u$ so that $|v|_0=|v|_1$. Then $v=1v'$. Since we took the smallest prefix, we have $H(v')$ and we therefore have $O\to^*v'$ by claim 1. Write $u=vw$ for some $w$. Then by the induction hypothesis, $T\to^* w$ and then $T\to^*1OT\to^*1v'w=u$.
- If it doesn't exist, write $u=1v$. The fact that the prefix doesn't exist implies that $v$ also verifies the property. By the induction hypothesis, we therefore have $T\to^*v$. And we conclude that $T\to 1T\to^*1v=u$.

Claim 4: $u\in\mathcal L(E) \iff |u|_0=|u|_1$

Proof 4: Left to the reader. One implication follows directly from claim 1 and 2 and the other implication is a case analysis on the first letter.

Claim 5: $u\in \mathcal L(S)\iff |u|_0\le |u|_1$

Proof 5: Left to the reader. One implication follows directly from claims 3 and 4 and the other one has two cases: $|u|_0=|u|_1$ or not, and if not $u$ is factored into $u_11u_2$ where $u_1$ is the greatest prefix so that $|u_1|_0=|u_1|_1$.

Bonus claim: The grammar is non-ambiguous.

Bonus proof: Left to the reader. The idea is that the languages generated by $E$ and $E1T$ will be disjoint so you only have to check that each part is non-ambiguous separately. The part of the grammar used by $E$ is $LL(1)$ so it's non-ambiguous. Then for a word $u$, suppose you can write it as $e_11t_1=u=e_21t_2$ with $E\to^*e_i$ and $T\to^*t_i$. $e_i$ is the longest prefix $v<u=e_i1t_i$ so that $|v|_0=|v|_1$ because $|e_i|_0=|e_i|_1$ and for each prefix $w<t_i$, $|w|_0\le|w|1$. This implies that $e_1=e_2$ and therefore that $t_1=t_2$. By a previous argument, there is only one derivation tree for $e_1$ so all that remains to be checked is that there is only one for $t_1$, or in other words that the part of the grammar used by $T$ is non-ambiguous. It is because you can only take $T\to1OT$ if you're going to come back to the level you are at and $T\to1T$ if you're not.

[Math] For $\Sigma = \{ 0,1 \}$, $A$ has strings which contain a $1$ in their middle third, and a $B$ which contain two $1$’s in their middle third.

18 April 2022: The answer given for $A$ was absurd, and I’ve deleted it.

9 May 2022: I’ve now added a correct answer in the blockquote below.

Consider the context-free grammar with initial symbol $S$ and the following productions:

$$\begin{align*} S&\to XLX\mid XRX\\ L&\to XLXX\mid C\mid 1\\ R&\to XXRX\mid C\mid 1\\ C&\to XCX\mid 1\\ X&\to 0\mid 1 \end{align*}$$

Show that the derivations that begin with the production $S\to XLX$ generate $$\{u1v\in\Sigma^*:|u|\le|v|<2|u|\}\,,\tag{1}$$ and those that begin with the production $S\to XRX$ generate $$\{u1v\in\Sigma^*:|v|\le|v|<2|v|\}\,.\tag{2}$$

Show that $(1)$ is the set of strings $w=u1v\in\Sigma^*$ such that the indicated $1$ is in the middle third of $w$, and $|u|\le|v|$, and $(2)$ is the set of strings $w=u1v\in\Sigma^*$ such that the indicated $1$ is in the middle third of $w$, and $|v|\le|u|$. Clearly $A$ is the union of these two sets.

To show that $B$ is not context-free, use the pumping lemma with the word $s=0^{p+2}10^p10^{p+2}$, where $p$ is the pumping length. You’ll have $s=uvwxy$, where $|vwx|\le p$, $|vx|\ge 1$, and $uv^kwx^ky\in B$ for all $k\ge 0$, and you’ll have to consider several cases according to the possible decompositions of $s$. Note that since $|vwx|\le p$, at most two of the strings of $0$s can be pumped.

Best Answer

Related Solutions

Context-Free Grammar – CFG for Language

[Math] For $\Sigma = \{ 0,1 \}$, $A$ has strings which contain a $1$ in their middle third, and a $B$ which contain two $1$’s in their middle third.

Related Question