I need to identify the set of all possible strings over $\Sigma = \{a, b\}$ that do not contain the substring $aaa$.
I have becoming more familiar with regular-language and languages explored in Automata Theory, and have been defining a particular language through regular-expressions.
I am having a hard time trying to find the set of strings of $\Sigma^*$ (where $\Sigma = \{a, b\}$) that do not contain the substring aaa.
Any advice on how to generate the regular-expression that will define this language would be helpful.
Best Answer
We can describe the set of valid strings as the strings built from the alphabet $V=\{a,b\}$ which contain at most one or two consecutive $a$'s.
$$ $$
In order to do so all we need to know is the geometric series expansion since the $star$ operator \begin{align*} a^*=\left(\varepsilon|a|a^2|a^3|\cdots\right)\qquad\text{ translates to }\qquad 1+a+a^2+a^3+\cdots=\frac{1}{1-a} \end{align*}
Accordingly $a^+=aa^*$ translates to $\frac{a}{1-a}$ and alternatives like $(\varepsilon|a|aa)$ can be written as $1+a+a^2$.
So, out of $2^5=32$ binary words of length $5$ there are $8$ invalid words marked blue in the table below.
\begin{array}{cccc} \color{blue}{aaaaa}\qquad&ab\color{blue}{aaa}\qquad&ba\color{blue}{aaa}\qquad&bb\color{blue}{aaa}\\ \color{blue}{aaaa}b\qquad&abaab\qquad&b\color{blue}{aaa}b\qquad&bbaab\\ \color{blue}{aaa}ba\qquad&ababa\qquad&baaba\qquad&bbaba\\ \color{blue}{aaa}bb\qquad&ababb\qquad&baabb\qquad&bbabb\\ aabaa\qquad&abbaa\qquad&babaa\qquad&bbbaa\\ aabab\qquad&abbab\qquad&babab\qquad&bbbab\\ aabba\qquad&abbba\qquad&babba\qquad&bbbba\\ aabbb\qquad&abbbb\qquad&babbb\qquad&bbbbb\\ \end{array}