Search code examples
regexregular-languageautomata

Simplify Regular Expression, Automata Theory


I recently needed to figure out the regular expression for the language {w | |w| is odd and w starts and ends with symbol b} over the alphabet {a,b}.

I figured out a solution being

b(ab+bb+aaab+aabb+abab+abbb+bbbb+bbab+babb+baab)*

The solution is very long, so I was hoping someone could tell me a way this can be simplified,


Solution

  • You could try b|(b(a|b)((a|b)(a|b))*b) The language you describe can produce just b, and that's captured by the first branch, and this is the only size 1 string it can produce. Then it can produce size 3,5,7,... strings. Those are captured by the second branch. The b at the start and end are self-explanatory. And those account for 2 characters. The middle bit is the classic language {w | |w| is odd} over {a,b}.

    A more powerful syntax would allow something like b((a|b)((a|b)^2)*b)?, which is more compact, but equivalent.