Search code examples
context-free-grammarregular-language

Computer Theory: CFG and Regular Languages


I am in the process of learning computer theory from a book by Daniel I.A. Cohen. My first problem is that I don't fully understand the words being generated by a regular expression.

Let's take this for example: (aaa+b)*

My understanding is that this will generate either:

  • No words - because the entire expression is followed by *, which is 0 or more occurrences.
  • Words of this form: aaab aaabaaab aaabaaabaaab, i.e, the same expression repeated infinite times.

Please advise if my thinking on this is correct and help me understand this better if I'm off track.

Next, in general, for a given regular expression, how do I go about converting it to a CFG? Is there a formula or some kind of standard practice to do so? You may use the above regular expression or any expression for that matter as an example.

Thanks in advance!

Based on what has been discussed in the answer and comments to that answer, I assume the FA looks something like this:

FA


Solution

  • Regular expression is a vast topic. Not everything can be explained in one post. I suggest you read about regex from Regular-Expressions.info

    Here I will explain what your patterns will match in context of the quantifiers you used.

    Quantifiers:

    • + will match one or more patterns.

    • * will match 0 or more.

    Regex: (aaa+b)*

    Explanation:

    • Your first point is correct that it will match nothing as * quantifier is used.

    • However your second point is incomplete. As you can see + is after a. It will be applied to only last a and it means one or more a. So following patterns will also be a match.

    • aaaab: notice that there are first three a and then one a which matches the repetition caused by +

    • aaaaaabaab: Here the first aaaaaab will be submatch of the whole match.

    You can better understand visually here.