I am in the process of learning computer theory from a book by Daniel I.A. Cohen. My first problem is that I don't fully understand the words being generated by a regular expression.
Let's take this for example: (aaa+b)*
My understanding is that this will generate either:
aaab
aaabaaab
aaabaaabaaab
, i.e, the same expression repeated infinite times.Please advise if my thinking on this is correct and help me understand this better if I'm off track.
Next, in general, for a given regular expression, how do I go about converting it to a CFG? Is there a formula or some kind of standard practice to do so? You may use the above regular expression or any expression for that matter as an example.
Thanks in advance!
Based on what has been discussed in the answer and comments to that answer, I assume the FA looks something like this:
Regular expression is a vast topic. Not everything can be explained in one post. I suggest you read about regex from Regular-Expressions.info
Here I will explain what your patterns will match in context of the quantifiers
you used.
Quantifiers:
+
will match one or more patterns.
*
will match 0 or more.
Regex: (aaa+b)*
Explanation:
Your first point is correct that it will match nothing
as *
quantifier is used.
However your second point is incomplete.
As you can see +
is after a
. It will be applied to only last a
and it means one or more a
.
So following patterns will also be a match.
aaaab
: notice that there are first three a
and then one a
which matches the repetition caused by +
aaaaaabaab
: Here the first aaaaaab
will be submatch of the whole match.