I was looking at the question posed in this stackoverflow link (Regular expression for odd number of a's) for which it is asked to find the regular expression for strings that have odd number of a
over Σ = {a,b}
.
The answer given by the top comment which works is b*(ab*ab*)*ab*
.
I am quite confused - a
was placed just before the last b*
, does this ordering actually matter? Why can't it be b*a(ab*ab*)*b*
instead (where a
is placed after the first b*
), or any other permutation of it?
Another thing I am confused about is why it is (ab*ab*)*
and not (b*ab*ab*)*
. Isn't b*ab*ab*
the more accurate definition of 'having exactly 2 a
'?
Why can't it be
b*a(ab*ab*)*b*
instead?
b*a(ab*ab*)*b*
does not work because it would require the string to have two consecutive a
s before the first non-leading b
, wouldn't it? For example, abaa
would not be matched by your proposed regex when it should. Use the regex debugger on a site like Regex101 to see this for yourself.
On the other hand, moving the whole ab*
part to the start (b*ab*(ab*ab*)*
) works as well.
why it is
(ab*ab*)*
and not(b*ab*ab*)*
?
(b*ab*ab*)*
does work, but the first b*
is quite redundant because whatever b
there is left, will be matched by the last b*
in the group. There is also a b*
before the group, which causes the b*
to not be able to match anything, hence it is redundant.