I have this regular expression:
\ba\.?b\.?c\.?\b( something)?
that matches
I use it twice, in order of priority. First I add ^ at the beginning and $ at the end, because I'd like to find strings that consist of exactly one of the cases above. If nothing is found, I remove these anchors and accept strings like
The problem is in the first case with a.b.c. (with the trailing dot), where the \b conflicts with the $.
So if I use
^\ba\.?b\.?c\.?\b( something)?$
the plain a.b.c. is not matched: the optional part in parentheses is skipped, and the \b next to the $ behaves in a way I cannot understand. On the other hand, a.b.c (without the final dot) does match.
If I replace the second \b with \W, everything works, but I'm not sure whether I would then match other unwanted strings. Any ideas on how I can solve this with a single regular expression?
I'm using Python, in case that is relevant.
The problem simply comes from the meaning of \b (see source). The part \.\b$ will never match anything, because \b requires a word character on exactly one side of the position, and the position between a dot and the end of the string has none, so it is not a word boundary.
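To see this behavior concretely, here is a small sketch in Python using the pattern from the question. The helper is_word_boundary is a hypothetical name introduced only for illustration; it restates the rule that \b needs a word character on exactly one side of the position.

```python
import re

# The anchored pattern from the question.
ANCHORED = re.compile(r"^\ba\.?b\.?c\.?\b( something)?$")


def is_word_boundary(text, pos):
    """Illustrative helper (not part of `re`): True when exactly one
    side of `pos` is a word character; string edges count as non-word."""
    before = pos > 0 and re.match(r"\w", text[pos - 1]) is not None
    after = pos < len(text) and re.match(r"\w", text[pos]) is not None
    return before != after


# "a.b.c" ends in a word character, so the final \b succeeds.
matches_without_dot = ANCHORED.search("a.b.c") is not None
# "a.b.c." ends in a dot, so there is no boundary before the end.
matches_with_dot = ANCHORED.search("a.b.c.") is not None

boundary_after_c = is_word_boundary("a.b.c", 5)     # between "c" and the end
boundary_after_dot = is_word_boundary("a.b.c.", 6)  # between "." and the end

# The fragment \.\b$ on its own cannot match either.
dot_boundary_end = re.search(r"\.\b$", "a.b.c.") is not None

print(matches_without_dot, matches_with_dot)  # True False
print(boundary_after_c, boundary_after_dot)   # True False
print(dot_boundary_end)                       # False
```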
You should try:
^\ba\.?b\.?c\.?(?:\b|$)
instead.
With the "something" part, it'd give:
^\ba\.?b\.?c\.?(?:\b|$)( something)?$
(there may be some room for improvement here, but it should work)
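A quick check of this pattern in Python, as a sketch. The sample strings are assumptions chosen for illustration, since the question's own list of cases is not shown here. One case that may matter is the dotted form followed by " something": after the final dot there is neither a word boundary nor the end of the string, so the pattern above rejects it. The second pattern, which swaps (?:\b|$) for the negative lookahead (?!\w), is not from the answer; it is one possible variant, assuming that case should match too.

```python
import re

# Pattern suggested in the answer.
SUGGESTED = re.compile(r"^\ba\.?b\.?c\.?(?:\b|$)( something)?$")

# Hypothetical variant (an assumption, not from the answer):
# "not followed by a word character" instead of "\b or end of string".
LOOKAHEAD = re.compile(r"^\ba\.?b\.?c\.?(?!\w)( something)?$")

# Illustrative inputs; the last one should be rejected by both patterns.
samples = [
    "a.b.c",
    "a.b.c.",
    "abc",
    "a.b.c something",
    "a.b.c. something",
    "a.b.cd",
]

suggested_hits = [s for s in samples if SUGGESTED.search(s)]
lookahead_hits = [s for s in samples if LOOKAHEAD.search(s)]

print(suggested_hits)
# ['a.b.c', 'a.b.c.', 'abc', 'a.b.c something']
print(lookahead_hits)
# ['a.b.c', 'a.b.c.', 'abc', 'a.b.c something', 'a.b.c. something']
```

The lookahead consumes no characters, so it behaves like \b after a letter and like $ after a trailing dot, while also allowing a space to follow the dot.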