I need a regular expression to match `a`, `b` or `a;b`.
I cannot write `a|b|a;b` because `a` and `b` contain named groups, and if I try to do this I get an exception: `redefinition of group name 'a' as group 8; was group 3 at position 60`.
`a;?b` does not work either because `ab` must not be matched.
How would you solve this?
Is this possible with the `re` library?
I have heard there is also a library called `pyparsing`. Would that be better suited for this problem?
Background: This is a follow-up question to this one. Because it does not seem to be possible to pass color codes through urwid or curses, I am trying to decode the color codes I get from git so that urwid can re-encode these colors.
To avoid problems with copy & paste I am leaving out the leading control character in the following regular expressions:
Working regex, except that it does not match `[1m` (bold), which is used in a test program:
    reo_color_code = re.compile(
        r'\['
        r'((?P<series>[01]);)?'
        r'((?P<fgbg>[34])(?P<color>[0-7]))?'
        r'm'
    )
Not compiling regex:
    reo_color_code = re.compile(
        r'\['
        r'('
            r'((?P<series>[01]))'
        r'|'
            r'((?P<fgbg>[34])(?P<color>[0-7]))'
        r'|'
            r'((?P<series>[01]));((?P<fgbg>[34])(?P<color>[0-7]))'
        r')'
        r'm'
    )
Throws the exception
re.error: redefinition of group name 'series' as group 8; was group 3 at position 60
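One possible way to keep a single pattern without repeating any group name is a conditional group, `(?(series);)`, which requires the `;` only when `series` has matched. The following is a minimal sketch under that assumption, not a tested drop-in replacement; the helper `parse_code` is a hypothetical name used only for illustration, and the leading control character is left out as in the patterns above.

```python
import re

# Sketch: each named group appears exactly once.
reo_color_code = re.compile(
    r'\['
    r'(?=\d)'                    # at least one part must be present (rejects '[m')
    r'(?P<series>[01])?'         # optional series
    r'(?:'
        r'(?(series);)'          # ';' is required only if series matched
        r'(?P<fgbg>[34])(?P<color>[0-7])'
    r')?'                        # optional fgbg/color pair
    r'm'
)

def parse_code(s):
    """Hypothetical helper: return (series, fgbg, color) or None if s is not a code."""
    m = reo_color_code.fullmatch(s)
    if m is None:
        return None
    return m.group('series'), m.group('fgbg'), m.group('color')

print(parse_code('[1m'))
print(parse_code('[37m'))
print(parse_code('[1;37m'))
print(parse_code('[137m'))
```

With this pattern `[1m`, `[37m` and `[1;37m` should be accepted, while `[137m` (the `ab` case) and `[m` should be rejected.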
In this case I wouldn't try to build a single regex to solve the entire problem. Instead, I'd implement a function like the following (still using `re`, but at different levels):
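    import re

    def get_info(s):
        # Expects the code without the leading control character, e.g. '[1;37m'.
        if s.startswith('[') and s.endswith('m'):
            p = s[1:-1]
            if ';' in p:
                # Both parts present: series ';' fgbg color
                m = re.match('^([01]);([34])([0-7])$', p)
            else:
                # Exactly one part: series alone, or fgbg and color alone
                m = re.match('^([01])$|^([34])([0-7])$', p)
            if m:
                # Groups that did not take part in the match are None
                return tuple(m.groups())
        return None, None, None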
You can use it like:
    >>> serie, fgbg, color = get_info('[1;37m')
    >>> serie, fgbg, color
    ('1', '3', '7')
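To apply a per-code function like this to real git output, the text first has to be cut into escape sequences and the plain text between them. The sketch below is one way this might be done; `split_colored` and `reo_escape` are hypothetical names, and it assumes the omitted control character is ESC (`\x1b`) and that only `[...m` sequences occur.

```python
import re

# Assumption: sequences look like ESC '[' digits/semicolons 'm'.
# The captured group leaves out ESC, so it has the form '[1;37m'.
reo_escape = re.compile(r'\x1b(\[[0-9;]*m)')

def split_colored(text):
    """Yield (code, chunk) pairs; code is None for text before the first sequence."""
    pos = 0
    code = None
    for m in reo_escape.finditer(text):
        if m.start() > pos:
            # Plain text between the previous sequence and this one
            yield code, text[pos:m.start()]
        code = m.group(1)
        pos = m.end()
    if pos < len(text):
        # Trailing text after the last sequence
        yield code, text[pos:]

for code, chunk in split_colored('\x1b[1;31mred\x1b[m plain'):
    print(repr(code), repr(chunk))
```

Each `code` yielded here is in the form the function above expects, so it can be decoded one sequence at a time.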
PS: Didn't do too many tests. Hope it helps.