
Multiple capturing groups within non-capturing group using Python regexes


I have the following code using multiple capturing groups within a non-capturing group:

>>> import re
>>> regex = r'(?:a ([ac]+)|b ([bd]+))'
>>> re.match(regex, 'a caca').groups()
('caca', None)
>>> re.match(regex, 'b bdbd').groups()
(None, 'bdbd')

How can I change the code so that it outputs either ('caca',) or ('bdbd',), i.e. a single group regardless of which alternative matched?


Solution

  • You are close.

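    To always get the capture as group 1, you can use a lookahead inside each alternative to check what follows, and then a single separate capturing group to do the capturing. Note that the lookahead only asserts that at least one character of the class follows; (.*) then captures the rest of the line, so this is equivalent to the original only when the input contains nothing after the run of allowed characters: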

    (?:a (?=[ac]+)|b (?=[bd]+))(.*)
    

    Or in Python 3:

    >>> import re
    >>> regex = r'(?:a (?=[ac]+)|b (?=[bd]+))(.*)'
    >>> re.match(regex, 'a caca').groups()
    ('caca',)
    >>> re.match(regex, 'b bdbd').groups()
    ('bdbd',)
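
If the stricter character classes of the original pattern matter, another option is to keep the question's regex unchanged and pick out whichever group participated in the match. The following is a minimal sketch, not part of the original answer; `single_capture` is a hypothetical helper name, and it relies on the standard `Match.lastindex` attribute, which holds the index of the last capturing group that matched.

```python
import re

# Original pattern from the question: two alternative capturing groups.
PATTERN = re.compile(r'(?:a ([ac]+)|b ([bd]+))')


def single_capture(text):
    """Return a 1-tuple holding whichever group matched, or None on no match.

    Hypothetical helper for illustration only.
    """
    m = PATTERN.match(text)
    if m is None:
        return None
    # Exactly one of the two groups participates in any successful match,
    # so m.lastindex is the index of that group (1 or 2).
    return (m.group(m.lastindex),)


print(single_capture('a caca'))   # ('caca',)
print(single_capture('b bdbd'))   # ('bdbd',)
print(single_capture('a caxyz'))  # ('ca',)
```

With this approach the capture stops at the first character outside the class, so 'a caxyz' yields ('ca',), whereas the lookahead pattern followed by (.*) would capture 'caxyz'.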