python regex backreference

Bug in Python's re module (backreference)?


I want to match:

first second

and

second first

so the regular expression:

re.match(r'(?:(?P<f>first) (?P<s>second)|(?P=s) (?P=f))', 'first second')

matches, but this one:

re.match(r'(?:(?P<f>first) (?P<s>second)|(?P=s) (?P=f))', 'second first')

does not match. Is this a bug with backreferences in A|B?


Solution

  • How about:

    (?=.*(?P<f>first))(?=.*(?P<s>second))
    

    (?=...) is a positive lookahead: it asserts that the word first is present somewhere in the string without making it part of the match (it is a zero-length assertion). The same applies to second.

    This regex matches if both first and second appear in the string, in any order.
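
    A minimal sketch of how this could be checked in Python. The helper name has_both is made up for illustration and is not part of the re module. The original pattern is included for comparison: as far as I can tell its behavior is not a bug, because in Python's re a backreference to a group that has not taken part in the match simply fails, so the second alternative of A|B can never succeed on its own.

```python
import re

# The lookahead pattern suggested above: one zero-width assertion per word.
ANY_ORDER = re.compile(r'(?=.*(?P<f>first))(?=.*(?P<s>second))')

# The pattern from the question, kept only for comparison.
ORIGINAL = re.compile(r'(?:(?P<f>first) (?P<s>second)|(?P=s) (?P=f))')


def has_both(text):
    """Hypothetical helper: True if both words occur somewhere in text."""
    return ANY_ORDER.match(text) is not None


print(has_both('first second'))   # both words present
print(has_both('second first'))   # order does not matter
print(has_both('first only'))     # 'second' is missing

# In the second alternative, groups f and s are still unset,
# so the backreferences fail and there is no match.
print(ORIGINAL.match('second first'))
```

    Note that the lookahead match itself is empty (the groups f and s still capture the words), and that the pattern also accepts strings with other text around or between the two words. If only the two exact strings should match, a plain alternation such as first second|second first may be the simpler option.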