Python regex lookbehind swap groups


I was practicing with the re module and ran into an interesting problem.

I can easily substitute two words:

re.sub("30 apples", r"apples 30", 'Look 30 apples.') # 'Look apples 30.'

But I want to swap the two words only if 30 comes before apples.

How can I do this?

I tried a lookbehind:
re.sub(r'(?<=\d\d) apples', r'\2 \1', 'Look 30 apples.')

But it does not accept the groups \1 and \2.


Solution

  • When you use the pattern (?<=\d\d) apples, the match starts right after the two digits and consists only of a space plus apples. To swap the two values, the pattern has to consume both of them, and a lookbehind, as you can see, does not consume text.

    Thus, you need capturing groups in the pattern and the corresponding backreferences in the replacement:

    result = re.sub(r"(\d+)(\s+)(apples)", r"\3\2\1", 'Look 30 apples.')
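
    To see concretely why the lookbehind attempt fails, a small sketch can inspect what that pattern actually matches. The variable names below are illustrative only and are not part of the original question.

```python
import re

text = "Look 30 apples."

# The lookbehind only asserts that two digits precede the match;
# the digits themselves are not part of the matched text.
m = re.search(r"(?<=\d\d) apples", text)
matched_text = m.group(0)   # only the space and the word
match_start = m.start()     # index just after the digits

# The pattern defines no capturing groups, so \1 and \2 refer to nothing
# and re.sub raises re.error.
try:
    re.sub(r"(?<=\d\d) apples", r"\2 \1", text)
    lookbehind_error = None
except re.error as exc:
    lookbehind_error = exc

print(repr(matched_text), match_start, lookbehind_error)
```

    The digits never appear in the match, so there is nothing for the replacement to move.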
    

    Details

    • (\d+) - Capturing group 1 (\1 in the replacement pattern): one or more digits
    • (\s+) - Capturing group 2 (\2 in the replacement pattern): one or more whitespace characters
    • (apples) - Capturing group 3 (\3 in the replacement pattern): the literal text apples.
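
    The group numbering listed above can be checked directly on a single match. This is a minimal sketch; Match.expand is used here only to show what the replacement template produces for one match.

```python
import re

m = re.search(r"(\d+)(\s+)(apples)", "Look 30 apples.")

# groups() returns the captured text in group order: \1, \2, \3.
captured = m.groups()

# Match.expand applies a replacement template to this single match,
# which mirrors what re.sub does for each match it finds.
swapped = m.expand(r"\3\2\1")

print(captured, repr(swapped))
```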

    Python demo:

    import re
    result = re.sub(r"(\d+)(\s+)(apples)", r"\3\2\1", "Look 30 apples.")
    print(result)  # Look apples 30.
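
    Since the question asks for the swap to happen only when the number comes before apples, it may help to confirm that other inputs are left untouched. The sketch below wraps the same pattern in a helper; the function name swap_number_and_apples is hypothetical and chosen only for illustration.

```python
import re

# Hypothetical helper name, used here only for illustration.
def swap_number_and_apples(text):
    # Same pattern and replacement as in the answer above.
    return re.sub(r"(\d+)(\s+)(apples)", r"\3\2\1", text)

# Digits followed by whitespace and "apples": the two are swapped.
swapped = swap_number_and_apples("Look 30 apples.")

# The number comes after "apples": nothing matches, text is unchanged.
already_swapped = swap_number_and_apples("Look apples 30.")

# A different word follows the number: nothing matches either.
other_word = swap_number_and_apples("Look 30 pears.")

# Group 2 carries the original whitespace over into the result.
wide_gap = swap_number_and_apples("Look 30   apples.")

print(swapped, already_swapped, other_word, wide_gap, sep="\n")
```

    Because re.sub returns the input unchanged when the pattern does not match, no separate check for the word order is needed.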