I was practicing with the re module and ran into an interesting problem.
I can easily substitute two words:
re.sub("30 apples", r"apples 30", 'Look 30 apples.') # 'Look apples 30.'
But I want to swap the two words only if 30 comes before apples.
How can I do this?
I tried a lookbehind:
re.sub('(?<=\d\d) apples', r'\2 \1', 'Look 30 apples.')
But it does not take groups \1 and \2.
When you use a (?<=\d\d) apples pattern, the match starts right after 2 digits and is a space plus apples. If you try to swap the two values, you need to consume both, and the lookbehind, as you see, does not consume text.
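To make this concrete, here is a small sketch that inspects what the lookbehind pattern from the question actually matches. The variable names are only illustrative, and the except clause assumes Python 3, where an unknown group reference in the replacement raises re.error.

```python
import re

text = "Look 30 apples."

# The lookbehind only checks that two digits precede the match;
# the digits themselves are not part of the matched text.
match = re.search(r"(?<=\d\d) apples", text)
matched_text = match.group()    # the consumed text: a space plus "apples"
matched_span = match.span()     # starts right after "30"
group_count = match.re.groups   # number of capturing groups in the pattern

# With no capturing groups defined, a replacement that refers to
# \1 or \2 has nothing to refer to and is rejected.
try:
    re.sub(r"(?<=\d\d) apples", r"\2 \1", text)
    sub_failed = False
except re.error:
    sub_failed = True

print(repr(matched_text), matched_span, group_count, sub_failed)
```

The digits never appear in the matched text, so there is nothing for the replacement to move.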
Thus, you need to use capturing groups here in the pattern and replace with the corresponding backreferences:
result = re.sub(r"(\d+)(\s+)(apples)", r"\3\2\1", 'Look 30 apples.')
Details
(\d+) - Capturing group 1 (\1 in the replacement pattern): one or more digits
(\s+) - Capturing group 2 (\2 in the replacement pattern): one or more whitespaces
(apples) - Capturing group 3 (\3 in the replacement pattern): apples.

import re
result = re.sub(r"(\d+)(\s+)(apples)", r"\3\2\1", "Look 30 apples.")
print(result)
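If the numbered backreferences get hard to read, the same swap could be written with named groups. This is only a sketch: the function name swap_count_and_word, the group names, and the added \b word boundaries are my own assumptions, not part of the pattern above.

```python
import re

# Same idea as the pattern above, with named groups for readability.
# The \b anchors are an added assumption: they stop the pattern from
# matching inside longer tokens such as "x30" or "applesauce".
SWAP_PATTERN = re.compile(r"\b(?P<count>\d+)(?P<gap>\s+)(?P<word>apples)\b")


def swap_count_and_word(text):
    """Move the number after the word, keeping the original whitespace."""
    return SWAP_PATTERN.sub(r"\g<word>\g<gap>\g<count>", text)


print(swap_count_and_word("Look 30 apples."))  # number precedes the word: swapped
print(swap_count_and_word("Look apples 30."))  # word precedes the number: unchanged
```

Since the pattern requires the digits to come first, a string where apples already precedes the number is returned as is, which is the "only if 30 comes before apples" behaviour asked for.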