I have a simple regex pattern that replaces a space between paragraph sign and a number with an underscore: Find: § (\d+) Replace: §_$1
It works fine and turns "§ 5" into "§_5", but here are cases when there are more numbers following the paragraphs sign, so how to turn "§s 23, 12" also into "§s_1,_2_and_3"? Is it possible with regex?
I have tried modifying the original regex pattern, (§|§-d) (\d+) finds only some of the cases, and I have no idea how to write the replacement.
Thanks in advance!
You can use
re.sub(r'§\w*\s+\d+(?:,\s*\d+)*', lambda x: re.sub(r'\s+', '_', x.group()), text)
The re.sub(r'\s+', '_', x.group()
replaces chunks of one or more whitespaces with a single _
char.
See the regex demo, details:
§
- a §
char\w*
- zero or more word chars\s+
- one or more whitespaces\d+
- one or more digits(?:,\s*\d+)*
- zero or more sequences of a comma, zero or more whitespaces and one or more digits.See the Python demo:
import re
text = "§s 23, 12"
print(re.sub(r'§\w*\s+\d+(?:,\s*\d+)*', lambda x: re.sub(r'\s+', '_', x.group()), text))
# => §s_23,_12