Search code examples
regexreplaceparagraph

How to replace spaces in "§ 1", "§s 1, 2, [indeterminate number]" all with underscores in regex?


I have a simple regex pattern that replaces a space between paragraph sign and a number with an underscore: Find: § (\d+) Replace: §_$1

It works fine and turns "§ 5" into "§_5", but here are cases when there are more numbers following the paragraphs sign, so how to turn "§s 23, 12" also into "§s_1,_2_and_3"? Is it possible with regex?

I have tried modifying the original regex pattern, (§|§-d) (\d+) finds only some of the cases, and I have no idea how to write the replacement.

Thanks in advance!


Solution

  • You can use

    re.sub(r'§\w*\s+\d+(?:,\s*\d+)*', lambda x: re.sub(r'\s+', '_', x.group()), text)
    

    The re.sub(r'\s+', '_', x.group() replaces chunks of one or more whitespaces with a single _ char.

    See the regex demo, details:

    • § - a § char
    • \w* - zero or more word chars
    • \s+ - one or more whitespaces
    • \d+ - one or more digits
    • (?:,\s*\d+)* - zero or more sequences of a comma, zero or more whitespaces and one or more digits.

    See the Python demo:

    import re
    text = "§s 23, 12"
    print(re.sub(r'§\w*\s+\d+(?:,\s*\d+)*', lambda x: re.sub(r'\s+', '_', x.group()), text))
    # => §s_23,_12