python python-3.x regex regex-group regexp-replace

Replacing only the matched group in a file with multiple occurrences


Input: /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */

Output: /* ABCD X 1111 # [[reason for comment]] */

Regex used: regex = (?:[\/*]+\sPRQA[\s\w\,]*)(\*\/\s*\/\*\Comment[\w\,]+:)+(?:\s\[\[.*\/$)

How to use the above regex to replace the matched group with '#' in a file with multiple occurrences?

I tried re.sub(regex, '#\1', file.read(), re.MULTILINE), but this keeps the matched group and only puts # in front of it.

Is there a direct way to do this instead of iterating line by line and then replacing?


Solution

  • You can use

    re.sub(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)', r'\1#\2', file.read())
    

    If you are sure these substrings only appear at the end of lines, add your $ anchor back and use flags=re.M:

    re.sub(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)$', r'\1#\2', file.read(), flags=re.M)
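
The keyword form flags=re.M matters. The signature is re.sub(pattern, repl, string, count=0, flags=0), so a flag passed as the fourth positional argument (as in the attempt in the question) is read as count and multiline mode is never switched on. A minimal sketch of the difference, using a made-up three-line string rather than the question's data:

```python
import re

sample = "a1\nb2\nc3"  # hypothetical text, not taken from the question

# Fourth positional argument is `count`: re.M (== 8) only caps the number of
# replacements at 8, multiline mode stays off, so `$` matches at the very end only.
# (Recent Python versions also emit a DeprecationWarning for a positional count.)
as_count = re.sub(r'\d$', '#', sample, re.M)

# Passed by keyword, the flag is applied and `$` matches at the end of each line.
as_flag = re.sub(r'\d$', '#', sample, flags=re.M)

print(repr(as_count))
print(repr(as_flag))
```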
    

    Pattern details:

    • (/\*\s*ABCD[^*/]*) - Group 1 (\1): /*, zero or more whitespaces, ABCD, and then any zero or more chars other than * and /
    • \*/\s*/\*\s*Comment[^*:]+: - */, zero or more whitespaces, /*, zero or more whitespaces, Comment, one or more chars other than * and : and then :
    • (\s*\[\[[^][]*]]\s*\*/) - Group 2 (\2): zero or more whitespaces, [[, zero or more chars other than [ and ], ]], zero or more whitespaces, */.
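
To see the breakdown above in action, here is a small sketch that runs the pattern against the sample line from the question and prints what each group captures. The variable names (rx, line, group1, group2, rebuilt) are only illustrative:

```python
import re

rx = re.compile(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)')
line = "/* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */"

m = rx.search(line)
group1, group2 = m.group(1), m.group(2)
print(repr(group1))  # text kept before the '#'
print(repr(group2))  # text kept after the '#'

# Everything matched outside the two groups is what the replacement drops;
# r'\1#\2' is equivalent to gluing the groups together around '#'.
rebuilt = group1 + '#' + group2
print(rebuilt)
```

The middle part of the match (the closing */, the opening /* and the Comment ...: label) belongs to neither group, which is why it disappears from the result.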

    See Python demo:

    import re
    rx = r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)$'
    text = "Some text ... /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */\nMore text here... Some text ... /* ABCD XD 1222 */ /* Comment 1112: [[reason for comment 2]] */"
    print( re.sub(rx, r'\1#\2', text, flags=re.M) )
    

    Output:

    Some text ... /* ABCD X 1111 # [[reason for comment]] */
    More text here... Some text ... /* ABCD XD 1222 # [[reason for comment 2]] */
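
Since the question asks about a file rather than a string, the same substitution can be applied to the whole file content in one call, with no line-by-line loop. The following is a sketch under assumptions: the helper name merge_comments, the UTF-8 encoding, and the throwaway demo.c file are all made up for illustration, and the un-anchored pattern is used so a match does not have to sit at the end of a line.

```python
import re
import tempfile
from pathlib import Path

RX = re.compile(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)')

def merge_comments(path):
    """Rewrite `path` in place; return how many replacements were made.

    Hypothetical helper: reads the whole file, substitutes once, writes back.
    """
    source = Path(path)
    text = source.read_text(encoding="utf-8")   # encoding is an assumption
    new_text, n = RX.subn(r'\1#\2', text)        # subn also reports the count
    if n:                                        # leave the file alone if nothing matched
        source.write_text(new_text, encoding="utf-8")
    return n

# Demo on a temporary file so nothing real is modified.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.c"
    demo.write_text(
        "int a; /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */\n"
        "int b; /* ABCD XD 1222 */ /* Comment 1112: [[reason for comment 2]] */\n",
        encoding="utf-8",
    )
    replaced = merge_comments(demo)
    result = demo.read_text(encoding="utf-8")

print(replaced)
print(result, end="")
```

Reading the entire file into memory is fine for typical source files; for very large inputs a streaming approach would be needed, but then matches that span lines could be missed.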