python python-3.x regex regex-group regexp-replace

Replacing only the matched group in a file with multiple occurrences

Input: /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */

Output: /* ABCD X 1111 # [[reason for comment]] */

Regex used: regex = (?:[\/*]+\sPRQA[\s\w\,]*)(\*\/\s*\/\*\Comment[\w\,]+:)+(?:\s\[\[.*\/$)

How can I use the above regex to replace the matched group with '#' in a file with multiple occurrences?

I tried re.sub(regex, '#\1', file.read(), re.MULTILINE), but this adds # to the matched group instead of replacing it.

Is there a direct way to do this instead of iterating line by line and then replacing?

Solution

You can use

re.sub(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)', r'\1#\2', file.read())
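To apply this to a file in one pass, without looping over lines, the whole content can be read, substituted, and written back. A minimal sketch, assuming a UTF-8 text file small enough to hold in memory; the function name collapse_comments and the way the path is passed in are illustrative, not part of the question:

```python
import re
from pathlib import Path

# Same pattern as above: groups 1 and 2 are kept, the text between them is dropped.
RX = re.compile(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)')


def collapse_comments(path):
    """Rewrite the file at `path` in place and return the number of replacements."""
    p = Path(path)
    text = p.read_text(encoding="utf-8")       # assumption: the file is UTF-8 text
    new_text, count = RX.subn(r'\1#\2', text)  # subn also reports how many matches were replaced
    if count:
        p.write_text(new_text, encoding="utf-8")
    return count
```

Using re.subn instead of re.sub is only a convenience here: it lets the caller skip the write when nothing matched.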

If you are sure these substrings only appear at the end of lines, add your $ anchor back and use flags=re.M:

re.sub(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)$', r'\1#\2', file.read(), flags=re.M)
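The flag is passed by keyword deliberately: the fourth positional parameter of re.sub is count, not flags, so a bare re.MULTILINE in that position would be read as a replacement limit. The sketch below, using made-up two-line sample text, shows how the anchored pattern behaves with and without the flag:

```python
import re

rx = r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)$'
text = ("/* ABCD X 1 */ /* Comment 1: [[first]] */\n"
        "/* ABCD Y 2 */ /* Comment 2: [[second]] */")

# Without re.M, $ only matches at the very end of the string,
# so only the occurrence on the last line is rewritten.
without_flag = re.sub(rx, r'\1#\2', text)

# With flags=re.M, $ also matches before each newline,
# so the occurrences on both lines are rewritten.
with_flag = re.sub(rx, r'\1#\2', text, flags=re.M)

print(without_flag)
print(with_flag)

# re.MULTILINE has the integer value 8; passed positionally it would
# fill the `count` parameter and mean "at most 8 replacements".
print(int(re.MULTILINE))
```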

Regex details:

(/\*\s*ABCD[^*/]*) - Group 1 (\1): /*, zero or more whitespaces, ABCD, and then any zero or more chars other than * and /
\*/\s*/\*\s*Comment[^*:]+: - */, zero or more whitespaces, /*, zero or more whitespaces, Comment, one or more chars other than * and : and then :
(\s*\[\[[^][]*]]\s*\*/) - Group 2 (\2): zero or more whitespaces, [[, zero or more chars other than [ and ], ]], zero or more whitespaces, */.
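The same breakdown can be kept next to the pattern by compiling it with re.VERBOSE, which ignores whitespace outside character classes and allows # comments. This is a sketch of the unanchored pattern only; the names RX_VERBOSE and sample are illustrative:

```python
import re

RX_VERBOSE = re.compile(r'''
    ( /\* \s* ABCD [^*/]* )         # group 1: /*, optional whitespace, ABCD, chars other than * and /
    \*/ \s* /\* \s*                 # dropped: */, optional whitespace, /*, optional whitespace
    Comment [^*:]+ :                # dropped: Comment, one or more chars other than * and :, then :
    ( \s* \[\[ [^][]* ]] \s* \*/ )  # group 2: optional whitespace, [[...]], optional whitespace, */
''', re.VERBOSE)

sample = "/* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */"

m = RX_VERBOSE.search(sample)
print(repr(m.group(1)))                  # '/* ABCD X 1111 '
print(repr(m.group(2)))                  # ' [[reason for comment]] */'
print(RX_VERBOSE.sub(r'\1#\2', sample))  # /* ABCD X 1111 # [[reason for comment]] */
```

Group 1 keeps the space before the first */ and group 2 keeps the space after the colon, which is why the replacement r'\1#\2' needs no extra spaces around #.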

See Python demo:

import re

# Groups 1 and 2 surround the part to drop; with re.M, $ anchors each match at a line end
rx = r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)$'
text = "Some text ... /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */\nMore text here... Some text ... /* ABCD XD 1222 */ /* Comment 1112: [[reason for comment 2]] */"
# Keep group 1, insert '#', keep group 2
print(re.sub(rx, r'\1#\2', text, flags=re.M))

Output:

Some text ... /* ABCD X 1111 # [[reason for comment]] */
More text here... Some text ... /* ABCD XD 1222 # [[reason for comment 2]] */