Regex to Remove Block Comments but Keep Empty Lines?


Is it possible to use a regex to remove a block comment while keeping the line breaks it contains?

Let's say I have this text:

text = """Keep this /* this has to go
this should go too but leave empty line */
This stays on line number 3"""

I came up with this regex:

text = re.sub(r'/\*.*?\*/', '', text, flags=re.DOTALL)

But this gives me:

Keep this 
This stays on line number 3

What I want is:

Keep this

This stays on line number 3

Can it be done?


Solution

  • You can keep your pattern and pass a lambda callback as the replacement argument to re.sub:

    import re
    
    text = """Keep this /* this has to go
    this should go too but leave empty line */
    This stays on line number 3"""
    
    # For each /* ... */ match, delete everything except the newlines inside it.
    text = re.sub(r'/\*.*?\*/', lambda m: re.sub(r'[^\n]+', '', m.group()), text, flags=re.S)
    print(text)
    

    This prints:

    Keep this 
    
    This stays on line number 3
    

    The lambda receives each /* ... */ match and deletes every character in it except newlines. The comment therefore collapses to exactly the line breaks it spanned, and the lines after it keep their original line numbers. Text outside the comment is left untouched, which is why the first line still ends with a trailing space ("Keep this ").
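
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the names `BLOCK_COMMENT` and `strip_block_comments` are made up for illustration, and it swaps the inner `re.sub` for `str.count('\n')`, which should give the same result because removing every non-newline character leaves just the newlines.

```python
import re

# Hypothetical names, not part of the original answer.
# re.DOTALL lets '.' match newlines so a comment can span several lines.
BLOCK_COMMENT = re.compile(r'/\*.*?\*/', flags=re.DOTALL)

def strip_block_comments(source: str) -> str:
    """Remove /* ... */ comments but keep every newline they contained."""
    # Replace each comment with as many newlines as it spanned,
    # so the lines that follow keep their original line numbers.
    return BLOCK_COMMENT.sub(lambda m: '\n' * m.group().count('\n'), source)

sample = "a = 1; /* one-line comment */\nb = 2;\n/* spans\ntwo lines */ c = 3;\n"
cleaned = strip_block_comments(sample)
print(cleaned)

# The number of line breaks is unchanged.
print(cleaned.count('\n') == sample.count('\n'))
```

Note that a purely regex-based approach like this does not know about string literals, so a `/*` inside a quoted string would be treated as the start of a comment. For real source code a tokenizer is the safer choice.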