python python-re

re.sub doesn't delete the pattern in a txt file


with open("C:\code\code\Music Bot\lyrics.txt", "w+") as f:
    f.write(re.sub("\[.*\]", "", f.read()))
f.close()

The code opens a text file of song lyrics and should delete any section that has words inside [], e.g. [verse]. I've tried putting these lines in different places in the program, but nothing changes. I've checked with other people, and the pattern also seems to be fine. Any ideas?


Solution

  • After f.read(), the file position is at the end of the file, so writing there appends the modified text instead of replacing the original. The file would then contain the original text, followed by the text with the bracketed sections removed. You need to seek back to the beginning before writing, and truncate the file after writing.

    You should also open the file in r+ mode, which lets you read before writing. w+ empties the file as soon as it is opened (it is meant for writing before reading), so f.read() returns an empty string and there is nothing left to substitute.

    You should also use a non-greedy quantifier. With the greedy .*, the match on each line runs from the first [ to the last ], so any lyrics between two tags on the same line are removed too.

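    import re

    # r+ opens for reading and writing without emptying the file
    with open(r"C:\code\code\Music Bot\lyrics.txt", "r+") as f:
        contents = f.read()
        # Non-greedy .*? removes each [...] tag separately
        contents = re.sub(r"\[.*?\]", "", contents)
        f.seek(0)      # go back to the start before writing
        f.write(contents)
        f.truncate()   # drop what is left of the longer original text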
    

    Use raw strings for pathnames and regular expressions, so the backslashes will be treated literally.
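
To see the points above side by side, here is a small self-contained sketch that runs against a temporary file instead of the real lyrics file. The sample text and the helper name strip_bracket_tags are made up for illustration and are not part of the original code.

```python
import os
import re
import tempfile

# Made-up sample lyrics; two tags share the first line on purpose.
sample = "[Verse 1] Hello [Chorus] world\n[Bridge]\nGoodbye\n"

# Greedy: on each line, removes from the first "[" through the last "]".
greedy = re.sub(r"\[.*\]", "", sample)
# Non-greedy: removes each bracketed tag on its own.
lazy = re.sub(r"\[.*?\]", "", sample)

print(repr(greedy))  # "Hello" is lost along with the tags
print(repr(lazy))    # only the tags are removed


def strip_bracket_tags(path):
    """Hypothetical helper: remove [..] tags from the file in place."""
    with open(path, "r+") as f:
        contents = re.sub(r"\[.*?\]", "", f.read())
        f.seek(0)      # rewind so the write overwrites from the start
        f.write(contents)
        f.truncate()   # cut off the tail of the longer original text


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lyrics.txt")
    with open(path, "w") as f:
        f.write(sample)

    strip_bracket_tags(path)
    with open(path) as f:
        result = f.read()

    # Opening with w+ empties the file immediately, so read() sees nothing.
    with open(path, "w+") as f:
        after_w_plus = f.read()

print(repr(result))
print(repr(after_w_plus))
```

The last step shows why the original code appeared to do nothing useful: with w+ there is no text left to run re.sub on by the time f.read() is called.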