Python: removing only the first instance of a tag from a text file?


I've got a file that at the moment reads along these lines:

Text
Text
</tag> <tag> Line
Text
Text
</tag> <tag> Line
Text
</tag>
etc.

I'd like to remove only the first instance of </tag>, since it has no matching opening tag and shouldn't be there.

So far I've tried something along the lines of:

with open('document.txt', 'r+') as doc:
   for line in doc:
      line = line.replace("</tag>", " ")
   doc.write(line)

but this doesn't seem to do anything to the file.

I've also tried a different approach. Since I'm the one inserting the </tag> <tag> markers into the document, I tried not inserting the first </tag> at all: I add a lone <tag> first and then insert the remaining pairs:

#insert first tag
with open('document.txt', 'r+') as doc:
   for line in doc:
      with open('document.txt', 'r') as doc:
         lines = doc.readlines()

      if line.__contains__("Line"):
         lines.insert(3, "<tag>")

      with open('document.txt', 'w') as doc:
         contents = doc.writelines(lines)

#insert other sets of tags
with open('document.txt', 'r+') as doc:
    for line in doc:
        with open('document.txt', 'r') as doc:
                lines = doc.readlines()
        
        for index, line in enumerate(lines):      
            if line.__contains__("Line") and not line.__contains__("<tag>"):
                lines.insert(index, "</tag> <tag>")
                break
    
        with open('document.txt', 'w') as doc:
            contents = doc.writelines(lines)

However, this again seems to give me the same result as before: all of the tags are present, including the first </tag>.

Can anyone point me in the right direction to fix this? Apologies if the above is shoddy coding and there's a simple fix.

Thanks in advance


Solution

  • str.replace(old, new [, count]) takes an optional argument count; when it is given, only the first count occurrences are replaced:

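    filename = "file.txt"
    # Read the whole file into memory
    with open(filename) as doc:
        data = doc.read()
    # The third argument (count) limits the replacement to the first occurrence
    data = data.replace("</tag>", " ", 1)
    with open(filename, "w") as doc:
        doc.write(data)
    with open(filename) as doc:
        print(doc.read())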
    

    Out:

    Text
    Text
      <tag> Line
    Text
    Text
    </tag> <tag> Line
    Text
    </tag>
    etc.
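
If the file is too large to read into memory comfortably, the same idea can be applied line by line. The sketch below is one possible way to do that and is not part of the original answer: the helper name remove_first and its parameters are made up for illustration, and it assumes a text file in the platform's default encoding. It also avoids the problem in the question's first attempt, where line = line.replace(...) only rebinds a local variable and never changes the file itself.

```python
import os
import tempfile


def remove_first(path, target="</tag>", replacement=" "):
    """Rewrite `path` with only the first occurrence of `target` replaced.

    Hypothetical helper for illustration.
    Returns True if a replacement was made, False otherwise.
    """
    replaced = False
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then swap it in,
    # so the original is never left half-written.
    with open(path, "r") as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for line in src:
            if not replaced and target in line:
                # count=1 semantics: touch only the first match on this line
                line = line.replace(target, replacement, 1)
                replaced = True
            dst.write(line)
    # Both files are closed here; replace the original with the edited copy.
    os.replace(dst.name, path)
    return replaced
```

Calling remove_first("document.txt") would then strip only the leading </tag> and leave every later </tag> <tag> pair untouched.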