I've got a file that at the moment reads along these lines:
Text
Text
</tag> <tag> Line
Text
Text
</tag> <tag> Line
Text
</tag>
etc.
I'd like to remove only the first instance of </tag> (it is obviously wrong and shouldn't be there).
So far I've tried something along the lines of:
with open('document.txt', 'r+') as doc:
    for line in doc:
        line = line.replace("</tag>", " ")
        doc.write(line)
but this doesn't seem to do anything to the file.
I've also tried a different approach. Since I'm the one inserting the </tag> <tag> pairs into the document, I tried not inserting the first </tag> at all, and only then inserting the rest:
# insert first tag
with open('document.txt', 'r+') as doc:
    for line in doc:
        with open('document.txt', 'r') as doc:
            lines = doc.readlines()
        if line.__contains__("Line"):
            lines.insert(3, "<tag>")
            with open('document.txt', 'w') as doc:
                contents = doc.writelines(lines)
# insert other sets of tags
with open('document.txt', 'r+') as doc:
    for line in doc:
        with open('document.txt', 'r') as doc:
            lines = doc.readlines()
        for index, line in enumerate(lines):
            if line.__contains__("Line") and not line.__contains__("<tag>"):
                lines.insert(index, "</tag> <tag>")
                break
        with open('document.txt', 'w') as doc:
            contents = doc.writelines(lines)
However, this gives me the same result as before: all of the tags are there, including the first </tag>.
Can anyone point me in the right direction to fix this? Apologies if the above is shoddy coding and there's a simple fix.
Thanks in advance
str.replace(old, new[, count]) takes an optional count argument; when it is given, only the first count occurrences are replaced:
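filename = "file.txt"
# Read the whole file into memory
with open(filename) as doc:
    data = doc.read()
# Replace only the first occurrence of the closing tag
data = data.replace("</tag>", " ", 1)
with open(filename, "w") as doc:
    doc.write(data)
with open(filename) as doc:
    print(doc.read())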
Out:
Text
Text
<tag> Line
Text
Text
</tag> <tag> Line
Text
</tag>
etc.
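To see the effect of count without touching a file, here is a minimal sketch that works on an in-memory string. The helper name remove_first and the sample text are made up for illustration; only str.replace with its count argument comes from the answer above. Note that this sketch replaces the tag with an empty string rather than a space, so the space that followed the tag stays at the start of the line.

```python
def remove_first(text, old="</tag>"):
    """Return text with only the first occurrence of `old` removed.

    Hypothetical helper for illustration; it simply wraps
    str.replace(old, new, count) with count=1.
    """
    return text.replace(old, "", 1)


# Sample content mirroring the layout described in the question
sample = (
    "Text\n"
    "</tag> <tag> Line\n"
    "Text\n"
    "</tag> <tag> Line\n"
    "Text\n"
    "</tag>\n"
)

cleaned = remove_first(sample)
print(cleaned)
```

The later </tag> occurrences are left alone because count stops the replacement after the first match. To apply this to a file, read its contents, pass them through the helper, and write the result back, as in the answer's code.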