I need to update the last line from a few more than 2GB files made up of lines of text that can not be read with readlines()
. Currently, it work fine by looping through line by line. However, I am wondering if there is any compiled library can achieve this more efficiently? Thanks!
myfile = open("large.XML")
for line in myfile:
do_something()
Update: Use ShadowRanger's answer. It's much shorter and robust.
For posterity:
Read the last N bytes of the file and search backwards for the newline.
#!/usr/bin/env python
with open("test.txt", "wb") as testfile:
testfile.write('\n'.join(["one", "two", "three"]) + '\n')
with open("test.txt", "r+b") as myfile:
# Read the last 1kiB of the file
# we could make this be dynamic, but chances are there's
# a number like 1kiB that'll work 100% of the time for you
myfile.seek(0,2)
filesize = myfile.tell()
blocksize = min(1024, filesize)
myfile.seek(-blocksize, 2)
# search backwards for a newline (excluding very last byte
# in case the file ends with a newline)
index = myfile.read().rindex('\n', 0, blocksize - 1)
# seek to the character just after the newline
myfile.seek(index + 1 - blocksize, 2)
# read in the last line of the file
lastline = myfile.read()
# modify last_line
lastline = "Brand New Line!\n"
# seek back to the start of the last line
myfile.seek(index + 1 - blocksize, 2)
# write out new version of the last line
myfile.write(lastline)
myfile.truncate()