Most efficient way to modify the last line of a large text file in Python


I need to update the last line of several text files, each larger than 2 GB, that are too big to read with readlines(). Currently it works fine by looping through the file line by line. However, I am wondering whether any compiled library can do this more efficiently. Thanks!

Current approach

    with open("large.XML") as myfile:
        for line in myfile:
            do_something(line)  # placeholder for the per-line processing

Solution

  • Update: Use ShadowRanger's answer. It's much shorter and more robust.

    For posterity:

    Read the last N bytes of the file and search backwards for the newline.

    #!/usr/bin/env python3
    import os
    
    # Files opened in binary mode take bytes, not str, in Python 3
    with open("test.txt", "wb") as testfile:
        testfile.write(b'\n'.join([b"one", b"two", b"three"]) + b'\n')
    
    with open("test.txt", "r+b") as myfile:
        # Read the last 1 KiB of the file.
        # We could make this dynamic, but chances are there's
        # a number like 1 KiB that'll work 100% of the time for you.
        myfile.seek(0, os.SEEK_END)
        filesize = myfile.tell()
        blocksize = min(1024, filesize)
        myfile.seek(-blocksize, os.SEEK_END)
        # Search backwards for a newline (excluding the very last byte
        # in case the file ends with a newline). rindex raises ValueError
        # if the block contains no other newline.
        index = myfile.read().rindex(b'\n', 0, blocksize - 1)
        # Seek to the byte just after the newline
        myfile.seek(index + 1 - blocksize, os.SEEK_END)
        # Read in the last line of the file
        lastline = myfile.read()
        # Modify lastline
        lastline = b"Brand New Line!\n"
        # Seek back to the start of the last line
        myfile.seek(index + 1 - blocksize, os.SEEK_END)
        # Write out the new version of the last line
        myfile.write(lastline)
        myfile.truncate()
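
The comment above mentions that the block size could be made dynamic. A minimal sketch of that idea follows, assuming Python 3 and a file that uses `\n` line endings. The function name `replace_last_line` and its parameters are hypothetical, chosen for illustration only. It walks backwards from the end of the file one block at a time until it finds a newline, so the last line may be longer than a single block.

```python
import os


def replace_last_line(path, new_line, blocksize=1024):
    """Overwrite the last line of the file at `path` with `new_line` (bytes).

    Hypothetical helper: scans backwards block by block for the newline
    that precedes the last line, then rewrites from that point.
    """
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        filesize = f.tell()
        if filesize == 0:
            # Empty file: nothing to replace, just write the new line.
            f.write(new_line)
            return
        # Ignore the final byte so a trailing newline is not mistaken
        # for the newline that precedes the last line.
        pos = filesize - 1
        start_of_last = 0  # default: the whole file is a single line
        while pos > 0:
            read_from = max(0, pos - blocksize)
            f.seek(read_from)
            chunk = f.read(pos - read_from)
            index = chunk.rfind(b"\n")
            if index != -1:
                start_of_last = read_from + index + 1
                break
            pos = read_from  # no newline in this block, step further back
        # Overwrite from the start of the last line and drop any leftover bytes.
        f.seek(start_of_last)
        f.write(new_line)
        f.truncate()
```

With the `test.txt` created earlier, calling `replace_last_line("test.txt", b"Brand New Line!\n")` should have the same effect as the fixed-size version, and only the tail of the file is read regardless of how large the file is.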