
Merging files on disk in Python


I am trying to merge key-value pairs stored in multiple files. The files are large, so I cannot load them fully into memory. Each file is sorted and contains one key-value pair per line, so a normal k-way merge would work, but I don't know how to iterate through the files together. Also, when I encounter multiple data points with the same key, I want to combine them (add their values together), so the simple solution in "Implementing an external merge sort" does not work.

I have tried using readlines to iterate over the files and storing indices, but readlines loads the entire file into memory, so it does not achieve what I need. As clarified in the comments, the question boils down to reading consecutive lines from multiple files without using a for line in file loop.


Solution

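  • It seems your real question was "how do I read the next line from a file on demand, without using a for line in file loop?"

    The answer to that is file.readline(). Each call returns the next line, including its trailing newline, and returns an empty string once the end of the file is reached, so only one line per file needs to be held in memory at a time.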
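To tie readline() back to the merge described in the question, here is a minimal sketch of a k-way merge that sums the values of equal keys. It is one possible approach, not the only one. The function names merge_files and parse are hypothetical, and the line format (key, a tab, then an integer value, with no blank lines) is an assumption, since the question does not specify it. Keys are compared as strings, so the input files must be sorted in that same order.

```python
import heapq


def parse(line):
    # Assumed line format: "<key>\t<integer value>\n" (not given in the question).
    key, value = line.rstrip("\n").split("\t")
    return key, int(value)


def merge_files(in_paths, out_path):
    """Merge sorted key-value files, adding the values of equal keys."""
    files = [open(path) for path in in_paths]
    try:
        # Seed the heap with the first line of every non-empty file.
        heap = []
        for index, f in enumerate(files):
            line = f.readline()
            if line:  # readline() returns "" at end of file
                key, value = parse(line)
                heap.append((key, index, value))
        heapq.heapify(heap)

        with open(out_path, "w") as out:
            current_key, total = None, 0
            while heap:
                key, index, value = heapq.heappop(heap)
                if key == current_key:
                    # Same key as the previous entry: combine the values.
                    total += value
                else:
                    # New key: flush the finished one, then start a new total.
                    if current_key is not None:
                        out.write(f"{current_key}\t{total}\n")
                    current_key, total = key, value

                # Refill the heap from the file the popped entry came from.
                line = files[index].readline()
                if line:
                    next_key, next_value = parse(line)
                    heapq.heappush(heap, (next_key, index, next_value))

            # Flush the last key, if any input was read at all.
            if current_key is not None:
                out.write(f"{current_key}\t{total}\n")
    finally:
        for f in files:
            f.close()
```

At any moment the heap holds at most one entry per input file, so memory use depends on the number of files rather than their size. A call might look like merge_files(["part1.txt", "part2.txt"], "merged.txt"), where the file names are placeholders.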