python file copy

Python script to concatenate all the files in the directory into one file


I have written the following script to concatenate all the files in the directory into one single file.

Can this be optimized in terms of:

  1. idiomatic Python

  2. running time

Here is the snippet:

import time, glob

outfilename = 'all_' + str((int(time.time()))) + ".txt"

filenames = glob.glob('*.txt')

with open(outfilename, 'wb') as outfile:
    for fname in filenames:
        with open(fname, 'r') as readfile:
            infile = readfile.read()
            for line in infile:
                outfile.write(line)
            outfile.write("\n\n")

Solution

  • Use shutil.copyfileobj to copy data:

    import glob
    import shutil
    
    # outfilename is built as in the question's snippet
    with open(outfilename, 'wb') as outfile:
        for filename in glob.glob('*.txt'):
            if filename == outfilename:
                # don't want to copy the output into the output
                continue
            with open(filename, 'rb') as readfile:
                shutil.copyfileobj(readfile, outfile)
    

    shutil.copyfileobj reads from the readfile object in chunks and writes them directly to the outfile file object. Do not use readline() or iterate over the file line by line, since you do not need the overhead of finding line endings.

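    Use the same mode for both reading and writing. This is especially important in Python 3, where text mode yields str and binary mode expects bytes, so writing text-mode data to a file opened with 'wb' raises a TypeError. I've used binary mode for both here.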
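For reference, here is a minimal self-contained sketch of the same approach wrapped in a function. The name `concatenate_files` and the `separator` parameter are illustrative assumptions, not part of the answer above. The separator reproduces the blank lines that the question's script wrote after each file. The input list is built and filtered before copying, so the output file is never copied into itself.

```python
import glob
import os
import shutil


def concatenate_files(pattern, outfilename, separator=b"\n\n"):
    """Concatenate every file matching `pattern` into `outfilename`.

    Hypothetical helper for illustration; returns the list of input files used.
    """
    out_path = os.path.abspath(outfilename)
    # Sort for a deterministic order and skip the output file itself,
    # in case it already exists and matches the pattern.
    filenames = sorted(
        name for name in glob.glob(pattern)
        if os.path.abspath(name) != out_path
    )
    # Binary mode on both sides: copyfileobj moves raw chunks of bytes.
    with open(outfilename, "wb") as outfile:
        for name in filenames:
            with open(name, "rb") as readfile:
                shutil.copyfileobj(readfile, outfile)
            # The separator must be bytes because outfile is binary.
            outfile.write(separator)
    return filenames
```

A call such as `concatenate_files('*.txt', outfilename)` would then stand in for the loop in the answer. Pass `separator=b""` if the files should be joined with nothing between them.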