python ftp zip ftplib

Retrieve file from FTP and directly write into a zip archive


I want to download files from an FTP server and archive them locally in a (zip) archive.

It is well known how to download files and save them individually:

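import ftplib

remote = ftplib.FTP(ftp_server)
remote.login(username, password)
for filename in file_list:
    # Use a with block so each local file is flushed and closed after its transfer
    with open(filename, 'wb') as local_file:
        remote.retrbinary("RETR " + filename, local_file.write)
remote.quit()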

It is also well known how to add files to an archive:

import zipfile

# Mode "w" creates the archive; the default mode "r" is read-only, so write() would fail
archive = zipfile.ZipFile(archive_file, "w")
archive.write(filename)
archive.close()

But it does not seem possible to combine the two:

remote.retrbinary("RETR " + filename, archive.write(filename))

This raises a FileNotFoundError: archive.write(filename) is evaluated immediately and expects filename to already exist on disk, but the file has not been saved to a local (temporary) directory in between.

Is there a way to send the file stream from FTP directly into a zip archive? Or would it be more efficient to download all files first, add them to the archive, and then delete the downloaded files? I would like to keep hard disk I/O as low as possible.


Solution

  • Download the file to memory and use ZipFile.writestr:

    import ftplib
    import zipfile
    from io import BytesIO
    
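    # ... (connect and log in; the connection object is called ftp here)

    archive = zipfile.ZipFile(archive_file, "w")

    for filename in file_list:
        # Collect the downloaded chunks in an in-memory buffer
        flo = BytesIO()
        ftp.retrbinary('RETR ' + filename, flo.write)
        # Store the buffer contents as an archive member named filename
        archive.writestr(filename, flo.getvalue())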
    
    archive.close()
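
  • If the files might be too large to hold in memory, a possible variation is to let ZipFile.open(name, "w") (available since Python 3.6) provide a writable file-like object and pass its write method to retrbinary, so each chunk goes into the archive as it arrives. This is only a sketch: the function name stream_ftp_to_zip is made up for illustration, and ftp is assumed to be an already connected and logged-in ftplib.FTP object.

```python
import zipfile


def stream_ftp_to_zip(ftp, file_list, archive_file):
    """Write each remote file into the zip archive chunk by chunk.

    ftp is assumed to be a connected, logged-in ftplib.FTP instance
    (any object with a compatible retrbinary method also works).
    """
    with zipfile.ZipFile(archive_file, "w") as archive:
        for filename in file_list:
            # archive.open(..., "w") returns a writable handle for one member;
            # retrbinary calls member.write once per received block.
            with archive.open(filename, "w") as member:
                ftp.retrbinary("RETR " + filename, member.write)
```

    Only one block at a time is kept in memory, and nothing is written to disk except the archive itself. For members that may exceed 2 GiB, archive.open also accepts force_zip64=True.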