
Python - Upload an in-memory file (generated by API calls) to FTP in chunks


I need to upload a file through FTP and SFTP in Python, but with some unusual constraints.

  1. The file MUST NOT be written to disk.

  2. The file is generated by calling an API and writing the response, which is JSON, to the file.

  3. Building the file takes multiple API calls. The whole result cannot be retrieved in a single call.

  4. I cannot accumulate the full result in a string variable by appending each response until the whole file is in memory. The file could be huge and memory is constrained. Each chunk should be sent and its memory released.

Here is some sample code showing what I would like to do:

import io
import requests
from ftplib import FTP

def chunks_generator():
    range_list = range(0, 4000, 100)
    for i in range_list:
        # requests takes query-string parameters via `params`
        data_chunk = requests.get(url=someurl, params={'offset': i, 'limit': 100})
        yield data_chunk.content  # response body as bytes

def upload_file():
    for chunk in chunks_generator():
        chunk_io = io.BytesIO(chunk)
        ftp = FTP(self.host)
        ftp.login(user=self.username, passwd=self.password)
        ftp.cwd(self.remote_path)
        # Each STOR replaces the remote file, so the chunks are not appended
        ftp.storbinary("STOR " + "myfilename.json", chunk_io)

I want a single file with all the chunks appended. What I already have working is sending the whole file at once when it is entirely in memory, like this:

string_io = io.BytesIO(all_chunks_together_in_one_string)
ftp = FTP(self.host)
ftp.login(user=self.username, passwd=self.password)
ftp.cwd(self.remote_path)
ftp.storbinary("STOR " + "myfilename.json", string_io )

I need this with ftplib, but will need it with Paramiko as well for SFTP. If any other libraries would work better for this, I am open to them.


Solution

  • You can implement a file-like class whose .read(blocksize) method retrieves the data from the requests object.

    Something like this (untested):

    class ChunksGenerator:
        def __init__(self, requests):
            self.requests = requests
            self.i = 0  # offset of the next block to request

        def read(self, blocksize):
            # TODO: somehow detect end-of-file and return b"" in that case,
            # otherwise storbinary never stops calling read()
            response = self.requests.get(
                    url=someurl, params={'offset': self.i, 'limit': blocksize})
            self.i += blocksize
            return response.content  # bytes, as storbinary expects

    generator = ChunksGenerator(requests)
    ftp.storbinary("STOR " + "myfilename.json", generator)
    

    With Paramiko, you can use the same class with the SFTPClient.putfo method.
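
The end-of-file handling left as a TODO above can be sketched by wrapping any iterator of byte chunks (for example a generator that yields one API response body per call) in a small file-like object. This is only a minimal sketch under assumptions: `ChunkReader`, `upload_ftp` and `upload_sftp` are hypothetical names, the chunks are assumed to be `bytes`, and `ftp` / `sftp` are assumed to be an already connected `ftplib.FTP` and `paramiko.SFTPClient`.

```python
class ChunkReader:
    """File-like wrapper around an iterator of bytes chunks (hypothetical helper)."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buffer = b""

    def read(self, size=-1):
        # Pull chunks until the request can be satisfied or the iterator ends.
        while size < 0 or len(self._buffer) < size:
            try:
                self._buffer += next(self._chunks)
            except StopIteration:
                break
        if size < 0:
            data, self._buffer = self._buffer, b""
        else:
            data, self._buffer = self._buffer[:size], self._buffer[size:]
        # An empty bytes object tells the caller that the stream is finished.
        return data


def upload_ftp(ftp, chunks, remote_name):
    # storbinary keeps calling read(blocksize) until it gets b"".
    ftp.storbinary("STOR " + remote_name, ChunkReader(chunks))


def upload_sftp(sftp, chunks, remote_path):
    # putfo reads from the file-like object in the same way.
    sftp.putfo(ChunkReader(chunks), remote_path)
```

With this shape, at most one API chunk plus one read block is buffered at a time, and both uploads use a single connection and a single remote file instead of one STOR per chunk.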