
Get Progress in Python File Upload to Azure


I'm uploading files to Azure like so:

with open(tempfile, "rb") as data:
    blob_client.upload_blob(data, blob_type='BlockBlob',  length=None, metadata=None)

How can I get a progress indication? When I try uploading as a stream, it only uploads one chunk.

I'm sure I'm doing something wrong, but I can't find any info.

Thanks!


Solution

  • It looks like the Azure library doesn't include a callback function to monitor progress.

    Fortunately, you can wrap Python's file object in a class that invokes a callback every time the file is read.

    Try this:

    import os
    from io import BufferedReader, FileIO
    
    
    class ProgressFile(BufferedReader):
        # For binary opening only

        def __init__(self, filename, read_callback):
            # FileIO always reads raw bytes, so mode 'r' is a binary read here
            f = FileIO(filename, mode='r')
            self._read_callback = read_callback
            # Explicit super() arguments, consistent with read() below
            super(ProgressFile, self).__init__(raw=f)

            # I prefer Pathlib but this should still support 2.x
            self.length = os.stat(filename).st_size

        def read(self, size=None):
            position = self.tell()
            data = super(ProgressFile, self).read(size)
            # Report the bytes actually returned; near the end of the file
            # this can be fewer than the requested size
            self._read_callback(position=position, read_size=len(data), total=self.length)
            return data
    
    
    
    def my_callback(position, read_size, total):
        # Write your own callback. You could convert the absolute values to percentages
        
        # Using .format rather than f'' for compatibility
        print("position: {position}, read_size: {read_size}, total: {total}".format(position=position,
                                                                                    read_size=read_size,
                                                                                    total=total))
    
    
    myfile = ProgressFile(filename='mybigfile.txt', read_callback=my_callback)
    
    

    Then pass it to the upload call:

    try:
        blob_client.upload_blob(myfile, blob_type='BlockBlob', length=None, metadata=None)
    finally:
        # Close the file even if the upload raises
        myfile.close()
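
The callback comment above suggests converting the absolute values to percentages. A minimal sketch of that idea follows; `percent_done` and `percent_callback` are hypothetical names, not part of any library. The callback keeps the same keyword signature as `my_callback`, so it could be passed as `read_callback`. The result is clamped because a read may request more bytes than remain in the file.

```python
def percent_done(position, read_size, total):
    """Return the percentage of the file consumed once the current read finishes."""
    if total <= 0:
        return 100.0  # empty file: there is nothing left to read
    # A read may ask for more bytes than remain, so cap at the file size.
    done = min(position + read_size, total)
    return 100.0 * done / total


def percent_callback(position, read_size, total):
    # Same keyword arguments as my_callback, so it is a drop-in replacement.
    print("uploaded {:.1f}%".format(percent_done(position, read_size, total)))


percent_callback(position=0, read_size=50, total=200)  # prints "uploaded 25.0%"
```

Keep in mind that this measures how much of the file has been read from disk, which is only a proxy for what has actually reached the server.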
    

    Edit: It looks like tqdm (a progress-bar library) has a neat wrapper for this: https://github.com/tqdm/tqdm#hooks-and-callbacks. The bonus there is that you get easy access to a pretty progress bar.
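
As a rough sketch of the tqdm route, the snippet below uses `tqdm.wrapattr`, which wraps a stream's `read` method so that each call advances a progress bar. The helper name `upload_with_progress_bar` and its `upload` parameter are assumptions made for illustration, and I have not verified that `upload_blob` accepts the wrapped object in every SDK version.

```python
import os


def upload_with_progress_bar(path, upload):
    """Call upload(fileobj) while tqdm draws a byte-based progress bar."""
    # Imported here so tqdm stays an optional dependency (pip install tqdm).
    from tqdm import tqdm

    total = os.path.getsize(path)
    with open(path, "rb") as raw:
        # wrapattr returns a proxy whose read() updates the bar by the
        # number of bytes returned; the bar is closed when the block exits.
        with tqdm.wrapattr(raw, "read", total=total,
                           desc=os.path.basename(path)) as fobj:
            return upload(fobj)
```

Under those assumptions, usage might look like `upload_with_progress_bar('mybigfile.txt', lambda f: blob_client.upload_blob(f, blob_type='BlockBlob'))`.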