Tags: azure-functions, azure-blob-storage, pickle, dill

Reading file from Azure blob storage using pickle or dill without saving to disk


I'm trying to read weights for a machine learning model from Azure Storage Blob in Python. This should be running in Azure Functions, so I don't believe I'm able to use methods which save the blob to disk.

I'm using azure-storage-blob 12.5.0, not the legacy version.

I've tried using dill.loads to load the .pkl file, like so:

from io import BytesIO

import dill
from azure.storage.blob import BlobClient

connection_string = 'my_connection_string'
blob_client = BlobClient.from_connection_string(connection_string, container_name, blob_name)
downloader = blob_client.download_blob(0)

with BytesIO() as f:
    downloader.readinto(f)
    weights = dill.loads(f)

Which returns:

>>> TypeError: a bytes-like object is required, not '_io.BytesIO'

I'm also not sure what the equivalent approach using pickle would look like. How can this be solved?


Solution

  • Here is how this problem was solved:

    import pickle

    from azure.storage.blob import BlobClient

    def get_weights_blob(blob_name):
        connection_string = 'my_connection_string'
        blob_client = BlobClient.from_connection_string(connection_string, container_name, blob_name)
        downloader = blob_client.download_blob(0)

        # Read the entire blob into memory as bytes and unpickle it
        b = downloader.readall()
        weights = pickle.loads(b)

        return weights
    

    And then retrieving weights by using the function:

    weights = get_weights_blob(blob_name = 'myPickleFile')
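
    The key difference from the failing attempt: pickle.loads (and dill.loads) take raw bytes, not a file object, which is why passing a BytesIO raised the TypeError. readall() returns bytes directly, so no buffer is needed. If you do work with a BytesIO (e.g. via readinto()), you must either rewind it and use pickle.load, or unwrap the bytes with getvalue(). A minimal sketch of all three variants, using a hypothetical in-memory weights dict in place of the downloaded blob:

    ```python
    import pickle
    from io import BytesIO

    # Hypothetical stand-in for downloaded model weights
    weights = {"layer1": [0.1, 0.2, 0.3]}
    payload = pickle.dumps(weights)  # raw bytes, like downloader.readall()

    # pickle.loads() accepts bytes directly
    from_bytes = pickle.loads(payload)

    # With a file-like buffer (as in the readinto() attempt), rewind it
    # and use pickle.load(), which reads from a file object ...
    buf = BytesIO()
    buf.write(payload)
    buf.seek(0)
    from_file = pickle.load(buf)

    # ... or unwrap the bytes with getvalue() before calling loads()
    from_getvalue = pickle.loads(buf.getvalue())
    ```

    The same load/loads distinction applies to dill, which mirrors pickle's API.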