Tags: python, azure-blob-storage, azure-sdk-python

Write a zip file to Azure Blob Storage


I'm trying to zip the files in the 'input' container and move them to the 'output' container.
I'm using the Azure Python SDK:

# connection to blob storage via Azure Python SDK
connection_string = "myConnectionString"

blob_service_client = BlobServiceClient.from_connection_string(connection_string)

# get container client
input_container = blob_service_client.get_container_client(container="input")

# filename
filename = "document_to_zip.pdf"

# init zip object
zip_filename = "document_zipped.zip"
zip_object = ZipFile(zip_filename, "w")

data = input_container.download_blob(filename).readall()
zip_object.write(data)

# upload blob to results container as .zip file
results_blob = blob_service_client.get_blob_client(container="output",blob=zip_filename)
results_blob.upload_blob(zip_object, overwrite=True)

I get the following error:
Exception: ValueError: stat: embedded null character in path.
A more general question: is my approach to zipping a blob and moving it from one container to another reasonable?

Thanks


Solution

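  • This error is raised because ZipFile.write() expects the path of a file on disk, whereas data holds the downloaded bytes. Those bytes are interpreted as a path, and the binary content of a PDF contains null characters. Removing the zip_object.write(data) line makes the exception go away, but nothing is then added to the archive, so the uploaded .zip cannot be opened after download. To add in-memory bytes to an archive, use ZipFile.writestr(), which takes a member name and the data. Also keep in mind that the code in the question handles only a single file from the input container.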
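
As an illustration of where the exception comes from, here is a minimal sketch that needs no Azure connection. The bytes in data are a made-up stand-in for a downloaded blob, and the member name is just the file name from the question. It shows ZipFile.write() rejecting raw bytes (they are treated as a filesystem path) and ZipFile.writestr() accepting them.

```python
import io
from zipfile import ZipFile

# Made-up stand-in for the bytes returned by download_blob(...).readall()
data = b"%PDF-1.7\x00binary payload"

# write() expects the path of an existing file, so the bytes are treated as a path
try:
    with ZipFile(io.BytesIO(), "w") as archive:
        archive.write(data)
    write_failed = False
except (ValueError, OSError) as exc:
    write_failed = True
    print(f"write() rejected the bytes: {exc!r}")

# writestr() takes a member name plus the data itself
buffer = io.BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("document_to_zip.pdf", data)

# Read the archive back to confirm the member round-trips
with ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    restored = archive.read("document_to_zip.pdf")
```

The exact wording of the exception depends on the platform and Python version, which is why the sketch catches both ValueError and OSError.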

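    The code below (the question's code with the write() call removed) runs without raising the exception, but the archive it uploads cannot be opened once downloaded: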

    from azure.storage.blob import BlobServiceClient
    from zipfile import ZipFile
    
    # connection to blob storage via Azure Python SDK
    connection_string = "<YOUR_CONNECTION_STRING>"
    
    blob_service_client = BlobServiceClient.from_connection_string(connection_string)
    
    # get container client
    input_container = blob_service_client.get_container_client(container="input")
    
    # filename
    filename = "document_to_zip.pdf"
    
    # init zip object
    zip_filename = "document_zipped.zip"
    zip_object = ZipFile(zip_filename, "w")
    
    data = input_container.download_blob(filename).readall()
    
    # upload blob to results container as .zip file
    results_blob = blob_service_client.get_blob_client(container="output",blob=zip_filename)
    results_blob.upload_blob(zip_object, overwrite=True)
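
One possible way to get a valid archive for the single-file case is to build the zip in memory and upload the resulting bytes. The sketch below is an assumption-laden illustration, not the original author's method: zip_single_blob is a hypothetical helper, and its two container arguments are assumed to be ContainerClient objects such as those returned by blob_service_client.get_container_client(...).

```python
import io
from zipfile import ZIP_DEFLATED, ZipFile


def zip_single_blob(input_container, output_container, blob_name, zip_name):
    """Download one blob, zip it in memory, and upload the archive.

    Hypothetical helper: both containers are assumed to be
    azure.storage.blob ContainerClient objects.
    """
    # readall() returns the blob content as bytes
    data = input_container.download_blob(blob_name).readall()

    buffer = io.BytesIO()
    # Leaving the with-block closes the archive, which writes the zip's
    # central directory; an unclosed archive is not a valid zip file
    with ZipFile(buffer, "w", compression=ZIP_DEFLATED) as archive:
        archive.writestr(blob_name, data)

    zip_bytes = buffer.getvalue()
    output_container.upload_blob(name=zip_name, data=zip_bytes, overwrite=True)
    return zip_bytes


# Usage, assuming blob_service_client was created as shown above:
# zip_single_blob(
#     blob_service_client.get_container_client("input"),
#     blob_service_client.get_container_client("output"),
#     "document_to_zip.pdf",
#     "document_zipped.zip",
# )
```

Uploading the bytes rather than the ZipFile object avoids handing the SDK something that is neither bytes nor a readable stream.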
    



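    You can also process a group of files by looping over the blobs in the input container. Note that the code below copies each blob into the output container under a ZipFolder.zip/ name prefix (a virtual folder); it does not create a compressed archive.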

    from azure.storage.blob import BlobServiceClient
    from zipfile import ZipFile
    
    connection_string = "<Your_CONNECTION_STRING>"
    
    blob_service_client = BlobServiceClient.from_connection_string(connection_string)
    
    input_container = blob_service_client.get_container_client(container="input")
    
    # list_blobs() yields one BlobProperties object per blob in the container
    for blob in input_container.list_blobs():
        data = input_container.download_blob(blob.name).readall()
        # "ZipFolder.zip/" is only a name prefix (virtual folder), not a real archive
        results_blob = blob_service_client.get_blob_client(container="output", blob="ZipFolder.zip/" + blob.name)
        results_blob.upload_blob(data, overwrite=True)
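
If the goal is a single real .zip containing every blob from the input container, one option is to add each download to the same in-memory archive and upload it once at the end. This is a sketch under assumptions: zip_container is a hypothetical helper, both arguments are assumed to be ContainerClient objects, and the whole archive is assumed to fit in memory.

```python
import io
from zipfile import ZIP_DEFLATED, ZipFile


def zip_container(input_container, output_container, zip_name="ZipFolder.zip"):
    """Zip every blob in input_container into one archive and upload it.

    Hypothetical helper: both containers are assumed to be
    azure.storage.blob ContainerClient objects.
    """
    buffer = io.BytesIO()
    member_names = []
    with ZipFile(buffer, "w", compression=ZIP_DEFLATED) as archive:
        for blob in input_container.list_blobs():
            data = input_container.download_blob(blob.name).readall()
            # The blob name becomes the member name inside the archive
            archive.writestr(blob.name, data)
            member_names.append(blob.name)

    # The archive is closed here, so the buffer holds a complete zip file
    output_container.upload_blob(
        name=zip_name, data=buffer.getvalue(), overwrite=True
    )
    return member_names


# Usage, assuming blob_service_client was created as shown above:
# zip_container(
#     blob_service_client.get_container_client("input"),
#     blob_service_client.get_container_client("output"),
# )
```

For containers too large to hold in memory, the same loop could write to a temporary file on disk instead of a BytesIO buffer.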
    
