Content-MD5 Header calculation when using Azure DataBox


I am planning a file migration to Azure Blob Storage using Azure Data Box. (For reference, I plan on using the SMB model rather than NFS or REST.)

Is the Content-MD5 calculated in that scenario, or should I calculate it locally and explicitly set the x-ms-blob-content-md5 blob property after the data is restored into the storage account?

Thanks


Solution

  • When you transfer files to Azure Blob Storage with Azure Data Box using the SMB model, the Content-MD5 is not calculated during the transfer. Calculate the Content-MD5 locally and set the x-ms-blob-content-md5 blob property explicitly after the data has been restored into the storage account.

    You can use the code below to calculate the Content-MD5 locally and set the x-ms-blob-content-md5 blob property with the Azure Storage Python SDK.

    Code:

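    import hashlib

    from azure.storage.blob import BlobServiceClient

    blob_service_client = BlobServiceClient.from_connection_string("<Your connection string>")
    blob_client = blob_service_client.get_blob_client(container="sample", blob="vm1example.txt")

    # Hash the local file in chunks so large files are not read into memory at once.
    md5 = hashlib.md5()
    with open(r"/path/to/local/file", "rb") as f:
        for chunk in iter(lambda: f.read(4 * 1024 * 1024), b""):
            md5.update(chunk)

    # set_http_headers overrides every content setting, so start from the existing ones.
    content_settings = blob_client.get_blob_properties().content_settings
    content_settings.content_md5 = bytearray(md5.digest())
    blob_client.set_http_headers(content_settings=content_settings)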
    
    

    Reference:
    Manage properties and metadata for a blob with Python - Azure Storage | Microsoft Learn
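
Once the property is set, it can be worth checking that the value stored on the blob matches the local file. The following is a minimal sketch, not part of the original answer: the helper names `file_md5`, `content_md5_header`, and `blob_md5_matches` are hypothetical, and it assumes that `get_blob_properties().content_settings.content_md5` returns the raw 16-byte digest (or `None` when no MD5 is stored), as the Python SDK does to my knowledge. The header itself carries the Base64 encoding of that raw digest, not the hex string.

```python
import base64
import hashlib


def file_md5(path, chunk_size=4 * 1024 * 1024):
    """Return the raw 16-byte MD5 digest of a local file, read in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.digest()


def content_md5_header(digest):
    """Base64 form of the digest, as carried in Content-MD5 / x-ms-blob-content-md5."""
    return base64.b64encode(digest).decode("ascii")


def blob_md5_matches(blob_client, path):
    """Compare a local file's MD5 with the MD5 stored on the blob.

    Assumption: blob_client behaves like azure.storage.blob.BlobClient, i.e.
    get_blob_properties().content_settings.content_md5 holds the raw digest,
    or None when the blob has no MD5 stored.
    """
    stored = blob_client.get_blob_properties().content_settings.content_md5
    return stored is not None and bytes(stored) == file_md5(path)
```

With the `blob_client` from the answer above, `blob_md5_matches(blob_client, r"/path/to/local/file")` should return `True` after the property has been set, and `False` for a blob that has no MD5 stored or whose local copy differs.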