Need to calculate the SHA1 hash of a file stored in Azure Storage in C#


I am uploading large files (1-10 GB) to Azure Storage and need to calculate the SHA1 hash of each file once it is uploaded. Can I calculate the SHA1 on the server, without having to download the file?


Solution

  • Azure Blob Storage automatically calculates an MD5 hash for a blob when it is uploaded with Put Blob; see the excerpt below from the Get Blob Properties documentation.

    Content-MD5

    If the Content-MD5 header has been set for the blob, this response header is returned so that the client can check for message content integrity. In version 2012-02-12 and newer, Put Blob sets a block blob’s MD5 value even when the Put Blob request doesn’t include an MD5 header.

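    So unless you specifically need SHA1, it is not necessary to calculate a hash yourself. Note that the excerpt describes Put Blob; for blobs uploaded in blocks, as large files typically are, check that Content-MD5 is actually present before relying on it.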
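
    If MD5 is sufficient, the stored Content-MD5 value (a base64-encoded MD5 digest) can be compared with a hash of the local file computed before upload. The following is a minimal sketch of that comparison; the helper name MatchesContentMd5 is hypothetical, and the header value is assumed to have been read from the blob's properties beforehand.

    ```csharp
    using System;
    using System.IO;
    using System.Security.Cryptography;

    public static class ContentMd5Check
    {
        // Hypothetical helper: returns true when the MD5 of `content`
        // equals a base64-encoded Content-MD5 value.
        public static bool MatchesContentMd5(Stream content, string contentMd5Base64)
        {
            using (MD5 md5 = MD5.Create())
            {
                string actual = Convert.ToBase64String(md5.ComputeHash(content));
                return string.Equals(actual, contentMd5Base64, StringComparison.Ordinal);
            }
        }

        public static void Main()
        {
            // In real use the expected value would come from the blob's properties;
            // here it is the well-known MD5 of an empty payload.
            using (var empty = new MemoryStream())
            {
                Console.WriteLine(MatchesContentMd5(empty, "1B2M2Y8AsgTpgAmY7PhCfg=="));
            }
        }
    }
    ```

    In practice the stream would be a FileStream over the file being uploaded, so the local file is read once and nothing is fetched back from storage.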

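    For reference, here is a sample that calculates the SHA1 hash of a blob without first saving it to a local file. The blob's bytes are still streamed from storage to the client, and the hash is computed on the client as they are read.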

    Synchronous:

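    // Uses the legacy WindowsAzure.Storage client library
    using System.IO;
    using System.Security.Cryptography;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;

    CloudStorageAccount storageAccount = CloudStorageAccount.Parse("<StorageAccountConnectionString>");
    CloudBlobClient     blobClient     = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer  container      = blobClient.GetContainerReference("<container-name>");
    CloudBlob           blob           = container.GetBlobReference("<blob-name>");
    
    // ComputeHash reads the stream in chunks, so the whole blob is never held in memory
    using (Stream blobStream = blob.OpenRead())
    using (SHA1 sha1 = SHA1.Create())
    {
        byte[] checksum = sha1.ComputeHash(blobStream);
    }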
    

    Async:

    CloudStorageAccount storageAccount = CloudStorageAccount.Parse("<StorageAccountConnectionString>");
    CloudBlobClient     blobClient     = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer  container      = blobClient.GetContainerReference("<container-name>");
    CloudBlob           blob           = container.GetBlobReference("<blob-name>");
    
    // The blob's content is streamed to the client and hashed as it is read
    using (Stream blobStream = await blob.OpenReadAsync().ConfigureAwait(false))
    using (SHA1 sha1 = SHA1.Create())
    {
        byte[] checksum = await sha1.ComputeHashAsync(blobStream).ConfigureAwait(false);
    }
    
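    // ComputeHashAsync extension method, adapted from https://www.tabsoverspaces.com/233439-computehashasync-for-sha1
    // Must be declared in a static class; needs System.IO, System.Security.Cryptography, and System.Threading.Tasks.
    // On .NET 5 and later, HashAlgorithm has a built-in ComputeHashAsync(Stream), so this helper is only needed on older frameworks.
    public static async Task<byte[]> ComputeHashAsync(this HashAlgorithm algo, Stream stream, int bufferSize = 4096)
    {
        algo.Initialize();
    
        var buffer = new byte[bufferSize];
        int read;
        // Feed the hash one chunk at a time until the stream is exhausted
        while ((read = await stream.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
        {
            algo.TransformBlock(buffer, 0, read, null, 0);
        }
        // Finalize with an empty block; unlike a Position/Length check, this also works for non-seekable streams
        algo.TransformFinalBlock(buffer, 0, 0);
    
        return algo.Hash;
    }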
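
    The chunked hashing pattern used by the extension method can be checked locally, without a storage account. The following is a minimal sketch that hashes an in-memory stream chunk by chunk and prints the digest as hex; the class and method names (ChunkedSha1Demo, HashInChunks, ToHex) are made up for illustration, and the deliberately small buffer only serves to force several chunks.

    ```csharp
    using System;
    using System.IO;
    using System.Security.Cryptography;
    using System.Text;

    public static class ChunkedSha1Demo
    {
        // Hashes a stream in fixed-size chunks, reading until Read returns 0.
        public static byte[] HashInChunks(Stream stream, int bufferSize)
        {
            using (SHA1 sha1 = SHA1.Create())
            {
                var buffer = new byte[bufferSize];
                int read;
                while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    sha1.TransformBlock(buffer, 0, read, null, 0);
                }
                // An empty final block completes the hash computation.
                sha1.TransformFinalBlock(buffer, 0, 0);
                return sha1.Hash;
            }
        }

        // Formats a digest as lowercase hex, the form SHA1 values are usually shown in.
        public static string ToHex(byte[] hash)
        {
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }

        public static void Main()
        {
            byte[] data = Encoding.ASCII.GetBytes("abc");
            using (var stream = new MemoryStream(data))
            {
                // Prints a9993e364706816aba3e25717850c26c9cd0d89d (the standard SHA1 test vector for "abc")
                Console.WriteLine(ToHex(HashInChunks(stream, 2)));
            }
        }
    }
    ```

    The same ToHex conversion can be applied to the checksum byte array produced by the samples above when the hash needs to be stored or compared as text.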