Tags: powershell, encoding, hash, md5, azure-blob-storage

Comparing string outputs between Azure Properties.ContentMD5 and Get-FileHash


How do I compare the output of Get-FileHash directly with the output of Properties.ContentMD5?


I'm putting together a PowerShell script that takes some local files from my system and copies them to an Azure Blob Storage Container.

The files change daily, so I have added a check to see whether the file already exists in the container before uploading it.

I use Get-FileHash to read the local file:

$LocalFileHash = (Get-FileHash "D:\file.zip" -Algorithm MD5).Hash

Which results in $LocalFileHash holding this: 67BF2B6A3E6657054B4B86E137A12382

I use this code to get the checksum of the blob file already transferred to the container:

$BlobFile = "Path\To\file.zip"
$AZContext = New-AZStorageContext -StorageAccountName $StorageAccountName -SASToken "<token here>"

$RemoteBlobFile = Get-AzStorageBlob -Container $ContainerName -Context $AZContext -Blob $BlobFile -ErrorAction Ignore
if ($RemoteBlobFile) {
    $cloudblob = [Microsoft.Azure.Storage.Blob.CloudBlockBlob]$RemoteBlobFile.ICloudBlob
    $RemoteBlobHash = $cloudblob.Properties.ContentMD5
}

The value of $RemoteBlobHash is then Z78raj5mVwVLS4bhN6Ejgg==

No problem, I thought, I'll just decode the Base64 string and compare:

$output = [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($RemoteBlobHash))

Which gives me g�+j>fWKK��7�#� so not directly comparable ☹


This question shows someone in a similar pickle but I don't think they were using Get-FileHash given the format of their local MD5 result.

Other things I've tried:

  • changing UTF8 in the System.Text.Encoding line above to UTF16 and ASCII, which changes the output but not to anything recognisable.
  • dabbling with GetBytes to see if that helped:
$output = [System.Text.Encoding]::UTF8.GetBytes([System.Text.Encoding]::UTF16.GetString([System.Convert]::FromBase64String($RemoteBlobHash)))

Note: Using md5sum to compare the local file and a downloaded copy of file.zip results in the same MD5 string as Get-FileHash: 67BF2B6A3E6657054B4B86E137A12382

Thank you in advance!


Solution

  • ContentMD5 is the base64 encoding of the raw binary hash bytes, not of the hex string that Get-FileHash prints, so decoding it as text will never produce readable characters :)

    $md5sum = [convert]::FromBase64String('Z78raj5mVwVLS4bhN6Ejgg==')
    $hdhash = [BitConverter]::ToString($md5sum).Replace('-','')
    

    Here we convert base64 -> binary -> hexadecimal
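
To tie this back to the check in the question, the comparison could be sketched as below. The two values are the ones quoted in the question. In a real script they would come from Get-FileHash and Properties.ContentMD5, and the variable names $RemoteHex and $HashesMatch are only illustrative.

```powershell
# Values quoted in the question; in a real script these would come from
# (Get-FileHash ... -Algorithm MD5).Hash and $cloudblob.Properties.ContentMD5.
$LocalFileHash  = '67BF2B6A3E6657054B4B86E137A12382'
$RemoteBlobHash = 'Z78raj5mVwVLS4bhN6Ejgg=='

# base64 -> raw bytes -> hex string with the dashes removed
$RemoteBytes = [Convert]::FromBase64String($RemoteBlobHash)
$RemoteHex   = [BitConverter]::ToString($RemoteBytes).Replace('-', '')

# -eq on strings is case-insensitive in PowerShell, so hex casing does not matter
$HashesMatch = $LocalFileHash -eq $RemoteHex
```

With these two values $HashesMatch ends up as $true, so the upload could be skipped for this file.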


    If you need to do it the other way around (i.e. obtain a local file hash, then use it to search for blobs in Azure), you'll first need to split the hexadecimal string into two-character chunks (one per byte), convert each chunk to a byte, then convert the resulting byte array to base64:

    $hdhash = '67BF2B6A3E6657054B4B86E137A12382'
    $bytes  = [byte[]]::new($hdhash.Length / 2)
    for($i = 0; $i -lt $bytes.Length; $i++){
      $offset = $i * 2
      $bytes[$i] = [convert]::ToByte($hdhash.Substring($offset,2), 16)
    }
    $md5sum = [convert]::ToBase64String($bytes)
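
If the hex string is not needed for anything else, another option is to skip it and hash the file straight to base64 with the .NET MD5 class. This is a swapped-in alternative to Get-FileHash rather than part of the answer above, and the function name Get-Md5Base64 is hypothetical. A minimal sketch, assuming the file is readable from the local file system:

```powershell
# Hypothetical helper (not part of any Az module): returns the MD5 of a file
# as a base64 string, the same shape that ContentMD5 uses.
function Get-Md5Base64 {
    param([Parameter(Mandatory)][string]$Path)

    # .NET file APIs resolve relative paths against the process directory,
    # so resolve the path through PowerShell first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    $md5    = [System.Security.Cryptography.MD5]::Create()
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        # ComputeHash returns the raw 16 hash bytes; encode them as base64
        [Convert]::ToBase64String($md5.ComputeHash($stream))
    }
    finally {
        $stream.Dispose()
        $md5.Dispose()
    }
}
```

Calling Get-Md5Base64 -Path "D:\file.zip" should then give a string that can be compared directly against Properties.ContentMD5. Use -ceq for that comparison, because base64 is case-sensitive and the default -eq is not.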