
How to set ContentMD5 in DataLakeFileClient?


When I upload a file to Azure Data Lake with Microsoft Azure Storage Explorer, a value for the ContentMD5 property is generated and stored automatically. The same happens in a function app that uses a Blob binding.

However, the value is not generated automatically when I upload from a C# DLL.

I want to use this value to compare files in the future.

My code for the upload is very simple.

DataLakeFileClient fileClient = await directoryClient.CreateFileAsync("testfile.txt");
await fileClient.UploadAsync(fileStream);

I also know I can generate an MD5 hash with the code below, but I'm not certain whether this is the same way Azure Storage Explorer does it.

using (var md5gen = MD5.Create())
{
    md5hash = md5gen.ComputeHash(fileStream);
}

However, I have no idea how to assign this value to the ContentMD5 property of the file.


Solution

  • I have found the solution.

    The UploadAsync method has an overload that accepts a parameter of type DataLakeFileUploadOptions. This class exposes an HttpHeaders property of type PathHttpHeaders, which in turn has a ContentHash property (a byte array). The value assigned there is stored with the file as its ContentMD5 property.

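    // Assumes md5hash (byte[]) was computed from fileStream beforehand.
    var uploadOptions = new DataLakeFileUploadOptions
    {
        HttpHeaders = new PathHttpHeaders { ContentHash = md5hash }
    };

    // Hashing reads the stream to its end, so rewind it (seekable streams only) before uploading.
    fileStream.Position = 0;
    await fileClient.UploadAsync(fileStream, uploadOptions);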
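
    On the question of whether this matches what Storage Explorer stores: as far as I know, Content-MD5 is simply the MD5 digest of the file's bytes, usually displayed as a Base64 string, so hashing the stream with MD5.Create() should give a comparable value. Below is a minimal sketch of a helper for this, using only the .NET base class library. The names ContentMd5Helper, ComputeAndRewind and ToBase64 are made up for illustration and are not part of the Azure SDK.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper (not part of the Azure SDK).
public static class ContentMd5Helper
{
    // Computes the MD5 digest of the whole stream, then rewinds the stream
    // so the same instance can be passed to UploadAsync afterwards.
    public static byte[] ComputeAndRewind(Stream stream)
    {
        if (stream == null)
            throw new ArgumentNullException(nameof(stream));
        if (!stream.CanSeek)
            throw new ArgumentException("Stream must be seekable.", nameof(stream));

        stream.Position = 0;
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(stream);
            stream.Position = 0; // ComputeHash leaves the position at the end
            return hash;
        }
    }

    // Base64 form of the digest, handy for comparing against the
    // ContentMD5 value shown by storage tools (assumed to be Base64).
    public static string ToBase64(byte[] hash)
    {
        return Convert.ToBase64String(hash);
    }
}
```

    The byte array returned by ComputeAndRewind is what would go into uploadOptions.HttpHeaders.ContentHash in the code above. For later comparisons, two files can be treated as having the same content when their digests are equal byte for byte (or their Base64 strings match).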