Tags: azure-data-factory, azure-data-lake-gen2

ADF dataflow creates blobs with 0 committed blocks in ADLS Gen 2


I'm creating files in an ADLS Gen 2 account using an ADF dataflow. The blob type is shown as "Block blob" in the storage UI, and the file content looks good. However, when I read the blob programmatically, the committed block count is 0. This started after upgrading the blob storage account to ADLS Gen 2, so I'm not sure whether this is an ADF issue or an ADLS Gen 2 issue.

Sample code

using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

string connectionString = "";
string containerName = "test";
string blobName = "test.json";

BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);

BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(containerName);

// BlockBlobClient exposes the block list; the plain BlobClient does not.
BlockBlobClient blobClient = containerClient.GetBlockBlobClient(blobName);

// Request both committed and uncommitted blocks.
var blockList = await blobClient.GetBlockListAsync(BlockListTypes.All);

// For blobs written by the ADF dataflow, this loop prints nothing:
// the blob has content but zero committed blocks.
foreach (var block in blockList.Value.CommittedBlocks)
{
    Console.WriteLine($"Committed block: {block.Name}");
}

foreach (var block in blockList.Value.UncommittedBlocks)
{
    Console.WriteLine($"Uncommitted block: {block.Name}");
}

Solution

  • After trying multiple things, only SAS authentication from ADF to the Gen 2 account produces blobs with committed blocks, and only when the sink uses an Azure Blob Storage linked service rather than an ADLS Gen 2 linked service. This looks like a bug or design flaw in ADF. It should be publicly documented so customers are aware that the ADF–ADLS Gen 2 integration can produce blobs that are inefficient to process when they are large.
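
For reference, a minimal sketch of the workaround's linked-service definition: an Azure Blob Storage (not ADLS Gen 2) linked service authenticating with a SAS URI. The name and placeholder values are illustrative, not taken from the original post; in practice the SAS token would usually come from Key Vault rather than being inlined.

```json
{
  "name": "BlobStorageViaSas",
  "properties": {
    "type": "AzureBlobStorage",
    "typeProperties": {
      "sasUri": "https://<account>.blob.core.windows.net/?<sas-token>"
    }
  }
}
```

Pointing the dataflow sink's dataset at this linked service, instead of one of type `AzureBlobFS` (ADLS Gen 2), is what produced blobs with committed blocks in my tests.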