Can my extension method be improved? BlockBlobClient GetBlockById (.NET)


I've written an extension method for BlockBlobClient that gets a specific block by its block ID. Can this code be tweaked to improve performance or anything else?

public static async Task<T> GetBlockByIdAsync<T>(this BlockBlobClient blockBlobClient, string blockId, CancellationToken cancellationToken)
{
    var blockListResponse = await blockBlobClient.GetBlockListAsync(cancellationToken: cancellationToken);
    var blockList = blockListResponse.Value.CommittedBlocks.ToList();
            
    var currentBlock = blockList.FirstOrDefault(a => a.Name == blockId);

    if (currentBlock.Name == null)
    {
        throw new InvalidOperationException($"Could not find BlockId {blockId}");
    }
            
    var length = currentBlock.SizeLong;
    var index = blockList.FindIndex(a => a.Name == blockId);

    var offset = 0L;
    for (var i = 0; i < index; i++)
    {
        offset += blockList[i].SizeLong;
    }

    var options = new BlobDownloadOptions()
    {
        Range = new HttpRange(offset, length)
    };

    var blockInfo = await blockBlobClient.DownloadStreamingAsync(options, cancellationToken);

    return JsonSerializer.Deserialize<T>(blockInfo.Value.Content);
}

Solution

  • The only thing that really stands out is that blockList is iterated three times:

    1. blockList.FirstOrDefault(a => a.Name == blockId)
    2. blockList.FindIndex(a => a.Name == blockId)
    3. for (var i = 0; i < index; i++)

    You can do this in a single iteration, something like this:

    public static async Task<T> GetBlockByIdAsync<T>(this BlockBlobClient blockBlobClient, string blockId, CancellationToken cancellationToken)
    {
        var blockListResponse = await blockBlobClient.GetBlockListAsync(cancellationToken: cancellationToken);
        var blockList = blockListResponse.Value.CommittedBlocks.ToList();
    
        var found = false;
        var length = 0L;
        var offset = 0L;
    
        // Iterate over the blocks until we find the one we want.
        foreach (var block in blockList)
        {
            if (block.Name == blockId)
            {
                found = true;
                length = block.SizeLong;
                break;
            }
    
            // Not the block we want yet, so it lies before the target: add its size to the offset.
            offset += block.SizeLong;
        }
    
        // Check whether we found the block; a flag is safer than treating length == 0 as "not found".
        if (!found)
        {
            throw new InvalidOperationException($"Could not find BlockId {blockId}");
        }
                
        var options = new BlobDownloadOptions()
        {
            Range = new HttpRange(offset, length)
        };
    
        var blockInfo = await blockBlobClient.DownloadStreamingAsync(options, cancellationToken);
    
        return JsonSerializer.Deserialize<T>(blockInfo.Value.Content);
    }
    

    CAVEAT: I haven't run this code locally or even tried to compile it, so there may be errors, but the idea is sound.
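
    Since the snippet above needs a real storage account to run, here is a minimal sketch of just the single-pass offset logic that can be checked locally. BlockRange and TryFind are hypothetical names, not part of the Azure SDK, and the (Name, Size) tuples stand in for the Name and SizeLong values of the committed blocks.

```csharp
using System;
using System.Collections.Generic;

public static class BlockRange
{
    // Hypothetical helper (not an Azure SDK API): walks the blocks once and
    // reports the byte offset and length of the block named blockId.
    public static bool TryFind(
        IEnumerable<(string Name, long Size)> blocks,
        string blockId,
        out long offset,
        out long length)
    {
        offset = 0L;
        length = 0L;

        foreach (var (name, size) in blocks)
        {
            if (name == blockId)
            {
                length = size;
                return true;
            }

            // Not the target yet, so this block lies before it.
            offset += size;
        }

        // Not found: reset the offset so callers never see a partial sum.
        offset = 0L;
        return false;
    }
}
```

    For blocks named a, b and c with sizes 10, 20 and 5, looking up c gives offset 30 and length 5, and looking up an unknown ID returns false instead of relying on a zero length.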

    UPDATE: There was one more iteration that I missed: blockListResponse.Value.CommittedBlocks.ToList();

    Granted, calling ToList() prevents CommittedBlocks from being enumerated repeatedly, but this could be even more efficient.

    Assuming blockListResponse.Value.CommittedBlocks returns an IEnumerable<BlobBlock>, you could even remove the ToList() call and enumerate it directly:

    public static async Task<T> GetBlockByIdAsync<T>(this BlockBlobClient blockBlobClient, string blockId, CancellationToken cancellationToken)
    {
        var blockListResponse = await blockBlobClient.GetBlockListAsync(cancellationToken: cancellationToken);
        var blockList = blockListResponse.Value.CommittedBlocks;
    
        var found = false;
        var length = 0L;
        var offset = 0L;
    
        // Iterate over the blocks until we find the one we want.
        // BlobBlock is a struct, so "as BlobBlock" would not compile; foreach needs no cast
        // and also disposes the enumerator for us.
        foreach (var block in blockList)
        {
            if (block.Name == blockId)
            {
                found = true;
                length = block.SizeLong;
                break;
            }
    
            // Not the block we want yet, so it lies before the target: add its size to the offset.
            offset += block.SizeLong;
        }
    
        // Check whether we found the block; a flag is safer than treating length == 0 as "not found".
        if (!found)
        {
            throw new InvalidOperationException($"Could not find BlockId {blockId}");
        }
                
        var options = new BlobDownloadOptions()
        {
            Range = new HttpRange(offset, length)
        };
    
        var blockInfo = await blockBlobClient.DownloadStreamingAsync(options, cancellationToken);
    
        return JsonSerializer.Deserialize<T>(blockInfo.Value.Content);
    }