c# azure azure-storage azure-blob-storage

Is batch deletion a better way to perform a directory delete in Azure storage?


I am wondering what the most efficient way is to delete a directory using C# (or just a batch of blobs, for that matter; I am aware that Azure Storage does not have the concept of directories or folders). Right now I'm deleting all the blobs in a folder in parallel (using Parallel.ForEach), which takes about a minute for 420 blobs totalling 11 MB.

The code looks something like:

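Parallel.ForEach(urisToDelete.Distinct(), uri =>
{
    // Parallel.ForEach does not await async lambdas, so each call is blocked on here.
    var blobReference = this.cloudBlobClient.GetBlobReferenceFromServerAsync(uri).GetAwaiter().GetResult();
    blobReference.DeleteAsync().GetAwaiter().GetResult();
});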

I am trying to optimize this process and I came across Microsoft's documentation on batch deletion. Switching my project over to it just for performance testing is a bit complicated. Does anyone know whether this method of deletion performs better? Does anyone know a better method?

Thanks a lot!


Solution

  • Yes, batch deletion via the batch client performs better than using Parallel.ForEach(xxx).

    Batch delete combines multiple Azure Blob Storage delete operations into a single request, so far fewer round trips are needed. Parallel.ForEach, by contrast, sends a separate request for every blob, which results in lower performance.

    You should be aware of its limitations, though: for example, a single batch supports at most 256 subrequests. You can read this article for more details.

    As far as I know, this is the best-performing way to delete multiple blobs.
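
    To make the 256-subrequest limit concrete, here is a minimal sketch of batch deletion. It assumes the newer Azure.Storage.Blobs and Azure.Storage.Blobs.Batch packages rather than the CloudBlobClient used in the question, and the class name BatchDeleteSketch and the Chunk helper are made up for illustration. It has not been measured against the Parallel.ForEach version, so treat it as a starting point for your own performance test.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized; // BlobBatchClient and GetBlobBatchClient()

public static class BatchDeleteSketch
{
    // Limit mentioned in the answer: at most 256 subrequests per batch.
    public const int MaxSubrequestsPerBatch = 256;

    // Hypothetical helper: splits the URIs into groups of at most `size` items.
    public static IEnumerable<List<Uri>> Chunk(IEnumerable<Uri> uris, int size)
    {
        var current = new List<Uri>(size);
        foreach (var uri in uris)
        {
            current.Add(uri);
            if (current.Count == size)
            {
                yield return current;
                current = new List<Uri>(size);
            }
        }
        if (current.Count > 0)
        {
            yield return current;
        }
    }

    // Sends one batch request per group of up to 256 blob URIs.
    public static async Task DeleteBlobsInBatchesAsync(
        BlobServiceClient serviceClient, IEnumerable<Uri> urisToDelete)
    {
        BlobBatchClient batchClient = serviceClient.GetBlobBatchClient();

        foreach (List<Uri> chunk in Chunk(urisToDelete.Distinct(), MaxSubrequestsPerBatch))
        {
            // Throws if any delete in the batch fails.
            await batchClient.DeleteBlobsAsync(chunk);
        }
    }
}
```

    With the 420 blobs from the question, this would send two batch requests (256 + 164) instead of one request per blob. A call would look like await BatchDeleteSketch.DeleteBlobsInBatchesAsync(serviceClient, urisToDelete), where serviceClient is a BlobServiceClient for the storage account.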