Tags: c#, azure, azure-storage, azure-blob-storage

Traversing a specific Azure Blob Storage directory and deleting all files


The goal: Delete all files within a specific directory, including files within nested folders.

The problem: Deleting the directory itself does not work as this returns an error:

Exception: The specified blob does not exist

My Azure blob storage structure could look like this:

AzureFileStorageAccount
  AzureContainerName
    /themes
      /irrelevantstuff
    /images
      /a
        1.jpg
      /b
        /thumb
          1thumb.png
      /c
        4.jpg
      6.jpg
      9.jpg
      10.jpg

I don't know what any of the folder names are, but for the end result, I want to get a stack/list of all of the actual files found within a given directory.

For example, taking the images directory:

/images/a/1.jpg
/images/b/thumb/1thumb.png
/images/c/4.jpg
/images/6.jpg
/images/9.jpg
/images/10.jpg

I then want to delete all of them.


Here's my attempted solution.

LoadInitialDirectory function:

public static async Task LoadInitialDirectory() {
    string initialDirectory = "images";

    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(azureFileStorageAccount);
    CloudBlobClient client = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer container = client.GetContainerReference(azureContainerName);
    CloudBlobDirectory directory = container.GetDirectoryReference(initialDirectory);

    // Non-flat listing: returns only the first 350 items one level below the directory
    var blobs = await directory.ListBlobsSegmentedAsync(false, BlobListingDetails.Metadata, 350, null, null, null);
    foreach(var blob in blobs.Results)
    {
        var b = new CloudBlob(blob.Uri);
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(b.Name);
        if (blockBlob.Exists()) {
            // I will assume this is a file
            ProcessFile(blockBlob.Uri.ToString());
        }
        else {
            // This is another directory; pass its path relative to the container
            await ProcessDirectory(b.Name);
        }
    }
}

ProcessDirectory function:

public static async Task ProcessDirectory(string innerDirectory) {
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(azureFileStorageAccount);
    CloudBlobClient client = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer container = client.GetContainerReference(azureContainerName);
    CloudBlobDirectory directory = container.GetDirectoryReference(innerDirectory);

    var blobs = await directory.ListBlobsSegmentedAsync(false, BlobListingDetails.Metadata, 350, null, null, null);
    foreach(var blob in blobs.Results)
    {
        var b = new CloudBlob(blob.Uri);
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(b.Name);
        if (blockBlob.Exists()) {
            ProcessFile(blockBlob.Uri.ToString());
        }
        else {
            // Recurse into the nested directory
            await ProcessDirectory(b.Name);
        }
    }
}

ProcessFile function:

public static void ProcessFile(string blobUri) {
    // Collect the blob URI so it can be deleted later
    myStack.Push(blobUri);
}

At the end of this, I should have a stack of blob Uri strings that I can iterate through and delete by the DeleteAsync method, therefore deleting the initial directory.

This seems to be overkill. Does anyone have any ideas for more compact, straightforward solutions?


Solution

  • One thing to note: in blob storage, a directory (and any sub-directory) is not a separate object; it is simply part of the blob name. Once you delete all the blobs inside a directory, the directory disappears automatically.

    The way to delete all the blobs inside a directory (and its sub-directories) is to list all the blobs and then delete them one by one.
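To make the "directory is part of the blob name" point concrete, here is a minimal sketch that needs no Azure connection. It treats the container as a flat list of names and selects everything under a directory by prefix. The class name PrefixDemo, the helper BlobsUnder, and the themes file name are hypothetical; the images names come from the layout in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixDemo
{
    // Hypothetical helper: returns every blob name under the given virtual directory.
    public static List<string> BlobsUnder(IEnumerable<string> blobNames, string directory)
    {
        // Normalise to a trailing slash so "images" does not also match "images-old/...".
        string prefix = directory.TrimEnd('/') + "/";
        return blobNames
            .Where(name => name.StartsWith(prefix, StringComparison.Ordinal))
            .ToList();
    }

    public static void Main()
    {
        // A container is a flat set of names; the slashes only look like folders.
        var blobNames = new List<string>
        {
            "themes/irrelevantstuff/placeholder.txt", // hypothetical file name
            "images/a/1.jpg",
            "images/b/thumb/1thumb.png",
            "images/c/4.jpg",
            "images/6.jpg",
            "images/9.jpg",
            "images/10.jpg",
        };

        // Print every name that a flat listing of "images" would cover.
        foreach (string name in BlobsUnder(blobNames, "images"))
        {
            Console.WriteLine(name);
        }
    }
}
```

Deleting each of the selected names is all it takes to remove the "directory", because nothing else represents it.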

    Assuming you're using the Microsoft.Azure.Storage.Blob package, version 11.1.3, you can call the blobDirectory.ListBlobs() method with the useFlatBlobListing parameter set to true, which lets you iterate through all the blobs within the specified directory, including those in its sub-directories.

    The sample code below works for me:

            // Requires: using Microsoft.Azure.Storage; using Microsoft.Azure.Storage.Blob;
            var conn_str = "DefaultEndpointsProtocol=https;AccountName=xxx;AccountKey=xxxxxx;EndpointSuffix=core.windows.net";
            var myContainer = "aaa";
            var myDirectory = "images";
    
            CloudStorageAccount storageAccount = CloudStorageAccount.Parse(conn_str);
            CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
            CloudBlobContainer blobContainer = blobClient.GetContainerReference(myContainer);
            CloudBlobDirectory blobDirectory = blobContainer.GetDirectoryReference(myDirectory);
    
            // Set useFlatBlobListing to true to list all the blobs in the directory (and its sub-directories)
            var blobs = blobDirectory.ListBlobs(useFlatBlobListing: true);
    
            // Iterate through all the blobs in the specified directory (and its sub-directories)
            foreach (var myblob in blobs)
            {
                // A flat listing returns only blobs; casting to the CloudBlob base class also covers page and append blobs
                var b = (CloudBlob)myblob;
    
                // Print some properties of the blob, just for testing purposes
                Console.WriteLine(b.Name);
                Console.WriteLine(b.Uri);
                Console.WriteLine("***********");
    
                // Delete the blob
                b.Delete();
            }
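
Since the question uses the async API, the same idea can be sketched with ListBlobsSegmentedAsync. This is only a sketch under the same assumption as above (Microsoft.Azure.Storage.Blob, version 11.x); the class name BlobDirectoryCleaner and the method DeleteDirectoryAsync are hypothetical. It follows the continuation token so that directories with more blobs than fit in a single segment are fully processed.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Blob;

public static class BlobDirectoryCleaner
{
    // Hypothetical helper: deletes every blob under a virtual directory and returns how many were deleted.
    public static async Task<int> DeleteDirectoryAsync(
        string connectionString, string containerName, string directoryName)
    {
        CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
        CloudBlobClient client = account.CreateCloudBlobClient();
        CloudBlobContainer container = client.GetContainerReference(containerName);
        CloudBlobDirectory directory = container.GetDirectoryReference(directoryName);

        int deleted = 0;
        BlobContinuationToken token = null;
        do
        {
            // Flat listing (first argument true) returns blobs from all nested "folders".
            BlobResultSegment segment = await directory.ListBlobsSegmentedAsync(
                true,                     // useFlatBlobListing
                BlobListingDetails.None,  // no metadata needed just to delete
                null,                     // let the service pick the segment size
                token,                    // where the previous segment stopped
                null,                     // default request options
                null);                    // default operation context

            foreach (IListBlobItem item in segment.Results)
            {
                // With a flat listing every item should be a blob; the type check is defensive.
                if (item is CloudBlob blob && await blob.DeleteIfExistsAsync())
                {
                    deleted++;
                }
            }

            token = segment.ContinuationToken;
        } while (token != null); // a null token means the listing is complete

        return deleted;
    }
}
```

It could be called as `await BlobDirectoryCleaner.DeleteDirectoryAsync(conn_str, "aaa", "images");`, reusing the placeholder values from the sample above.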