python, azure, azure-storage

How to list all blobs inside of a specific subdirectory in Azure Cloud Storage using Python?


I worked through the example code from the Azure docs: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-python

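# BlockBlobService belongs to the legacy azure-storage-blob SDK (2.x and earlier)
from azure.storage.blob import BlockBlobService

account_name = "x"
account_key = "x"
top_level_container_name = "top_container"

# Client for the storage account, authenticated with the account key
blob_service = BlockBlobService(account_name, account_key)

print("\nList blobs in the container")
# list_blobs returns a generator over every blob in the container
generator = blob_service.list_blobs(top_level_container_name)
for blob in generator:
    print("\t Blob name: " + blob.name)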

Now I would like to know how to walk my container at a finer granularity. My container top_level_container_name has several subdirectories:

  • top_level_container_name/dir1
  • top_level_container_name/dir2
  • etc in that pattern

I would like to be able to list all of the blobs that are inside just one of those directories. For instance:

  • dir1/a.jpg
  • dir1/b.jpg
  • etc

How do I get a generator of just the contents of dir1 without having to walk all of the other dirs? (I would also take a list or dictionary)

I tried appending /dir1 to the container name, so that top_level_container_name = "top_container/dir1", but that didn't work. I get back this error: azure.common.AzureHttpError: The requested URI does not represent any resource on the server. ErrorCode: InvalidUri

The docs don't even seem to have any information on BlockBlobService.list_blobs(): https://learn.microsoft.com/en-us/python/api/azure.storage.blob.blockblobservice.blockblobservice?view=azure-python

Update: list_blobs() comes from https://github.com/Azure/azure-storage-python/blob/ff51954d1b9d11cd7ecd19143c1c0652ef1239cb/azure-storage-blob/azure/storage/blob/baseblobservice.py#L1202


Solution

  • Please try something like:

    generator = blob_service.list_blobs(top_level_container_name, prefix="dir1/")
    

    This should restrict the listing to the contents of the dir1 virtual directory, that is, to blobs whose names start with dir1/.

    If you want to list every blob inside the dir1 virtual directory, including blobs in nested virtual directories, try passing an empty delimiter explicitly:

    generator = blob_service.list_blobs(top_level_container_name, prefix="dir1/", delimiter="")
    

    For more information, please see this link.
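
To make the roles of prefix and delimiter concrete, here is a minimal local sketch of how a blob listing filters and groups names, as I understand the List Blobs semantics. It makes no Azure calls: list_names and the sample blob names are hypothetical, made up for illustration, and the real service returns blob objects and prefix entries rather than plain strings.

```python
def list_names(blob_names, prefix="", delimiter=None):
    """Simulate prefix/delimiter filtering of a blob listing (local sketch only)."""
    results = []
    seen_dirs = set()
    for name in sorted(blob_names):
        # Keep only names that start with the requested prefix
        if not name.startswith(prefix):
            continue
        if delimiter:
            rest = name[len(prefix):]
            cut = rest.find(delimiter)
            if cut != -1:
                # Collapse everything past the first delimiter into a single
                # virtual-directory entry, reported once
                virtual_dir = prefix + rest[:cut + len(delimiter)]
                if virtual_dir not in seen_dirs:
                    seen_dirs.add(virtual_dir)
                    results.append(virtual_dir)
                continue
        results.append(name)
    return results


# Hypothetical container contents
names = [
    "dir1/a.jpg",
    "dir1/b.jpg",
    "dir1/sub/c.jpg",
    "dir2/d.jpg",
]

# No delimiter: flat listing of everything under dir1/
flat = list_names(names, prefix="dir1/")

# Delimiter "/": nested names are grouped into one entry per virtual directory
grouped = list_names(names, prefix="dir1/", delimiter="/")

print(flat)
print(grouped)
```

In this sketch, flat contains dir1/a.jpg, dir1/b.jpg and dir1/sub/c.jpg, while grouped contains dir1/a.jpg, dir1/b.jpg and the single entry dir1/sub/. Nothing under dir2/ appears in either result, which is the "without walking the other dirs" behaviour the question asks for.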