Google Cloud Storage: How to Delete a folder (recursively) in Python


I am trying to delete a folder in GCS, along with all of its contents (including subdirectories), using the Python client library. I understand that GCS doesn't really have folders (only object name prefixes), but how can I do this?

I tested this code:

from google.cloud import storage

def delete_blob(bucket_name, blob_name):
    """Deletes a blob from the bucket."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(blob_name)

    blob.delete()

delete_blob('mybucket', 'top_folder/sub_folder/test.txt')
delete_blob('mybucket', 'top_folder/sub_folder/')

The first call to delete_blob worked, but the second one did not. How can I delete a folder recursively?


Solution

  • To delete everything starting with a certain prefix (for example, a directory name), list the blobs that match the prefix and delete each one:

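    from google.cloud import storage

    bucket_name = 'mybucket'  # bucket name taken from the question
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    # The trailing slash keeps names such as 'some/directory2/...' out of the results.
    blobs = bucket.list_blobs(prefix='some/directory/')
    for blob in blobs:
        blob.delete()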

    Note that this loop issues one delete request per object, so for very large buckets with millions or billions of objects it may be slow. In that case you'll want something more complex, such as deleting in multiple threads or using lifecycle configuration rules to arrange for the objects to be deleted.
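
As a rough illustration of the multi-threaded option mentioned above, here is a minimal sketch. The helper name delete_prefix and its max_workers parameter are made up for this example and are not part of the library; the only library calls it relies on are bucket.list_blobs(prefix=...) and blob.delete(), the same ones used in the answer. It assumes the bucket object can safely be used from several threads at once, which you should confirm for your library version.

```python
from concurrent.futures import ThreadPoolExecutor


def delete_prefix(bucket, prefix, max_workers=8):
    """Delete every object whose name starts with `prefix`.

    `bucket` is expected to behave like a google.cloud.storage Bucket:
    it needs list_blobs(prefix=...) returning objects with delete().
    Returns the number of objects deleted.
    """
    if not prefix:
        # Guard (an assumption of this sketch): an empty prefix would
        # match every object in the bucket.
        raise ValueError("prefix must not be empty")
    # Add a trailing slash so 'top_folder' does not also match 'top_folder2/...'.
    if not prefix.endswith("/"):
        prefix += "/"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each delete is its own request; the pool runs several at a time.
        results = pool.map(lambda blob: blob.delete(),
                           bucket.list_blobs(prefix=prefix))
        # Consuming the results re-raises any exception from a worker thread.
        return sum(1 for _ in results)
```

With the question's names, a call would look like delete_prefix(storage.Client().bucket('mybucket'), 'top_folder/sub_folder'). If any single delete fails, the exception surfaces in the caller, and objects that were already deleted stay deleted.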