
How to get list_blobs to behave like gsutil


I would like to only get the first level of a fake folder structure on GCS.

If I run e.g.:

gsutil ls 'gs://gcp-public-data-sentinel-2/tiles/'

I get a list like this:

gs://gcp-public-data-sentinel-2/tiles/01/
gs://gcp-public-data-sentinel-2/tiles/02/
gs://gcp-public-data-sentinel-2/tiles/03/
gs://gcp-public-data-sentinel-2/tiles/04/
gs://gcp-public-data-sentinel-2/tiles/05/
gs://gcp-public-data-sentinel-2/tiles/06/
gs://gcp-public-data-sentinel-2/tiles/07/
gs://gcp-public-data-sentinel-2/tiles/08/
gs://gcp-public-data-sentinel-2/tiles/09/
gs://gcp-public-data-sentinel-2/tiles/10/
gs://gcp-public-data-sentinel-2/tiles/11/
gs://gcp-public-data-sentinel-2/tiles/12/
gs://gcp-public-data-sentinel-2/tiles/13/
gs://gcp-public-data-sentinel-2/tiles/14/
gs://gcp-public-data-sentinel-2/tiles/15/
. . .

Running code like the following with the Python API gives me an empty result:

from google.cloud import storage
bucket_name = 'gcp-public-data-sentinel-2'
prefix = 'tiles/'
storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
for blob in bucket.list_blobs(max_results=10, prefix=prefix,
                              delimiter='/'):
    print(blob.name)

If I don't use the delimiter option, I get every object under the prefix, which is not very useful.


Solution

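  • With a delimiter, the sub-"folders" are returned as prefixes rather than as blobs, which is why the loop in the question prints nothing. Maybe not the best way, since it relies on a private method, but, inspired by this comment on the official repository: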

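    iterator = bucket.list_blobs(delimiter='/', prefix=prefix)
    # _get_next_page_response is a private method and fetches only one page.
    response = iterator._get_next_page_response()
    # Use a separate loop variable so the original prefix is not overwritten.
    for subprefix in response.get('prefixes', []):
        print('gs://' + bucket_name + '/' + subprefix)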
    

    Gives:

    gs://gcp-public-data-sentinel-2/tiles/01/
    gs://gcp-public-data-sentinel-2/tiles/02/
    gs://gcp-public-data-sentinel-2/tiles/03/
    gs://gcp-public-data-sentinel-2/tiles/04/
    gs://gcp-public-data-sentinel-2/tiles/05/
    gs://gcp-public-data-sentinel-2/tiles/06/
    gs://gcp-public-data-sentinel-2/tiles/07/
    gs://gcp-public-data-sentinel-2/tiles/08/
    gs://gcp-public-data-sentinel-2/tiles/09/
    gs://gcp-public-data-sentinel-2/tiles/10/
    ...
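
The same result can probably be obtained without touching a private method. The sketch below assumes that the installed google-cloud-storage version exposes a `prefixes` attribute on the iterator returned by `list_blobs`, and that this attribute is only filled in as pages are fetched, so the iterator has to be consumed first. The helper name `first_level_prefixes` is made up for illustration.

```python
def first_level_prefixes(bucket, prefix):
    """Return the sorted sub-"folder" prefixes directly under `prefix`.

    `bucket` is expected to behave like google.cloud.storage.Bucket.
    Assumption: the iterator returned by list_blobs has a `prefixes` set
    that is populated while its pages are being fetched.
    """
    iterator = bucket.list_blobs(prefix=prefix, delimiter='/')
    # Consume every page; blobs sitting directly under `prefix` are ignored here.
    for _ in iterator:
        pass
    return sorted(iterator.prefixes)


# Example usage (needs google-cloud-storage, credentials and network access):
#   from google.cloud import storage
#   bucket_name = 'gcp-public-data-sentinel-2'
#   bucket = storage.Client().get_bucket(bucket_name)
#   for subprefix in first_level_prefixes(bucket, 'tiles/'):
#       print('gs://' + bucket_name + '/' + subprefix)
```

Unlike the single-page call above, this walks all pages, so it should also pick up prefixes beyond the first page of results. Leaving out `max_results` is deliberate, since limiting the results could cut the listing short before all prefixes are collected.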