Tags: object, google-cloud-platform, google-cloud-storage, bucket

How can I use wildcards in my GCP bucket object paths?


My main problem is that I want to check whether an object exists in a GCP bucket. Here is what I tried:

from google.cloud import storage
client = storage.Client()
path_exists = False
for blob in client.list_blobs('models', prefix='trainedModels/mddeep256_sarim'):
    path_exists = True
    break

That worked fine for me. But now the problem is that I don't know the model name (mddeep256 here); I only know the trailing part, _sarim.

So I want to use something like:

for blob in client.list_blobs('models', prefix='trainedModels/*_sarim'):

I want to use the * wildcard. How can I do that?


Solution

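  • list_blobs doesn't support regexes or wildcards in prefix; the prefix is matched literally. You need to filter the results yourself, as mentioned by Guillaume.

    The following should work. Note that it takes a regular expression, so the pattern is written as trainedModels/.*_sarim rather than trainedModels/*_sarim: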

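    import re
    from google.cloud import storage

    def is_object_exist(bucket_name, object_pattern):
        """Return True if any object name in the bucket matches the regex."""
        client = storage.Client()
        # object_pattern is a regular expression, not a glob: pass
        # r'trainedModels/.*_sarim' rather than 'trainedModels/*_sarim'.
        regex = re.compile(object_pattern)
        # re.match anchors at the start of the name; any() stops at the
        # first hit instead of building a list of every matching blob.
        return any(regex.match(b.name) for b in client.list_blobs(bucket_name))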
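
If you would rather keep the * syntax from the question, a glob-based variant is possible. The following is only a sketch, not the answerer's code: glob_prefix, any_name_matches and object_exists are hypothetical helper names, and it assumes Python's fnmatch semantics are acceptable (in particular, * also matches "/"). It sends the literal part of the pattern as prefix so the server narrows the listing, then applies the wildcard on the client.

```python
from fnmatch import fnmatchcase


def glob_prefix(pattern):
    """Return the literal part of a glob pattern before its first wildcard."""
    for i, ch in enumerate(pattern):
        if ch in "*?[":
            return pattern[:i]
    return pattern


def any_name_matches(names, pattern):
    """Return True if any name matches the glob pattern (case-sensitive)."""
    return any(fnmatchcase(name, pattern) for name in names)


def object_exists(bucket_name, pattern):
    """Check a GCS bucket for at least one object matching a glob pattern."""
    # Imported here so the two helpers above work without the GCS client.
    from google.cloud import storage

    client = storage.Client()
    # The server filters by the literal prefix only; the wildcard part of
    # the pattern is matched client-side against each returned name.
    blobs = client.list_blobs(bucket_name, prefix=glob_prefix(pattern))
    return any_name_matches((b.name for b in blobs), pattern)
```

With the names from the question, this would be called as object_exists('models', 'trainedModels/*_sarim'). If _sarim is really a folder-like prefix with files underneath it, the pattern needs a trailing wildcard, e.g. 'trainedModels/*_sarim*'. Recent releases of google-cloud-storage also appear to accept a match_glob argument on list_blobs, which would do the matching server-side; check the version you have installed before relying on it.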