I need help getting the folder names in a GCS bucket (e.g. emaildownloads/20230130/abc/xyz). 'emaildownloads' is the bucket name, and I need to extract the date and the 'abc' folder.
def my_list_bucket(self, bucketName, delimiter='/'):
    storage_client = self.storage_client.lookup_bucket(bucketName)
    blobs = storage_client.list_blobs(prefix='20230130', delimiter='/')
    print("Blobs:")
    for blob in blobs:
        print(blob.name)
    if delimiter:
        print("Prefixes:")
        for prefix in blobs.prefixes:
            print(prefix)
I was able to pull just the folder name, but I need it to be dynamic, and I don't want to hard-code the date. I only need the part of the path between the bucket name and the final forward slash: 20230130/abc/ is all I need from the path name.
If it is returning the full path and you only need part of it, you can split the path at the / character and keep the parts you want.
import sys

def my_list_bucket(self, bucketName, limit=sys.maxsize):
    a_bucket = self.storage_client.lookup_bucket(bucketName)
    bucket_iterator = a_bucket.list_blobs()
    for resource in bucket_iterator:
        # resource.name excludes the bucket name, e.g. '20230130/abc/xyz'
        path_parts = resource.name.split('/')
        if len(path_parts) >= 3:
            date_folder = path_parts[0]
            abc_folder = path_parts[1]
            desired_path = date_folder + '/' + abc_folder + '/'
            print(desired_path)
        limit = limit - 1
        if limit <= 0:
            break
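If many objects share the same date and folder, the loop above will produce the same path repeatedly. Here is a minimal sketch of the splitting step on its own, without any GCS calls, that also removes duplicates. The helper names (top_two_levels, unique_folder_paths) and the sample object names are made up for illustration; it assumes the object names look like 20230130/abc/... with no bucket name in front.

```python
def top_two_levels(object_name):
    """Return 'first/second/' for an object name, or None if it is not nested that deep."""
    parts = object_name.split('/')
    # At least two slashes are needed for the first two parts to both be folders.
    if len(parts) < 3:
        return None
    return parts[0] + '/' + parts[1] + '/'


def unique_folder_paths(object_names):
    """Collect the distinct 'date/folder/' paths from an iterable of object names."""
    found = set()
    for name in object_names:
        path = top_two_levels(name)
        if path is not None:
            found.add(path)
    return sorted(found)


# Hypothetical object names, standing in for blob.name values from the bucket.
sample_names = [
    '20230130/abc/xyz/message1.eml',
    '20230130/abc/xyz/message2.eml',
    '20230131/def/xyz/message3.eml',
    '20230131/readme.txt',
]
print(unique_folder_paths(sample_names))
```

In the real function you could pass (blob.name for blob in a_bucket.list_blobs()) instead of sample_names.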