Tags: python, google-cloud-storage, airflow, blob

get size of blob storage GCP


I'm working with Python and Airflow (Cloud Composer).

I have many CSV files in a GCP bucket and I need to get the size of a particular file.

The file name is known ahead of time. I am using the list_blobs function, but I have to use a for loop to search for the file. Isn't there a function to get the information of a particular blob directly?

 from google.cloud import storage

 client = storage.Client()
 # bucket() expects the bare bucket name, without the gs:// prefix
 bucket = client.bucket('bucket_name')
 desired_file = kwargs['csv_name']

 for blob in bucket.list_blobs():
     if desired_file == blob.name and blob.size > 0:
         print("Name: " + blob.name + " Size blob obj: " + str(blob.size) + " bytes")

Solution

  • You can use bucket.get_blob('filename') to fetch a single blob directly instead of looping through bucket.list_blobs().

    from google.cloud import storage

    client = storage.Client()
    # bucket() expects the bare bucket name, without the gs:// prefix
    bucket = client.bucket('bucket_name')
    desired_file = kwargs['csv_name']

    # get_blob() fetches the object's metadata in a single request;
    # it returns None if the object does not exist
    blob = bucket.get_blob(desired_file)

    if blob is not None:
        print("Name: " + blob.name + " Size blob obj: " + str(blob.size) + " bytes")
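One detail worth watching: client.bucket() expects the bare bucket name, so passing the full 'gs://bucket_name' form (as in the question) makes every lookup fail. If the name arrives as a full gs:// URI, a small helper can split it first. This is a hypothetical helper in plain Python, not part of the google-cloud-storage API:

```python
def split_gcs_uri(uri):
    """Split a gs:// URI into (bucket_name, object_name).

    The storage client expects a bare bucket name, not the gs:// form,
    so a full URI has to be split before calling client.bucket().
    """
    if not uri.startswith("gs://"):
        raise ValueError("not a gs:// URI: " + uri)
    path = uri[len("gs://"):]
    bucket_name, _, object_name = path.partition("/")
    return bucket_name, object_name

print(split_gcs_uri("gs://bucket_name/data/file.csv"))
# → ('bucket_name', 'data/file.csv')
```

You could then call bucket = client.bucket(bucket_name) and blob = bucket.get_blob(object_name) with the two parts.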