amazon-web-services, amazon-s3, boto3

S3 bucket size for subset bucket names


How can I use a custom list of S3 bucket names from a local file? Iterating over every bucket sometimes takes too long for large buckets or for objects in different storage classes, and sometimes the listing doesn't show all of my S3 buckets.

with open('subsetbucketslist.txt') as f:
    allbuckets = f.read().splitlines()

How to use local file of buckets names as input?

By default it would list all buckets:

import boto3

s3 = boto3.resource('s3')

for mybucket in s3.buckets.all():
    # Sum the size of every object in the bucket
    # (listing all objects can be slow for large buckets)
    mybucket_size = sum(obj.size for obj in s3.Bucket(mybucket.name).objects.all())
    print(mybucket.name, mybucket_size)

Solution

  • If you want to calculate the size for particular buckets, then put those bucket names in your for loop:

    import boto3
    
    s3 = boto3.resource('s3')
    
    # Read one bucket name per line from the local file
    with open('subsetbucketslist.txt') as f:
        allbuckets = f.read().splitlines()
    
    for bucket_name in allbuckets:
        # Reuse the same resource rather than creating a new one per bucket
        mybucket_size = sum(obj.size for obj in s3.Bucket(bucket_name).objects.all())
        print(bucket_name, mybucket_size)
    

    It's also worth mentioning that Amazon CloudWatch keeps track of bucket sizes (BucketSizeBytes). See: Metrics and dimensions - Amazon Simple Storage Service
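
    The CloudWatch route avoids listing every object, which matters for large buckets. Below is a minimal sketch of querying the `BucketSizeBytes` metric with boto3; the function name is illustrative, and it assumes AWS credentials are configured. Note that CloudWatch reports this metric roughly once a day, so the code queries a two-day window and takes the most recent datapoint.

    ```python
    from datetime import datetime, timedelta, timezone
    
    def bucket_size_from_cloudwatch(bucket_name, storage_type="StandardStorage"):
        """Return the latest daily BucketSizeBytes datapoint for a bucket (0 if none)."""
        import boto3  # imported inside the helper so it can be defined without boto3 installed
    
        cloudwatch = boto3.client("cloudwatch")
        now = datetime.now(timezone.utc)
        response = cloudwatch.get_metric_statistics(
            Namespace="AWS/S3",
            MetricName="BucketSizeBytes",
            Dimensions=[
                {"Name": "BucketName", "Value": bucket_name},
                {"Name": "StorageType", "Value": storage_type},
            ],
            StartTime=now - timedelta(days=2),
            EndTime=now,
            Period=86400,            # one datapoint per day
            Statistics=["Average"],
        )
        datapoints = response["Datapoints"]
        if not datapoints:
            return 0
        # Datapoints are not guaranteed to be ordered; pick the newest one
        return max(datapoints, key=lambda d: d["Timestamp"])["Average"]
    ```

    Pass a different `StorageType` dimension (for example `StandardIAStorage`) to get per-storage-class sizes, which the object-listing approach would have to compute manually.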