amazon-web-services amazon-s3 aws-sdk

Quickly finding the size of an S3 'folder'


We have S3 'folders' (objects sharing a key prefix within a bucket) containing millions and millions of files, and we want to figure out the total size of each of these folders.

Writing my own .NET application to list the S3 objects was easy enough, but each request returns at most 1000 keys, so it's taking forever.

Using S3Browser to look at the properties of a 'folder' is taking a long time too, I'm guessing for the same reason.

I've had this .NET application running for a week - I need a better solution.

Is there a faster way to do this?


Solution

  • I don't think an ideal solution exists, but here are some ideas you can develop further:

    1. Is the app the only means by which files are written to S3? If so, you can store each file's size (in a database, a file or whatever) at write time and sum the sizes when necessary.
    2. Make concurrent calls to the LIST API. Each listing is paginated sequentially, so this means splitting the key space (for example by sub-prefix) and listing the parts in parallel.
    3. Can you switch from an organisation based on folders to one based on buckets? If so, you could query the billing API (yes, the billing) and calculate the size (or an approximation of it) from the cost.
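
To illustrate idea 2, here is a minimal sketch in Python rather than .NET. It assumes the folder's keys can be split into known, non-overlapping sub-prefixes. The names `folder_size`, `make_boto3_lister` and the `lister` callback are hypothetical helpers made up for this example; only the boto3 `list_objects_v2` paginator is a real API. Each worker thread pages through one sub-prefix and the per-prefix totals are summed at the end.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable

# A "lister" takes a prefix and yields the size in bytes of every object under it.
Lister = Callable[[str], Iterable[int]]


def folder_size(lister: Lister, sub_prefixes: Iterable[str], max_workers: int = 16) -> int:
    """Sum object sizes across sub-prefixes, listing each one in its own thread."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each task consumes one sub-prefix's listing inside a worker thread.
        return sum(pool.map(lambda p: sum(lister(p)), sub_prefixes))


def make_boto3_lister(bucket: str) -> Lister:
    """Build a lister backed by S3 (assumes boto3 is installed and credentials are set)."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    paginator = boto3.client("s3").get_paginator("list_objects_v2")

    def lister(prefix: str) -> Iterable[int]:
        # Each page holds at most 1000 keys; the paginator follows continuation tokens.
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            for obj in page.get("Contents", []):
                yield obj["Size"]

    return lister


if __name__ == "__main__":
    # In-memory stand-in for S3 so the sketch runs without credentials.
    fake_sizes = {"logs/a": [1, 2, 3], "logs/b": [10], "logs/c": []}
    print(folder_size(lambda p: fake_sizes[p], fake_sizes))
```

Against a real bucket the call might look like `folder_size(make_boto3_lister("my-bucket"), [f"photos/{c}" for c in "0123456789abcdef"])`, where the bucket name and the hex-style sub-prefixes are placeholders. The sub-prefixes must together cover every key in the folder exactly once, otherwise the total will be too low or double-counted. The speed-up depends on how evenly the keys are spread across the sub-prefixes.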