amazon-web-services amazon-s3 aws-sdk nodes

(using aws-sdk) How to find the total size of a folder stored in Amazon S3


I want to know the total size of a folder stored in S3, using the AWS SDK.

Note:

I don't want to use a CLI command or the AWS console to find the size of my folder. I want to do this with aws-sdk, as stated above, so please don't mark this as a duplicate.

So far, what I have found on the internet is to list all the objects in the folder and iterate through them. I do this and it works fine. Here is my code:

import AWS from 'aws-sdk';

AWS.config.region = "BUCKET_REGION";
AWS.config.credentials = new AWS.CognitoIdentityCredentials({
    IdentityPoolId: "COGNITO_ID",
});

const bucketName = "BUCKET_NAME";
const bucket = new AWS.S3({
    params: {
        Bucket: bucketName
    }
});

bucket.listObjects({ Prefix: "FOLDER_NAME", Bucket: bucketName }, function (err, data) {
    if (err) {
        console.log(err);
    } else {
        console.log(data);
        // data.Contents is the array I iterate through, summing each object's Size
        // (a single listObjects call returns at most 1,000 keys)
    }
});

The problem is that at some point my folder contains so many objects that iterating over every element in the list becomes impractical. It takes too much time just to calculate the size of the folder.

So I need a better way to calculate the size of the folder, and all I have found is this command:

aws s3 ls s3://myBucket/level1/level2/ --recursive --summarize | awk 'BEGIN{ FS= " "} /Total Size/ {print $3}'

Is there any way I can do the above through aws-sdk?

Any kind of help is appreciated. Thanks in advance.


Solution

  • It appears that your situation is:

    • You want to know the size of an Amazon S3 bucket on a regular basis
    • The bucket contains a large number of objects, so listing them takes too much time

    Rather than listing objects and calculating sizes, I would recommend two alternatives:

    Amazon S3 Inventory

    Amazon S3 Inventory can provide a daily (or weekly) CSV file with details of all objects in a bucket. You could then take this data and calculate the total for your folder.
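As a rough illustration of the "calculate the total" step, the sketch below sums the Size column of an inventory report for one key prefix. It assumes the report has already been downloaded and decompressed into a string, that the inventory was configured to include the Size field, and that the column order matches the `fileSchema` entry in the report's manifest.json. The function name `sumInventorySize` and the default column list are made up for this example.

```javascript
// Sum the Size column of a decompressed S3 Inventory CSV for one key prefix.
// Assumptions (not from the original post): `columns` mirrors the "fileSchema"
// value in manifest.json, and the report includes the Size field.
function sumInventorySize(csvText, prefix, columns = ['Bucket', 'Key', 'Size']) {
  const keyIndex = columns.indexOf('Key');
  const sizeIndex = columns.indexOf('Size');
  if (keyIndex === -1 || sizeIndex === -1) {
    throw new Error('columns must include Key and Size');
  }

  let total = 0;
  for (const rawLine of csvText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '') continue;

    // Rows look like "bucket","key","size": drop the outer quotes, then split.
    const fields = line.replace(/^"|"$/g, '').split('","');

    // Keys are URL-encoded in CSV inventory files; decodeURIComponent is a
    // simplification that is good enough for prefix matching in this sketch.
    const key = decodeURIComponent(fields[keyIndex]);
    if (key.startsWith(prefix)) {
      total += Number(fields[sizeIndex]) || 0; // empty or non-numeric size counts as 0
    }
  }
  return total;
}
```

Calling `sumInventorySize(reportText, 'level1/level2/')` would then return the folder size in bytes as of the time the report was generated, not the current moment.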

    Amazon CloudWatch bucket metrics

    Amazon CloudWatch has several metrics related to Amazon S3 buckets:

    • BucketSizeBytes
    • NumberOfObjects

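    Those metrics are not instant (S3 reports its storage metrics to CloudWatch once per day) and they cover the whole bucket rather than a single prefix, but BucketSizeBytes seems like it would be ideal if the bucket-level figure is enough for you.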
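A minimal sketch of reading that metric with aws-sdk v2 follows. It takes an `AWS.CloudWatch` client as an argument, which keeps the function easy to try out with a stub. The function name `getBucketSizeBytes`, the three-day lookback window, and the default `StandardStorage` storage type are assumptions for illustration. The credentials in use (for example the Cognito role from the question) would also need permission to call `cloudwatch:GetMetricStatistics`.

```javascript
// Read the most recent BucketSizeBytes datapoint for a bucket.
// `cloudwatch` is expected to be an aws-sdk v2 client: new AWS.CloudWatch().
// Assumption: the objects use the STANDARD storage class ("StandardStorage").
async function getBucketSizeBytes(cloudwatch, bucketName, storageType = 'StandardStorage') {
  const now = new Date();
  // Look back a few days so at least one daily datapoint falls in the window.
  const start = new Date(now.getTime() - 3 * 24 * 60 * 60 * 1000);

  const data = await cloudwatch.getMetricStatistics({
    Namespace: 'AWS/S3',
    MetricName: 'BucketSizeBytes',
    Dimensions: [
      { Name: 'BucketName', Value: bucketName },
      { Name: 'StorageType', Value: storageType },
    ],
    StartTime: start,
    EndTime: now,
    Period: 86400, // one day, in seconds
    Statistics: ['Average'],
  }).promise();

  const points = data.Datapoints || [];
  if (points.length === 0) return null; // no datapoint reported yet

  // Datapoints are not guaranteed to be ordered, so pick the newest explicitly.
  const newest = [...points].sort(
    (a, b) => new Date(b.Timestamp) - new Date(a.Timestamp)
  )[0];
  return newest.Average;
}
```

With the setup from the question, usage would look something like `getBucketSizeBytes(new AWS.CloudWatch(), "BUCKET_NAME").then(console.log)`.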

    If all else fails...

    If the above two options do not meet your needs (e.g., you need to know the metrics "right now"), the remaining option would be to maintain your own database of objects. The database would need to be updated whenever an object is added to or removed from the bucket (which can be done by using Amazon S3 Events to trigger an AWS Lambda function). You could then consult your own database to have the information available rather quickly.
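The bookkeeping for that last option could be sketched as below. A `Map` stands in for the real database purely for illustration; in practice this would be a persistent store, since Lambda memory does not survive between invocations. The names `applyS3Event` and `folderSize` are hypothetical. One detail worth noting is that removal events do not carry the object's size, which is why the size is recorded at creation time.

```javascript
// In-memory stand-in for a real database table (hypothetical; a persistent
// store would be needed in an actual Lambda function).
const sizesByKey = new Map();

// Apply one S3 event notification payload to the store.
function applyS3Event(event, store = sizesByKey) {
  for (const record of event.Records || []) {
    // Keys in S3 event notifications are URL-encoded, with "+" for spaces.
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));

    if (record.eventName.startsWith('ObjectCreated:')) {
      // Overwrites replace the previous size for the same key.
      store.set(key, record.s3.object.size || 0);
    } else if (record.eventName.startsWith('ObjectRemoved:')) {
      // Removal events carry no size, so rely on what was stored earlier.
      store.delete(key);
    }
  }
}

// Total size in bytes of all tracked objects under a prefix ("folder").
function folderSize(prefix, store = sizesByKey) {
  let total = 0;
  for (const [key, size] of store) {
    if (key.startsWith(prefix)) total += size;
  }
  return total;
}
```

A Lambda handler subscribed to the bucket's events would call `applyS3Event(event)` on each invocation, and the application would call `folderSize('level1/level2/')` when it needs the figure. A real table could also keep a running total per prefix to avoid scanning every key.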