Tags: amazon-web-services, amazon-s3, gzip, brotli, zstd

Compress billions of files in S3 bucket


We have a large number of objects in S3 (more than a billion), and I'd like to compress them to reduce storage costs. What would be a simple and efficient way to do this?

Thank you

Alex


Solution

  • Amazon S3 cannot compress your data; it stores objects exactly as they are uploaded.

    You would need to write a program to run on an Amazon EC2 instance that would:

    • Download the objects
    • Compress them
    • Upload the compressed versions back to S3 (and delete the originals)
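The download-compress-upload loop above could be sketched like this with boto3 (the AWS SDK for Python). This is a minimal sketch, not a production tool: the bucket name and prefix are placeholders, gzip is chosen as the codec, there is no error handling or parallelism, and at this object count you would batch the work (e.g. with S3 Batch Operations or many workers) rather than run a single loop.

```python
import gzip


def gzip_bytes(data: bytes) -> bytes:
    """Compress raw object bytes with gzip (stdlib)."""
    return gzip.compress(data)


def recompress_prefix(bucket: str, prefix: str) -> None:
    """Download each object under `prefix`, gzip it, upload it back
    under a `.gz` key, then delete the original.

    Sketch only: no retries, no concurrency, placeholder names.
    """
    import boto3  # AWS SDK for Python; imported lazily so the
    # gzip helper above works without the SDK installed

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith(".gz"):
                continue  # skip objects that are already compressed
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            s3.put_object(
                Bucket=bucket,
                Key=key + ".gz",
                Body=gzip_bytes(body),
                ContentEncoding="gzip",
            )
            s3.delete_object(Bucket=bucket, Key=key)  # remove the original
```

Note that renaming keys to `.gz` changes how clients must read the data; consumers need to decompress on download (or rely on the `Content-Encoding` header).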

    An alternative is to use Storage Classes:

    • If the data is infrequently accessed, use S3 Standard - Infrequent Access: objects remain immediately available, and storage is cheaper as long as each object is accessed less than roughly once per month
    • Glacier is substantially cheaper to store but takes some time to restore (the faster the restore, the higher the cost)
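The storage-class alternative above needs no rewriting of objects at all: a lifecycle rule can transition them to a cheaper class automatically. A sketch with boto3, where the rule ID, day thresholds, and empty prefix (all objects) are illustrative assumptions, not recommendations:

```python
# Example lifecycle configuration: move objects to Standard-IA after
# 30 days and to Glacier after 90 (thresholds are illustrative).
LIFECYCLE_RULES = {
    "Rules": [
        {
            "ID": "tier-down",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix = every object
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}


def apply_lifecycle(bucket: str) -> None:
    """Attach the transition rules above to `bucket`."""
    import boto3  # AWS SDK for Python; imported lazily

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=LIFECYCLE_RULES
    )
```

Unlike the recompression approach, this keeps objects byte-for-byte identical and requires no compute, but it reduces the per-GB price rather than the stored size.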