amazon-web-services amazon-s3 static-site

`aws s3 cp` vs `aws s3 sync` behavior and cost


I have a static site that I deploy to S3 and distribute to users through CloudFront. After I build the site, I want to push the new build to S3. I found two approaches:

  • aws s3 cp --recursive ./public/ s3://bucket-name --cache-control 'public, max-age=300, s-maxage=31536000'

  • aws s3 sync --delete ./public/ s3://bucket-name --cache-control 'public, max-age=300, s-maxage=31536000'

I am planning to deploy once or twice every week.

I want to know which of these is less expensive (in money). To be clear, which one will cost me less in the long run?

I tried reading the docs, but I was not able to figure out the differences. Please help me with this.


Solution

  • One thing to note is that aws s3 cp --recursive and aws s3 sync --delete have different behaviors.

    aws s3 cp copies all files, even if they already exist in the destination. It also does not delete files from the destination when they are deleted from the source.

    aws s3 sync checks the destination before copying and uploads only files that are new or changed (by default it compares file size and last-modified time). With the --delete flag, it also removes files from the destination that no longer exist in the source.

    The sync command is what you want: it is designed to keep two folders in sync while copying the minimum amount of data. Sync should push less data into the S3 bucket, and therefore make fewer upload requests, so it should cost less overall.

    As a counterexample, aws s3 cp is faster and cheaper than sync when you only need to transfer files and you know that all of them are new to the destination. In that case cp wins because it does not check the destination for existing files before starting the transfer.

    Additional note: if your connection is interrupted in the middle of a cp operation and you run cp again, the work already done is essentially wasted, because every file is copied again. If a sync is interrupted and you run it again, it skips the files that were already transferred, so it most likely costs less and wastes less.
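
To see the difference on your own build before committing to either command, both accept the --dryrun flag, which prints the planned operations without uploading or deleting anything. The following is a minimal sketch, not a definitive deploy script: the function names (preview_cp, preview_sync, count_uploads) are made up for illustration, and BUILD_DIR and BUCKET reuse the placeholder values from the question.

```shell
#!/usr/bin/env bash
# Sketch: preview what each command would do, without changing the bucket.
# BUILD_DIR and BUCKET are placeholders taken from the question; adjust them.
BUILD_DIR="./public/"
BUCKET="s3://bucket-name"
CACHE_CONTROL='public, max-age=300, s-maxage=31536000'

# --dryrun prints the planned operations instead of performing them.
preview_cp() {
  aws s3 cp --recursive --dryrun "$BUILD_DIR" "$BUCKET" \
    --cache-control "$CACHE_CONTROL"
}

preview_sync() {
  aws s3 sync --delete --dryrun "$BUILD_DIR" "$BUCKET" \
    --cache-control "$CACHE_CONTROL"
}

# Count the planned uploads read from stdin (lines containing "upload:").
# grep exits non-zero when the count is 0, so "|| true" keeps that harmless.
count_uploads() {
  grep -c 'upload:' || true
}
```

After sourcing the file, `preview_cp | count_uploads` and `preview_sync | count_uploads` print the number of files each command would upload. If the two numbers are the same after a typical rebuild, your build tool is probably rewriting every file, and sync saves little over cp for that deploy.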