
How to get better compression performance from external files for Amazon S3?

The 3 entries below are from a report. How should I handle compression performance for these files for Amazon S3? I know how to do gzip for S3. But the three files below present a more restrictive situation.

I don't have access to Mailchimp's CSS file. Is there some way to get better compression performance in this case?

I periodically update my Thesis theme, which will change the css.css file shown below. I can't version that file since I need to use the name css.css. Is there some technique to handle this scenario?

Compressing could save 20.5KiB (79% reduction)

Compressing could save 1.1KiB (60% reduction)

Compressing could save 374B (48% reduction)


  • Yeah, this is a pretty common question. If you serve static files from a traditional HTTP daemon like Apache, the content can be compressed on the fly via mod_deflate: it transparently gzips the file and sets the appropriate Content-Encoding header.

    If you want to do this off of S3, you have to manually gzip the files before uploading them (normally named something like cool-stylesheet.gz.css) and then set the Content-Encoding metadata property on the S3 object to gzip.

    This can be tedious to do by hand, so we actually do it automatically as part of our continuous integration build process. A post-commit hook in our source control fires, executing several build steps (including this one), and then the resulting files are deployed to the proper environment.
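    As a rough illustration of that build step, here is a minimal Python sketch. It is not our actual build script: the function names gzip_for_s3 and upload are made up for this example, and the upload part assumes the boto3 library is installed with AWS credentials already configured.

```python
import gzip
import mimetypes
from pathlib import Path


def gzip_for_s3(path):
    """Return (gzipped bytes, put_object keyword arguments) for a local file."""
    raw = Path(path).read_bytes()
    # mtime=0 keeps the output byte-identical across builds for unchanged input.
    body = gzip.compress(raw, compresslevel=9, mtime=0)
    content_type = mimetypes.guess_type(str(path))[0] or "application/octet-stream"
    # Content-Encoding tells the browser to gunzip the body before using it.
    headers = {"ContentEncoding": "gzip", "ContentType": content_type}
    return body, headers


def upload(path, bucket, key):
    """Upload the gzipped file to S3 (assumes boto3 and AWS credentials)."""
    import boto3  # imported lazily so gzip_for_s3 works without boto3

    body, headers = gzip_for_s3(path)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body, **headers)
```

    A call such as upload("cool-stylesheet.css", "my-bucket", "cool-stylesheet.gz.css") would store the compressed copy under the .gz.css name; the bucket name here is a placeholder.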


    It seems that you meant to describe a problem with CloudFront, not S3. Since CloudFront is a CDN and it caches files at its edge locations, you have to force it to refetch the latest version of a file when it changes. There are two ways to do this: invalidate the cache or use filename versioning.

    Invalidating the cache is slow and can get really expensive. After the first 1,000 invalidation requests per month, it costs a nickel for every 10 files invalidated thereafter.

    A better option is to version the filenames by appending a unique identifier before they are pulled into CloudFront. We typically use the Unix epoch of when the file was last updated, so cool-stylesheet.gz.css becomes cool-stylesheet_1363872846.gz.css. In the HTML document you then reference it like normal: <link rel="stylesheet" type="text/css" href="cool-stylesheet_1363872846.gz.css">. This will cause CloudFront to fetch the updated file from your origin when a user opens that updated HTML document, because it has never seen the new filename before.

    As I mentioned above regarding S3, this too is a tedious thing to do manually: you'd have to rename all of your files and search/replace all references to them in the source HTML documents. It makes more sense to make it part of your CI build process. If you're not using a CI server though, you might be able to do this with a commit hook in your source repository.
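
    The rename and search/replace steps could look roughly like the sketch below. The helper names versioned_name and rewrite_references are hypothetical, and the plain string replacement assumes the old filename appears in the HTML only as a reference to that file.

```python
import os
from pathlib import Path


def versioned_name(path, mtime=None):
    """Return the file name with its last-modified Unix epoch inserted
    before the first dot, e.g. name.gz.css -> name_<epoch>.gz.css."""
    p = Path(path)
    if mtime is None:
        # Fall back to the file's modification time on disk.
        mtime = os.path.getmtime(p)
    stem, dot, rest = p.name.partition(".")
    return f"{stem}_{int(mtime)}{dot}{rest}"


def rewrite_references(html, old_name, new_name):
    """Point every reference to old_name in an HTML string at new_name."""
    return html.replace(old_name, new_name)


new_name = versioned_name("cool-stylesheet.gz.css", mtime=1363872846)
html = '<link rel="stylesheet" type="text/css" href="cool-stylesheet.gz.css">'
print(rewrite_references(html, "cool-stylesheet.gz.css", new_name))
```

    Note that versioned_name returns only the file name, not the directory part, so a build script would still need to copy or rename the file on disk under the new name before uploading it.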