Tags: wordpress, amazon-web-services, amazon-s3, gzip, amazon-cloudfront

How to get better compression performance from external files for Amazon S3?


The three entries below are from a gtmetrix.com report. How should I handle compression for these files on Amazon S3? I know how to gzip files for S3, but the three files below present a more restrictive situation.

I don't have access to MailChimp's CSS file. Is there some way to get better compression performance in this case?

I periodically update my Thesis theme, which will change the css.css file shown below. I can't version that file since I need to use the name css.css. Is there some technique to handle this scenario?

Compressing http://www.mysite.com/wp-content/thesis/skins/classic/css.css could save 20.5KiB (79% reduction)

Compressing http://cdn-images.mailchimp.com/embedcode/slim-041711.css could save 1.1KiB (60% reduction)

Compressing http://www.mysite.com/wp-includes/js/comment-reply.min.js?ver=3.5.1 could save 374B (48% reduction)


Solution

  • Yeah, this is a pretty common question. If you serve static files from a traditional HTTP daemon like Apache, the content is compressed on the fly via mod_deflate, which transparently gzips the file and sets the appropriate Content-Encoding header.

    If you want to do this when serving from S3, you have to gzip the files manually before uploading them (conventionally named something like cool-stylesheet.gz.css) and then set a custom Content-Encoding property on the S3 object, like this:

    (Screenshot: setting the Content-Encoding property to gzip on the object in the S3 console.)
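
    If you'd rather script this than click through the console, here is a minimal sketch of the same idea in Python with boto3 (this isn't the build step described below, just an illustration; the bucket name is a placeholder):

        import gzip
        import boto3

        s3 = boto3.client("s3")

        # Gzip the stylesheet locally before uploading.
        with open("css.css", "rb") as f:
            body = gzip.compress(f.read())

        # Upload with Content-Encoding set so browsers decompress it transparently.
        s3.put_object(
            Bucket="my-bucket",  # placeholder bucket name
            Key="wp-content/thesis/skins/classic/css.css",
            Body=body,
            ContentEncoding="gzip",
            ContentType="text/css",
        )

    The comment-reply.min.js file would go up the same way, just with ContentType="application/javascript".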

    This can be tedious to do by hand, so we actually do it automatically as part of our continuous integration build process. A post-commit hook in our source control fires, executing several build steps (including this one), and then the resulting files are deployed to the proper environment.

    Edit:

    It seems that you meant to describe a problem with CloudFront, not S3. Since CloudFront is a CDN and caches files at its edge locations, you have to force it to refetch the latest version of a file when it changes. There are two ways to do this: invalidate the cache or use filename versioning.

    Invalidating the cache is slow and can get really expensive. After the first 1,000 invalidation requests in a month, it costs a nickel for every 10 files invalidated.
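
    If you do go the invalidation route, it can at least be scripted; here is a rough boto3 sketch, where the distribution ID is a placeholder:

        import time
        import boto3

        cloudfront = boto3.client("cloudfront")

        # Ask CloudFront to drop its cached copy of the stylesheet at every edge location.
        cloudfront.create_invalidation(
            DistributionId="E1234EXAMPLE",  # placeholder distribution ID
            InvalidationBatch={
                "Paths": {
                    "Quantity": 1,
                    "Items": ["/wp-content/thesis/skins/classic/css.css"],
                },
                "CallerReference": str(int(time.time())),  # must be unique per request
            },
        )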

    A better option is to version the filenames by appending a unique identifier before they are pulled into CloudFront. We typically use the Unix timestamp of when the file was last updated, so cool-stylesheet.gz.css becomes cool-stylesheet_1363872846.gz.css. In the HTML document you then reference it as normal: <link rel="stylesheet" type="text/css" href="cool-stylesheet_1363872846.gz.css">. This will cause CloudFront to refetch the updated file from your origin when a user opens the updated HTML document.
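
    Here is a rough sketch of that renaming step in Python; the helper names and file paths are just illustrative, not part of any particular build tool:

        import os
        import shutil

        def versioned_copy(path):
            """Copy cool-stylesheet.gz.css to cool-stylesheet_<mtime>.gz.css and return the new name."""
            mtime = int(os.path.getmtime(path))   # Unix timestamp of the last update
            directory, filename = os.path.split(path)
            stem, rest = filename.split(".", 1)   # "cool-stylesheet", "gz.css"
            new_name = f"{stem}_{mtime}.{rest}"
            shutil.copy(path, os.path.join(directory, new_name))
            return new_name

        def rewrite_references(html_path, old_name, new_name):
            """Point references in an HTML document at the versioned filename."""
            with open(html_path) as f:
                html = f.read()
            with open(html_path, "w") as f:
                f.write(html.replace(old_name, new_name))

        new_name = versioned_copy("cool-stylesheet.gz.css")
        rewrite_references("index.html", "cool-stylesheet.gz.css", new_name)

    Note that only the uploaded copy is renamed, so a file like css.css can keep its original name on disk while still being versioned from CloudFront's point of view.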

    As I mentioned above regarding S3, this too is a tedious thing to do manually: you'd have to rename all of your files and search/replace all references to them in the source HTML documents. It makes more sense to make it part of your CI build process. If you're not using a CI server, though, you might be able to do this with a commit hook in your source repository.