
Why does Google PageSpeed ask for an ETag even when cache headers are set?


I have set cache headers far in the future (one year from now) and disabled ETags as advised by YSlow (http://developer.yahoo.com/performance/rules.html#etags), but Google PageSpeed still seems to require an ETag (or Last-Modified) even though the cache headers are set.

"It is important to specify one of Expires or Cache-Control max-age, and one of Last-Modified or ETag, for all cacheable resources."

The two rules seem to conflict with each other.


Solution

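  • YSlow does not advise removing ETags in general, only in some environments (its rule targets setups such as server clusters, where the default ETags can differ from one machine to the next). If you do not use ETags, you should use Last-Modified instead.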

    ETag and Last-Modified are used for conditional GET requests, when the browser re-requests a resource that is already cached and may have expired.

    Cache-Control max-age defines how long a cached item can be treated as valid without asking the server again. (Once that period has expired, the browser makes a conditional GET.)
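To make the two roles concrete, here is a minimal sketch of a response that carries both a freshness header and validators, which is the combination the PageSpeed rule asks for. The function name `caching_headers` and the choice of a truncated SHA-256 digest as the ETag are assumptions made for illustration; a real server would normally generate these headers for you.

```python
import hashlib
from email.utils import formatdate

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000


def caching_headers(body: bytes, mtime: float) -> dict:
    """Return freshness and validator headers for a cacheable resource.

    Hypothetical helper: the ETag here is a truncated content hash,
    which is one possible choice, not a required format.
    """
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {
        # Freshness: how long the copy may be reused without asking again.
        "Cache-Control": f"max-age={ONE_YEAR_SECONDS}",
        # Validators: used later in a conditional GET.
        "ETag": etag,
        "Last-Modified": formatdate(mtime, usegmt=True),
    }
```

Because the ETag is derived from the content rather than from anything machine-specific, every server in a cluster would produce the same value for the same file.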

    So in your case:

    • The browser caches the resource for one year. Within that year it makes no request for the resource at all; it is served directly from the local cache. (This uses the Cache-Control header settings.)
    • After the year has expired, the browser makes a conditional request to check whether anything has changed. If nothing has changed, the server responds with HTTP 304 and an empty body, and the browser keeps using its cached item without any retransmission. (This uses the ETag and/or Last-Modified header settings.)

    (The browser may or may not honor these settings. For example, it may make a conditional request even before the year has expired.)
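The server side of that conditional request could look roughly like the sketch below. It is a simplification under stated assumptions: the function `respond` is hypothetical, it compares a single ETag by exact string match, and it ignores the cases a real server must handle (several ETags in one `If-None-Match` header, weak validators, `If-Modified-Since`).

```python
from typing import Dict, Tuple


def respond(
    request_headers: Dict[str, str], body: bytes, etag: str
) -> Tuple[int, Dict[str, str], bytes]:
    """Answer a GET, returning 304 when the client's cached copy is current."""
    headers = {"ETag": etag, "Cache-Control": "max-age=31536000"}
    # The browser sends back the ETag it stored with its cached copy.
    if request_headers.get("If-None-Match") == etag:
        # Nothing changed: empty body, the browser reuses its cache.
        return 304, headers, b""
    # First request, or the resource changed: send the full body.
    return 200, headers, body
```

A request without `If-None-Match` gets the full 200 response; a request carrying the current ETag gets a 304 with an empty body.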

    For highly optimized sites, Cache-Control is far more important: you set far-future expiry headers and simply change the URL of the resource whenever it changes. While this prevents the use of conditional requests, it lets you be extremely aggressive with the expiry headers and still serve new versions of the resource immediately to everybody at the same time, because under its new URL it looks like a new resource from the browser's point of view.
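One common way to change the URL when the content changes is to embed a content hash in the file name. The sketch below assumes that approach; the helper name `versioned_url`, the 8-character digest length, and the example path are all illustrative choices, not something prescribed by YSlow or PageSpeed.

```python
import hashlib
import posixpath


def versioned_url(path: str, content: bytes) -> str:
    """Insert a short content hash before the file extension.

    Hypothetical helper: any scheme works as long as the URL changes
    whenever the content changes.
    """
    digest = hashlib.sha256(content).hexdigest()[:8]
    root, ext = posixpath.splitext(path)
    return f"{root}.{digest}{ext}"


# The same content always maps to the same URL; edited content gets a new one.
old_url = versioned_url("/static/app.css", b"body { color: red; }")
new_url = versioned_url("/static/app.css", b"body { color: blue; }")
```

The pages that reference the resource must be generated with the new URL, and those pages themselves should not be cached for a year, otherwise browsers would never learn about the new name.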

    For Java there is a framework called jawr that applies these and other concepts without a negative impact on your site development.