For testing purposes, I have this in my Apache configuration:
<Directory "/home/http">
...
<FilesMatch "\.(html|htm)$">
Header unset Etag
Header set Cache-control "max-age=0, no-cache"
</FilesMatch>
<FilesMatch "\.(jpg|jpeg|gif|png|js|css)$">
Header unset Etag
Header set Cache-control "public, max-age=10"
</FilesMatch>
</Directory>
This gives static assets (images, JS, CSS) a cache lifetime of 10 seconds. Again, this is for testing and demonstration purposes only.
I test it by requesting the file directly:
$ wget -O - --save-headers localhost/mod_pagespeed_example/images/Puzzle.jpg
Cache-control: public, max-age=10
That works fine. But then I load the page with mod_pagespeed and the extend_cache filter enabled:
$ wget -O - --save-headers 'localhost/mod_pagespeed_example/extend_cache.html?ModPagespeed=on&ModPagespeedFilters=extend_cache'
<img src="images/Puzzle.jpg"/>
$ wget -O - --save-headers 'localhost/mod_pagespeed_example/extend_cache.html?ModPagespeed=on&ModPagespeedFilters=extend_cache'
<img src="http://localhost/mod_pagespeed_example/images/xPuzzle.jpg.pagespeed.ic.hgbHsZe0IN.jpg"/>
This is all as expected. The first request is not rewritten because mod_pagespeed has not yet loaded the image into its cache; from the second request on, it correctly replaces the src of the img tag with the cache-extended, hashed URL.
However, this only persists until max-age expires. With max-age set to 10 seconds, the page keeps pointing to http://localhost/mod_pagespeed_example/images/xPuzzle.jpg.pagespeed.ic.hgbHsZe0IN.jpg for 10 seconds, then reverts to images/Puzzle.jpg, and only on a subsequent request goes back to the rewritten URL.
Is this expected behavior? I would think that pagespeed would check the hash after max-age, and if it's the same it wouldn't bother changing it back to the original value, but instead continue serving the cached file.
This is somewhat concerning. If I set max-age to something more useful, say 60 minutes, I can keep updating these asset files and ensure that my updates are seen in a timely manner. However, if a user visits the site only once per day, the interval between visits exceeds max-age, so they will always be served the original URL rather than the cache-extended one.
This is expected behavior. As you mentioned, the reason is that the resource has expired in our cache, so we need to re-check it to make sure it is still the same. We do not want to block the user's request while we check all the sub-resources, so that request is served with the original URLs.
Note that one solution is to use ModPagespeedLoadFromFile. It reads the resource from disk and checks the file's last-modified time, so it can validate the resource even after it has expired in cache.
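For the setup in the question, the directive might look roughly like the sketch below. It maps a URL prefix to a directory on disk. The filesystem path is an assumption: it presumes /home/http is the DocumentRoot behind http://localhost/, which the question does not state, so adjust both arguments to match your layout.

```apache
# Assumption: /home/http is the DocumentRoot serving http://localhost/.
# First argument: URL prefix; second argument: the directory it maps to on disk.
# Both should end with a trailing slash.
ModPagespeedLoadFromFile "http://localhost/mod_pagespeed_example/" "/home/http/mod_pagespeed_example/"
```

With a mapping like this, resources under that prefix are read from the filesystem instead of being fetched over HTTP, so the Cache-control max-age set in the FilesMatch block no longer determines when they have to be re-checked.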