google-app-engine caching web checksum blobstore

Web-caches modifying my CSS, JS and HTML files


We have a smartphone app that downloads blobs from Google's Blobstore and verifies their checksums. The blobs are immutable until they are replaced by a new version with a new filename, which makes them ideal for caching.

BUT: In rare circumstances (3 times in one month so far) the blobs lost some bytes. I inspected some of the bytes in a hex view, and it appears that our precompressed JS and other files are piped through a page-speed processor before being cached (the size reduction was the same on each failure). The files are not damaged, but they can no longer be verified by size or checksum. Our compressor leaves 8 more newlines than its evil twin somewhere on the web.
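The check described above (bytes differ, but only by whitespace) can be sketched roughly as follows. This is a minimal illustration rather than the code we actually used: the function names and the sample byte strings are hypothetical, and it treats only space, tab, CR and LF as whitespace.

```python
import hashlib

# Bytes treated as whitespace in this sketch (an assumption, not a spec).
WHITESPACE = b" \t\r\n"


def strip_whitespace(data: bytes) -> bytes:
    """Return data with every whitespace byte removed."""
    return data.translate(None, WHITESPACE)


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest, as a stand-in for whatever checksum the app uses."""
    return hashlib.sha256(data).hexdigest()


def differs_only_in_whitespace(original: bytes, received: bytes) -> bool:
    """True if the two payloads differ, but only in whitespace bytes."""
    if original == received:
        return False
    return strip_whitespace(original) == strip_whitespace(received)


# Hypothetical sample data: the received copy lost two newlines.
original = b"a = 1;\n\n\nb = 2;\n"
received = b"a = 1;\nb = 2;\n"

print(len(original) - len(received), "bytes missing")
print("checksums equal:", sha256_hex(original) == sha256_hex(received))
print("whitespace-only change:", differs_only_in_whitespace(original, received))
```

This is only a heuristic: a proxy that also removed whitespace inside string literals would pass the same check even though it changed the file's meaning.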

Only one request for each file made it to our server logs, even when we tried multiple times.

So far I have not found any cache spec that allows modifying files that are supposed to be cached. Does anyone have information about such strange behavior?

Is it required to send no-cache headers in order to checksum an HTML, JS or CSS file? We had no problems with mp3 and jpg content.


Solution

  • I have found the cause: at least one mobile phone company deploys a page-speed compressor in its mobile network. T-Mobile Austria (and probably T-Mobile in Germany as well) modifies downloaded files by removing newline characters and other whitespace. The error reproduced reliably on their network, but not on WiFi at the same time. The carrier also ignores our no-cache headers; the only fix would be to use HTTPS!

    However, there were other occurrences that did not involve a mobile carrier, although mobile carriers account for most of our problems.
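One way to act on this in the client is to verify each download by size and checksum and to retry over HTTPS when a plain-HTTP copy fails verification. The following is a minimal sketch under stated assumptions: the `fetch` callable, the example.com URLs and the use of SHA-256 are all hypothetical and stand in for whatever the app really uses.

```python
import hashlib
from typing import Callable, Iterable, Optional


def matches(data: bytes, expected_size: int, expected_sha256: str) -> bool:
    """Check a payload against the size and SHA-256 digest we expect."""
    return (
        len(data) == expected_size
        and hashlib.sha256(data).hexdigest() == expected_sha256
    )


def fetch_verified(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    expected_size: int,
    expected_sha256: str,
) -> Optional[bytes]:
    """Try each URL in order and return the first payload that verifies.

    `fetch` is a hypothetical downloader supplied by the caller; it is
    injected so the sketch does not depend on any particular HTTP library.
    """
    for url in urls:
        data = fetch(url)
        if matches(data, expected_size, expected_sha256):
            return data
    return None


# Simulated network: the HTTP copy was rewritten in transit, HTTPS was not.
original = b"a = 1;\n\n\nb = 2;\n"
fake_responses = {
    "http://example.com/app.js": b"a = 1;\nb = 2;\n",
    "https://example.com/app.js": original,
}

result = fetch_verified(
    ["http://example.com/app.js", "https://example.com/app.js"],
    fake_responses.__getitem__,
    expected_size=len(original),
    expected_sha256=hashlib.sha256(original).hexdigest(),
)
print("verified copy obtained:", result == original)
```

Trying plain HTTP first keeps shared caches useful where they behave correctly; if the modifications are frequent, it is simpler to request everything over HTTPS from the start.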