
How to cache locally in browser when s-maxage > max-age


In short:

Let’s say s-maxage is one day and max-age is one hour. The proxy cache will keep a resource for a day, but after a few hours the Age header will be more than one hour. The browser then sees that the resource is older than one hour and won’t reuse it from its local cache. How can I make the browser cache it locally regardless?


I'm trying to combine Cache-Control: max-age and s-maxage with sensible values. But setting s-maxage > max-age doesn't seem to make sense. Eventually the browser will always revalidate resources and will skip local browser cache, because resources received from proxy cache will immediately be stale (Age > max-age).

Goals:

  • Reduce load on origin: set long TTL at edge cache (s-maxage), this can be purged when necessary.
  • Speed up browser session: set small cache on browser (max-age).

Problem:

A resource stays in proxy cache for a long time (s-maxage), while its Age increases (time since fetch from origin). Eventually the Age of the resource in cache will be larger than max-age. When that happens the browser will revalidate every time it needs a resource, since the resource is stale on every request.

For example: Cache-Control: max-age=60, s-maxage=86400. The browser should keep a resource for 60 seconds. The proxy cache keeps a resource for a day.
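To make the two lifetimes concrete, here is a minimal sketch of how a cache might pick its freshness lifetime from that header. The function names `parse_cache_control` and `freshness_lifetime` are made up for illustration, and the parsing is deliberately simplified (it does not handle quoted strings containing commas, or heuristic freshness).

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into a {directive: argument} map."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (e.g. no-store) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(cache_control: str, shared: bool) -> Optional[int]:
    """Return the explicit freshness lifetime in seconds, or None.

    A shared cache (proxy/CDN) prefers s-maxage when present;
    a private cache (browser) ignores s-maxage and uses max-age.
    """
    d = parse_cache_control(cache_control)
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return None


header = "max-age=60, s-maxage=86400"
browser_ttl = freshness_lifetime(header, shared=False)
proxy_ttl = freshness_lifetime(header, shared=True)
print(browser_ttl, proxy_ttl)
```

Note that the directive name matters: a misspelling such as `s-max-age` is an unrecognized directive, so in this sketch (as in real caches, which ignore directives they don't understand) the shared cache would fall back to max-age.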

t=0
browser: need resource
proxy cache: fetch resource from origin
-> cache returns fresh resource with Age: 0 / Cache-Control: "max-age=60, s-maxage=86400"

t=30
browser: locally cached resource is still fresh: Age (0+30) < max-age (60)

t=70
browser: local cache is stale: Age (0+70) > max-age (60) -> revalidate
proxy cache: cached resource is still fresh: Age (70) < s-maxage (86400)
-> cache returns resource with Age: 70, Cache-Control: "max-age=60, s-maxage=86400"

The cache returned a resource with an Age (70) that is larger than max-age (60). From now on every time the browser wants the resource it will be stale locally and needs revalidation.

t=75
browser: local cache is stale: Age (70+5) > max-age (60) -> revalidate
proxy cache: cached resource is still fresh: Age (75) < s-maxage (86400)
-> cache returns resource with Age: 75 / Cache-Control: "max-age=60, s-maxage=86400"

This means that if a resource is in proxy cache for longer than max-age the browser will always revalidate. The max-age value is only useful for max-age seconds after getting a fresh resource from origin.
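The timeline above can be reproduced with a small simulation. This is a simplified sketch: it treats the Age header as the only input to the browser's age calculation and ignores the Date header and network delay, which the real algorithm also takes into account. The helper names are hypothetical.

```python
MAX_AGE = 60        # browser freshness lifetime (max-age), in seconds
S_MAXAGE = 86400    # proxy freshness lifetime (s-maxage), in seconds


def proxy_response_age(now: int, fetched_from_origin_at: int) -> int:
    """Age the proxy reports: seconds since it fetched from origin."""
    return now - fetched_from_origin_at


def browser_fresh_seconds(age_header: int, max_age: int = MAX_AGE) -> int:
    """Seconds the browser may reuse a response before revalidating."""
    return max(0, max_age - age_header)


# The proxy fetched the resource from origin at t=0 and keeps serving it.
for t in (0, 30, 70, 75, 3600):
    age = proxy_response_age(now=t, fetched_from_origin_at=0)
    proxy_fresh = age < S_MAXAGE
    print(
        f"t={t}: Age={age}, proxy fresh={proxy_fresh}, "
        f"browser may reuse for {browser_fresh_seconds(age)}s"
    )
```

Once the stored response's Age exceeds max-age, the remaining browser lifetime is 0 for every later request, until the proxy refetches from origin and Age starts again from 0.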

  1. Is this expected behavior?
  2. How can I adjust this such that the browser will always cache a resource for 60s after requesting it, even if that received resource has been in proxy cache for a long time? (Or is this bad practice?)

Solution

  • Setting s-maxage > max-age doesn't seem to make sense.

    Your analysis looks correct, and I think I agree. Specifically, given a particular value of s-maxage it’s hard to see any reason why you’d want to use a smaller max-age, since that will result in pointless conditional validation requests.

    Note that there are reasonable use cases for setting max-age > s-maxage, so it still makes sense for the protocol to define these as two separate directives.

    How can I force the browser to cache a resource for a specific amount of time after receiving it, regardless of its freshness (as indicated by the Age header)?

    HTTP caching is based on the age of the resource, not when a response happened to be received. So there’s no way to force the user’s browser or the proxy cache to do this.

    But my cache plan is perfectly reasonable: I want to set max-age to a small value that accurately represents the degree of staleness I can tolerate, but set s-maxage to a long value and simply invalidate the proxy cache when the resource changes. Why won’t the HTTP specification support that?

    The fundamental issue here is that your cache scheme, perfectly reasonable as it is, relies on having control over the proxy cache (to force invalidation), whereas the internet architecture defined by HTTP is based on independent actors that the origin server doesn't have control over. What you’re describing as a proxy cache is better thought of as a managed cache. MDN has a useful discussion of this:

    Shared caches can be further sub-classified into proxy caches and managed caches.... Managed caches are explicitly deployed by service developers to offload the origin server and to deliver content efficiently....

    In most cases, you can control the [managed] cache's behavior through the Cache-Control header and your own configuration files or dashboards. For example, the HTTP Caching specification essentially does not define a way to explicitly delete a cache—but with a managed cache, the stored response can be deleted at any time through dashboard operations, API calls, restarts, and so on. That allows for a more proactive caching strategy.

    So if you’re using a managed cache (CloudFront, etc.) you don’t need to use s-maxage at all to manage the cache. You can directly control the settings outside of the HTTP protocol.

    Even if I use settings to control the managed cache retention period, I still face the issue of the Age header forcing the browser to revalidate.

    In the same way that the managed cache exposes TTL settings, it could also expose the ability to send a reset Age header (for example, Age: 0) on each response, so that the browser's max-age window starts when it receives the response. Whether any particular managed cache solution does that, I don’t know.
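If your managed cache lets you run custom code on outgoing responses, the idea could look roughly like the sketch below. Everything here is an assumption: `reset_age_for_downstream` is a hypothetical hook, not the API of any particular CDN. It also restamps the Date header, because clients may derive a response's age from Date as well as from Age, so resetting Age alone might not be enough.

```python
from email.utils import formatdate
from typing import Dict, Optional


def reset_age_for_downstream(
    headers: Dict[str, str], now: Optional[float] = None
) -> Dict[str, str]:
    """Return a copy of the headers that makes the response look fresh.

    Hypothetical edge hook: resets Age to 0 and restamps Date,
    leaving Cache-Control (and everything else) untouched.
    """
    # Drop any existing Age/Date regardless of header-name casing.
    out = {k: v for k, v in headers.items() if k.lower() not in ("age", "date")}
    out["Age"] = "0"
    # formatdate(usegmt=True) yields an HTTP-style date ending in "GMT";
    # now=None means "use the current time".
    out["Date"] = formatdate(timeval=now, usegmt=True)
    return out


cached = {
    "Age": "70",
    "Date": "Mon, 01 Jan 2024 00:00:00 GMT",
    "Cache-Control": "max-age=60, s-maxage=86400",
}
print(reset_age_for_downstream(cached))
```

The trade-off is that the browser may then keep a copy for up to max-age seconds after you purge the managed cache, which is the staleness window the question says it can tolerate.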