How should "semantically transparent" in RFC 2616 be understood in the "Cache-control Mechanisms" section?


1. Definition of "semantically transparent" from section "1.3 Terminology" of RFC 2616

semantically transparent

  A cache behaves in a "semantically transparent" manner, with
  respect to a particular response, when its use affects neither the
  requesting client nor the origin server, except to improve
  performance. When a cache is semantically transparent, the client
  receives exactly the same response (except for hop-by-hop headers)
  that it would have received had its request been handled directly
  by the origin server.

2. I cannot understand this passage from RFC 2616, section "13.1.3 Cache-control Mechanisms"

The Cache-Control header allows a client or server to transmit a variety of directives in either requests or responses. These directives typically override the default caching algorithms. As a general rule, if there is any apparent conflict between header values, the most restrictive interpretation is applied (that is, the one that is most likely to preserve semantic transparency).

I am confused about how conflicting values in the "Cache-Control" header are resolved.

3. I tested some examples with the Apache web server

3.1 Web Topology

Telnet (client) <-> HTTP proxy (Apache in proxy mode, S1) <-> Web server (Apache, S2)

3.1.1 S1 configuration (works as a caching proxy):

<Location />
    ProxyPass http://10.8.1.24:80/
</Location>

<IfModule mod_cache.c>

        <IfModule mod_mem_cache.c>
            CacheEnable mem /
            MCacheSize 4096
            MCacheMaxObjectCount 100
            MCacheMinObjectSize 1
            MCacheMaxObjectSize 2048
        </IfModule>

        CacheDefaultExpire 86400
</IfModule>

3.1.2 S2 configuration (works as the origin web server):

<filesMatch "\.(html|png)">
    Header set Cache-Control "max-age=5, max-age=15"
</filesMatch>

3.2 Test Cases

3.2.1 Two "max-age" values

GET /index.html HTTP/1.1
Host: haha
User-Agent: telnet


HTTP/1.1 200 OK
Date: Wed, 13 Mar 2013 03:40:25 GMT
Server: Apache/2.2.23 (Win32)
Last-Modified: Sat, 20 Nov 2004 20:16:24 GMT
ETag: "63e62-2c-3e9564c23b600"
Accept-Ranges: bytes
Content-Length: 44
Cache-Control: max-age=5, max-age=35, must-revalidate
Age: 3
Content-Type: text/html

<html><body><h1>It works!</h1></body></html>

Value "max-age=5" is applied. I expected "max-age=35" to be applied, since the longer value keeps the content in the cache longer and lets the cache serve more subsequent requests, which improves performance as I understand the concept of "semantic transparency".

3.2.2 max-age=35 and must-revalidate

GET /index.html HTTP/1.1
Host: haha
User-Agent: telnet

HTTP/1.1 200 OK
Date: Wed, 13 Mar 2013 03:41:24 GMT
Server: Apache/2.2.23 (Win32)
Last-Modified: Sat, 20 Nov 2004 20:16:24 GMT
ETag: "63e62-2c-3e9564c23b600"
Accept-Ranges: bytes
Content-Length: 44
Cache-Control: max-age=35, must-revalidate
Age: 10
Content-Type: text/html

<html><body><h1>It works!</h1></body></html>

Value "max-age=35" is applied. I expected "must-revalidate" to be applied instead.
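
For reference, the observation in 3.2.2 can be checked against the freshness rule. This is a minimal sketch, assuming the RFC 2616 rule that a response is fresh while its age is below max-age and that must-revalidate only takes effect once the entry is stale (section 14.9.4). The function name cache_decision and the client_allows_stale flag are made up for illustration; this is not how mod_cache is implemented.

```python
def cache_decision(age, max_age, must_revalidate, client_allows_stale=False):
    """Decide what a cache may do with a stored response (illustrative only)."""
    # Fresh entry: max-age alone decides, must-revalidate is not consulted yet.
    if age < max_age:
        return "serve from cache"
    # Stale entry: must-revalidate forbids serving it without revalidation,
    # even if the client would otherwise accept a stale response.
    if must_revalidate or not client_allows_stale:
        return "revalidate with origin"
    return "may serve stale"


# Test case 3.2.2: Age: 10, max-age=35, must-revalidate
print(cache_decision(10, 35, True))
```

With these assumptions, the Age: 10 response is still fresh, so serving it from the cache does not contradict must-revalidate.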

3.2.3 max-age=35 and no-store

GET /index.html HTTP/1.1
Host: haha
User-Agent: telnet

HTTP/1.1 200 OK
Date: Wed, 13 Mar 2013 03:45:04 GMT
Server: Apache/2.2.24 (Unix)
Last-Modified: Sat, 20 Nov 2004 20:16:24 GMT
ETag: "63e62-2c-3e9564c23b600"
Accept-Ranges: bytes
Content-Length: 44
Cache-Control: max-age=35, no-store
Content-Type: text/html

<html><body><h1>It works!</h1></body></html>

Value "no-store" is applied.

3.2.4 max-age=35 and no-cache

GET /index.html HTTP/1.1
Host: haha
User-Agent: telnet

HTTP/1.1 200 OK
Date: Wed, 13 Mar 2013 06:22:14 GMT
Server: Apache/2.2.24 (Unix)
Last-Modified: Sat, 20 Nov 2004 20:16:24 GMT
ETag: "63e62-2c-3e9564c23b600"
Accept-Ranges: bytes
Content-Length: 44
Cache-Control: max-age=35, no-cache
Content-Type: text/html

<html><body><h1>It works!</h1></body></html>

Value "no-cache" is applied.

References: RFC2616 https://www.rfc-editor.org/rfc/rfc2616


Solution

  • I would interpret the examples you gave as follows:

    • max-age=5, max-age=15: max-age=5 wins, since it is a shorter cache time which is more restrictive

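    • max-age=5, max-age=35, must-revalidate: between the two max-age values, max-age=5 wins as above. must-revalidate applies as well, since it requires a cache to revalidate the entry with the origin server once it has become stale. And Section 14.9.4 says: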

        The must-revalidate directive is necessary to support reliable
        operation for certain protocol features. In all circumstances an
        HTTP/1.1 cache MUST obey the must-revalidate directive;
      
    • max-age=35, no-store: no-store wins, since it basically means no caching should be performed, which is certainly the most restrictive.

    • max-age=35, no-cache: no-cache wins. Since it doesn't specify any field names, a cache must not reuse the response for a subsequent request without successfully revalidating it with the origin server (Section 14.9.1), which is the more restrictive of the two.
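
The interpretation above can be sketched in a few lines of Python. This is only one possible reading of "the most restrictive interpretation", not code taken from Apache or from the RFC. The function name effective_policy and its return strings are made up for illustration, and the naive comma split does not handle quoted field lists such as no-cache="Set-Cookie, X-Foo".

```python
def effective_policy(cache_control):
    """Pick the most restrictive reading of a Cache-Control value (illustrative only)."""
    pairs = []
    for part in cache_control.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        pairs.append((name.strip().lower(), value.strip()))

    # Directives given without a value, e.g. "no-store" or "must-revalidate".
    bare = [name for name, value in pairs if not value]

    if "no-store" in bare:
        return "no-store"   # nothing may be stored at all
    if "no-cache" in bare:
        return "no-cache"   # stored copy needs revalidation before every reuse

    # Several max-age values: the shortest lifetime is the most restrictive.
    ages = [int(v) for n, v in pairs if n == "max-age" and v.isdigit()]
    policy = "max-age=%d" % min(ages) if ages else "default"
    if "must-revalidate" in bare:
        policy += ", must-revalidate"
    return policy


for header in ("max-age=5, max-age=15",
               "max-age=5, max-age=35, must-revalidate",
               "max-age=35, no-store",
               "max-age=35, no-cache"):
    print(header, "->", effective_policy(header))
```

The ordering (no-store, then no-cache, then the smallest max-age) mirrors the four bullets above. A real cache also has to consider request directives, Expires, and heuristics that this sketch leaves out.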