Let's say I have to deliver dynamic content for my website that can be updated at any time. I want to cache the data to get the best possible performance.
I think the most accurate approach is to keep the data in the cache for as long as it hasn't changed, which means using the no-cache option.
The problem is that I don't know whether this is really a good choice. Each time someone requests this data, the cache server will contact the origin server and ask whether the data has changed since then. So I don't see how this differs, in terms of performance, from the no-store option, where data is not cached at all. What I mean is that I think the bandwidth usage will be pretty much the same (am I right?).
So, if my cache server must check with the origin server on every single request, what is the difference from requesting the origin server directly?
Is there any way to configure the cache so that it fetches new data if and only if an update has really occurred?
ETags are one way to go. If you can associate a reasonably short, unique string with your dynamic content, then you can use that identifier in the HTTP response headers:
Cache-Control: public, must-revalidate, max-age=60
ETag: <content identifier>
It's important that the content identifier changes if the content changes, so, for example, the hex-encoded hash of the content would work. The client will then use the ETag by sending an If-None-Match header in a subsequent GET request:
If-None-Match: <previous etag>
The server should send a 304 Not Modified response if the content's ETag has not changed, or send a full 200 OK response with the updated content and a new ETag otherwise.
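As a rough illustration of that server-side decision, here is a minimal Python sketch. It is framework-independent, and the names make_etag and conditional_get are made up for this example. It assumes the ETag is the quoted hex SHA-256 hash of the body, and it splits If-None-Match on commas, which is a simplification of the real header grammar.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    # Assumption: the ETag is the hex-encoded SHA-256 hash of the body,
    # wrapped in double quotes as HTTP entity tags are.
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def _opaque(tag: str) -> str:
    # If-None-Match uses weak comparison, so drop any W/ prefix.
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def conditional_get(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET on the current body."""
    etag = make_etag(body)
    headers = {
        "Cache-Control": "public, must-revalidate, max-age=60",
        "ETag": etag,
    }
    if if_none_match is not None:
        # Simplified parsing: a comma-separated list of tags, or "*".
        candidates = [_opaque(t) for t in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            # Content unchanged: 304 with headers only, no body.
            return 304, headers, b""
    # First request, or content changed: full response with the new ETag.
    return 200, headers, body
```

The 304 branch is where the bandwidth saving comes from: the origin server is still contacted, but only headers travel back, not the content itself.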
The other option is to support the If-Modified-Since header on the server. This header allows the client to request content only if it has been modified since the specified date; otherwise the server should return a 304 Not Modified response with no content.
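The date-based check can be sketched the same way. This is only an illustration: conditional_get_by_date is a hypothetical helper, and it assumes you can obtain a timezone-aware UTC modification time for the content (for example from a database column). Date parsing and formatting use the standard library's email.utils.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def conditional_get_by_date(
    body: bytes, last_modified: datetime, if_modified_since: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload); last_modified must be aware UTC."""
    # HTTP dates only have one-second resolution.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    since = None
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # An unparseable date is ignored.
    if since is not None and since.tzinfo is None:
        # Assumption: treat a date without a zone as UTC.
        since = since.replace(tzinfo=timezone.utc)

    if since is not None and last_modified <= since:
        # Not modified since the client's copy: no body is sent.
        return 304, headers, b""
    return 200, headers, body
```

Because of the one-second resolution, two updates within the same second are indistinguishable with this approach, which is one reason a content-hash ETag can be the safer choice for frequently changing data.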