Tags: django, google-chrome, django-staticfiles, http-caching

Many HTTP 304 responses result in fewer GET requests


I have a Django development server hosting a web page that displays, in near real time, information gathered from numerous servers I watch over. The page is still in development, so I am currently using Django's built-in development server, started on an Ubuntu host with:

python3 manage.py runserver IP:Port

On the same Ubuntu host, a Python script continuously polls the monitored servers and formats the responses into an .html file, which the client reloads into a <div> every minute. The general functionality of the page the client accesses is as follows:

<div id="status" style="width:100%; height: 1000px"></div>
<script>
    $('#status').load("{% static 'alerts/status.html' %}");
    setInterval(function() {
        $('#status').load("{% static 'alerts/status.html' %}");
    }, 60000);
</script>

...so the page loads status.html into the div on page load and then reloads it every minute. This has been working great. However, looking at the Django log, I have noticed that if status.html has not changed after ten 304 (Not Modified) responses, the time between requests begins to roll off. That is, instead of waiting 1 minute, the browser waits 2 minutes, then 5 minutes, and so on (roughly; I forget the actual rate of roll-off).

Now the issue I'm facing is that my server went down over the weekend (unrelated), but the display screen showing the web page stayed active. The interval rolled off so far that the page now seems completely broken: it refuses to download the latest status.html, even when I force Chrome to reload everything and not use the cache (ctrl + R or shift + F5).

I tried researching this roll-off but couldn't find any information on it. I assume it is something built into Google Chrome (the browser I'm using) to save bandwidth when the page is not changing. However, my status page is a couple of kilobytes at most, and the 304 responses already save what little bandwidth there is, so a way to completely disable this roll-off for production would be ideal.

In any case, any information on why I'm seeing this behavior and where it's coming from would be much appreciated, as I can't seem to find any documentation on it. The closest thing I found was Google's developer documentation on caching. It mentions the ability to define max-age and no-cache behavior, so I could force the client to redownload status.html every minute, but this seems messy. That would work in my specific scenario, given that status.html is a couple of kilobytes at most, but just disabling the roll-off behavior would do the trick and would keep unnecessary bandwidth down.


Solution

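  • The problem here is that the response for status.html doesn't have an explicit cache expiration header. In the absence of such a header, the browser is free to use its own algorithm (such as the roll-off you're seeing) to choose an expiration time. Such heuristics are commonly based on how long ago the file was last modified, so the longer status.html stays unchanged, the longer the browser treats its cached copy as fresh without contacting the server. From RFC 7234: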

    Since origin servers do not always provide explicit expiration times, a cache MAY assign a heuristic expiration time when an explicit time is not specified.... This specification does not provide specific algorithms.

    So the solution is straightforward: assign an explicit cache expiration time.
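To make the heuristic concrete, here is a minimal sketch of the kind of calculation a cache might perform. It assumes the 10% fraction that RFC 7234 (section 4.2.2) mentions as a typical setting; Chrome's actual algorithm may differ, and the function name and timestamps are purely illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 10% is the "typical" fraction mentioned in RFC 7234 section 4.2.2.
# A real browser is free to use a different value or a different algorithm.
HEURISTIC_FRACTION = 0.10


def heuristic_freshness(response_date, last_modified, fraction=HEURISTIC_FRACTION):
    """Estimate how long a cache may reuse a response that has no explicit expiry.

    The lifetime is a fraction of the time elapsed between Last-Modified and
    the Date of the response, clamped so it is never negative.
    """
    elapsed = max(response_date - last_modified, timedelta(0))
    return elapsed * fraction


if __name__ == "__main__":
    # Hypothetical moment at which status.html was last rewritten.
    last_modified = datetime(2024, 1, 5, 18, 0, tzinfo=timezone.utc)
    for idle in (timedelta(minutes=10), timedelta(hours=1), timedelta(days=2)):
        lifetime = heuristic_freshness(last_modified + idle, last_modified)
        print(f"unchanged for {idle}: cached copy treated as fresh for {lifetime}")
```

Under this assumed fraction, a file that has been unchanged for ten minutes gets a one-minute lifetime, which would be consistent with one-minute polls starting to be skipped after roughly ten 304 responses. A file left unchanged over a weekend would be considered fresh for hours.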

    Implementing this solution is unfortunately not trivial using Django's staticfiles app. A better default for this app would be to not cache the results at all, but that solution was deferred pending a merger with whitenoise.

    Solutions include using a different server (like nginx); using a different app (like whitenoise); or using static views directly rather than the staticfiles app (see this question for a few approaches).