How to prevent Django from writing to django_session table for certain URLs


Apologies if my question is very similar to this one. My approach to the issue is based entirely on the answers to that question, but I think my case is slightly more involved and may touch a part of Django that I do not fully understand.


I have a CMS written in Django 1.5 with a few APIs used by two desktop applications, which cannot make use of cookies the way a browser does.

I noticed that every time one of the applications makes an API call (once every 3 seconds), a new entry is added to the django_session table. Looking closely at this table and the code, I can see that all entries for a specific URL are given the same session_data value but a different session_key. This is probably because a call from a cookie-less application carries no session cookie, so request.session._session_key is None and Django treats every call as a new session.

The result is that thousands of entries are created in the django_session table every day, and simply running ./manage.py clearsessions from a daily cron job does not remove them, making the whole database quite large for no obvious benefit. Note that I even tried set_expiry(1) for these requests, but ./manage.py clearsessions still doesn't get rid of them.

To overcome this problem through Django, I've had to override 3 Django middlewares as I'm using SessionMiddleware, AuthenticationMiddleware and MessageMiddleware:

from django.contrib.sessions.middleware import SessionMiddleware
from django.contrib.auth.middleware import AuthenticationMiddleware
from django.contrib.messages.middleware import MessageMiddleware

class MySessionMiddleware(SessionMiddleware):
    def process_request(self, request):
        if ignore_these_requests(request):
            return
        super(MySessionMiddleware, self).process_request(request)

    def process_response(self, request, response):
        if ignore_these_requests(request):
            return response
        return super(MySessionMiddleware, self).process_response(request, response)

class MyAuthenticationMiddleware(AuthenticationMiddleware):
    def process_request(self, request):
        if ignore_these_requests(request):
            return
        super(MyAuthenticationMiddleware, self).process_request(request)

class MyMessageMiddleware(MessageMiddleware):
    def process_request(self, request):
        if ignore_these_requests(request):
            return
        super(MyMessageMiddleware, self).process_request(request)

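def ignore_these_requests(request):
    # True for the cookie-less API endpoints that should not get a session.
    # Note: request.POST is falsy for a POST with an empty form body;
    # request.method == 'POST' would test the HTTP verb itself.
    if request.POST and request.path.startswith('/api/url1/'):
        return True
    elif request.path.startswith('/api/url2/'):
        return True
    return False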

Although the above works, I can't stop thinking that I may have made this more complex than it really is. It also doesn't seem like the most efficient approach, as 4 extra checks are made for every single request.

Are there any better ways to do the above in Django? Any suggestions would be greatly appreciated.


Solution

  • Dirty hack: removing session object conditionally.

    One approach is to add a single middleware that discards the session object depending on the request. It's a bit of a dirty hack for two reasons:

    • The Session object is created at first and removed later. (inefficient)
    • You're relying on the fact that the Session object isn't written to the database yet at that point. This may change in future Django versions (though not very likely).

    Create a custom middleware:

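    class DiscardSessionForAPIMiddleware(object):
        """Drop the session object for API requests before any view uses it."""

        def process_request(self, request):
            if request.path.startswith("/api/"):  # Or any other condition
                # With no request.session left, SessionMiddleware has
                # nothing to save when it processes the response.
                del request.session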

    Make sure you install this after the django.contrib.sessions.middleware.SessionMiddleware in the MIDDLEWARE_CLASSES tuple in your settings.py.

    Also check that settings.SESSION_SAVE_EVERY_REQUEST is set to False (the default). With that setting, Django delays the write to the database until the session data is modified.
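
As a rough settings sketch of the two points above: the module path `myproject.middleware` is hypothetical, and listing the custom middleware last is an assumption rather than something the answer tested. As far as I can tell, Django 1.5's AuthenticationMiddleware and the session-backed message storage both assert that request.session exists while processing the request, so discarding the session before they run would likely raise an AssertionError. Verify this against your Django version, and note that the API views could then no longer rely on request.user or messages.

```python
# settings.py (fragment). 'myproject.middleware' is a hypothetical module path.
MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    # Assumption: placed last, so it runs after SessionMiddleware and after
    # the middleware that expect request.session during process_request.
    'myproject.middleware.DiscardSessionForAPIMiddleware',
)

# Already the default: sessions are saved only when their data is modified.
SESSION_SAVE_EVERY_REQUEST = False
```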


    Alternatives (untested)

    • Use process_view instead of process_request in the custom middleware so you can check the resolved view rather than the request path. Advantage: the condition is more precise. Disadvantage: other middleware may already have used the session object, in which case this approach fails.
    • Create a custom decorator (or a shared base class) for your API views that deletes the session object. Advantage: the responsibility sits with the views that provide the API, which is probably where you want it. Disadvantage: same as above, except the session object is deleted at an even later stage.
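
The two alternatives might look roughly like the sketch below. It is just as untested against a real Django project as the list above, and every name in it is hypothetical: the `sessionless` marker attribute, `DiscardSessionByViewMiddleware`, `discard_session`, and the `FakeRequest` stand-in, which only imitates the single attribute the sketch touches. The process_view signature is the one Django passes to middleware.

```python
from functools import wraps


class DiscardSessionByViewMiddleware(object):
    """Alternative 1: decide per resolved view instead of per URL path."""

    def process_view(self, request, view_func, view_args, view_kwargs):
        # 'sessionless' is a hypothetical marker set on API view functions.
        if getattr(view_func, "sessionless", False) and hasattr(request, "session"):
            del request.session
        return None  # Let Django continue and call the view as usual.


def discard_session(view_func):
    """Alternative 2: decorator that drops request.session before the view runs."""
    @wraps(view_func)
    def wrapper(request, *args, **kwargs):
        # Guard the delete: 'del' on a missing attribute raises AttributeError.
        if hasattr(request, "session"):
            del request.session
        return view_func(request, *args, **kwargs)
    return wrapper


class FakeRequest(object):
    """Stand-in for HttpRequest, used only to exercise the sketch."""
    def __init__(self):
        self.session = {}


@discard_session
def api_status(request):
    # Reports whether a session object is still attached to the request.
    return hasattr(request, "session")


def api_ping(request):
    return "pong"
api_ping.sessionless = True  # Mark the view for the middleware variant.
```

With either variant, any code that runs afterwards and touches request.session will fail, so this only suits views that never use the session.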