django sqlite django-sessions fly

Django sessions failing intermittently


I wrote an app last year for a community event I run, called Nashville Tabletop Day. The app lets people scan QR codes to interact with games they can play during the day. Locally it runs great. However, when I push it to Fly.io it runs but reliably drops sessions. By that I mean I can log in, browse to a page or two, and then the session just vanishes. It even happens if I stay on the same page and just reload a few times.

It doesn’t happen immediately, or I’d think perhaps the database was the issue. But after 15-30 seconds, or a few page loads, poof. The event is this coming Saturday and this app has to be ready to go.

I’m tempted to think it could be a config issue, but this app ran almost flawlessly last year, and I’ve only made a few small updates this year.

Django 4.1.7
Python 3.8.18
Database is SQLite

INSTALLED_APPS = [
    'ntd.apps.NtdConfig',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'whitenoise.runserver_nostatic',
    'django.contrib.staticfiles',
]

MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'whitenoise.middleware.WhiteNoiseMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
    'ntd.middleware.request_logger.RequestLoggerMiddleware',
]

Some debugging I’ve done:

I added a simple logging middleware that prints request.session.session_key and request.path. Locally, each reload shows the attendee_uuid (the value I care about) along with those two values. When I push that code to Fly, all three values print a few times, then start printing None.
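For reference, a minimal sketch of what a logging middleware like the one described above might look like. The class name matches the last MIDDLEWARE entry, but the body and the describe_request helper are assumptions made for illustration, not the app's actual code. It must run after SessionMiddleware so that request.session exists.

```python
def describe_request(request):
    """Build one log line from the request path and its session data."""
    session = request.session
    return (
        f"path={request.path} "
        f"session_key={session.session_key} "
        f"attendee_uuid={session.get('attendee_uuid')}"
    )


class RequestLoggerMiddleware:
    """New-style Django middleware: a callable that wraps get_response."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # flush=True so the line is written out immediately instead of
        # sitting in a buffer.
        print(describe_request(request), flush=True)
        return self.get_response(request)
```

When the session is lost, session_key and attendee_uuid both come out as None, which matches the behavior seen on Fly.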

I also tried reverting to the branch I used at last year’s event, since I know that code worked. I just pushed it up to Fly to test, and I’m getting the same behavior. Data loads from the database, which allows me to browse some of the public pages, but it won’t maintain a session, which means it’s no good for user-specific functionality.

A bit of extra information: I just noticed that Django admin is logging me out as well.

So the session is definitely the issue. Please help!


Solution

  • The issue is that Fly.io automatically scales the app horizontally and places it behind their load balancer.

    If the session store you are using is exclusive to a single instance of the app (for example, a file store on a non-shared file system, or a database store where each app instance uses a separate database), the session will seem to disappear at random whenever the load balancer routes a request to a Machine other than the one where the session was created. With SQLite, each Machine typically has its own copy of the database file, so this likely applies here.

    The long-term solution is to use a single database for all instances of your app; I would suggest Fly Postgres, but Fly.io gives a rather comprehensive overview of options here.

    The short-term solution is to prevent app scaling; if there is never more than one instance of the app, the problem cannot occur. For many users this is also acceptable in the long run, if their apps do not need scaling and they are not worried about the lack of redundancy.
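To illustrate the long-term option, here is a minimal sketch of a settings.py fragment that points every instance at one shared Postgres database. It assumes the connection string is available in a DATABASE_URL environment variable and that a Postgres driver such as psycopg2 is installed. The database_from_url helper and the fallback URL are hypothetical, introduced only for this example. With Django's default database-backed session engine, sessions are stored in the django_session table, so every instance would then read and write the same session rows.

```python
import os
from urllib.parse import unquote, urlparse


def database_from_url(url):
    """Turn a postgres:// URL into a Django DATABASES entry (hypothetical helper)."""
    parts = urlparse(url)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),
        # Credentials may be percent-encoded inside a URL, so decode them.
        "USER": unquote(parts.username or ""),
        "PASSWORD": unquote(parts.password or ""),
        "HOST": parts.hostname or "",
        "PORT": str(parts.port or ""),
    }


# Assumption: DATABASE_URL is set in the deployed environment. The fallback
# value is a made-up local placeholder, not a real credential.
DATABASE_URL = os.environ.get(
    "DATABASE_URL", "postgres://ntd:secret@localhost:5432/ntd"
)

DATABASES = {"default": database_from_url(DATABASE_URL)}
```

Existing data in the SQLite file would still need to be migrated separately; this fragment only changes where Django connects.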