Tags: django, postgresql, transaction-isolation

How to safely and atomically decrement a counter with Django and PostgreSQL?


I've been reading up on PostgreSQL transaction isolation and how that relates to Django's transaction.atomic() (e.g. this article, PostgreSQL docs), but I'm far from fluent in this topic and I'm not sure I understand what I've read.

We've got a PostgreSQL-backed Django app that involves quota objects. Simplified, it's just this:

class Quota(models.Model):
    obj = models.OneToOneField(AnotherModel, on_delete=models.CASCADE)
    count = models.PositiveIntegerField()

An instance of this controls how many times a certain operation can be performed against the obj instance. count is initialized to a certain number, and will only ever decrement until it hits zero.

Any number of processes/threads can concurrently perform these operations. Basically, we need to atomically decrement (with UPDATE) the count of a single database row without deadlocking and without two processes/threads ever e.g. starting with a count of 100 and both trying to decrement it to 99.

My naive approach would be this:

from django.db import connection, transaction
from django.db.models import F

with transaction.atomic():
    cursor = connection.cursor()
    cursor.execute('SET TRANSACTION ISOLATION LEVEL SERIALIZABLE')
    Quota.objects.filter(obj=instance).update(count=F('count') - 1)

However, I'm not sure if this is subject to this issue, from the linked article:

if at COMMIT the database cannot determine that the transaction could have been performed serially with respect to the read/writes of other transactions, then it will fail with a django.db.DatabaseError. This can happen even if they updated different rows.

All the processes/threads performing operations against the same obj would be decrementing the same column of the same row, so... maybe? I don't actually know what's involved in PostgreSQL "determin[ing] that the transaction could have been performed serially".
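From what I've read, a serialization failure just means the whole transaction should be rolled back and retried, so I assume I'd need to wrap the atomic block in something like this. A minimal, database-agnostic sketch of what I have in mind (the exception class, function names, and retry policy are placeholders of my own, not anything Django provides):

```python
# Sketch of a retry harness for SERIALIZABLE transactions.
# SerializationFailure stands in for the DatabaseError Django raises
# when PostgreSQL reports that a transaction could not be serialized.
class SerializationFailure(Exception):
    pass

def run_with_retries(txn, max_retries=3):
    """Call txn() up to max_retries times; each call should open and
    commit its own transaction (e.g. inside transaction.atomic())."""
    last_error = None
    for _ in range(max_retries):
        try:
            return txn()
        except SerializationFailure as exc:
            last_error = exc  # transaction rolled back; try again
    raise last_error
```

Inside `txn` I would then open `transaction.atomic()`, set the isolation level, and run the `UPDATE` as above.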

An alternate approach could be:

with transaction.atomic():
    Quota.objects.select_for_update().filter(obj=instance).update(count=F('count')-1)

This seems to do row-level locking, and my understanding is that the isolation level change isn't needed, but I don't know if this is sufficient for correct handling of concurrent operations.

Is one of these approaches preferable here, and are further modifications still necessary to guarantee atomicity and deadlock avoidance? We could use something like python-redis-lock to also prevent concurrent DB operations at the Django view level, but doing this at the DB level feels like a more natural fit.


Solution

  • I cannot tell you what to do in Django, but I can explain it at the SQL level.

    You can never deadlock by modifying only a single row. All you can ever get is a live lock, where one updating transaction has to wait until the previous one commits. This live lock cannot be avoided; it is how the database serializes modifications of the same row.
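    Translated back into the question's ORM code, that single-row UPDATE could look like the following sketch (not part of the SQL-level advice here; the count > 0 filter and row-count check are illustrative additions):

    ```python
    # Sketch, assuming the Quota model from the question is importable.
    # A single UPDATE is serialized by PostgreSQL's row lock, so no
    # explicit SELECT ... FOR UPDATE or isolation-level change is
    # needed for correctness.
    from django.db import transaction
    from django.db.models import F

    def consume_quota(instance):
        with transaction.atomic():
            updated = (
                Quota.objects
                .filter(obj=instance, count__gt=0)  # enforce the floor
                .update(count=F('count') - 1)
            )
        return updated == 1  # False if already at zero (or no row)
    ```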

    The only way you can get a deadlock is if several database transactions try to lock the same objects (plural!) in a different order.

    Here are a few tips that you can use to avoid problems:

    • Keep your database transactions as short as possible, so that nobody has to wait long for a lock. That also reduces the risk of deadlocks.

    • Don't modify more data in a single transaction than is absolutely necessary for consistency. The more rows you modify (lock), the greater the risk of a deadlock.

    • Update the counter as the last activity before you commit (or as late as possible), so that the row is locked for as short a time as possible. If you always update the counter as the last activity, you can never run into a deadlock on account of that update!

    • Use a check constraint on the column if you want to make sure that it never drops below zero (or exceeds a certain value).