Search code examples
djangodjango-modelsdeadlockdatabase-deadlocks

Drawbacks of custom save method to avoid user getting a userprofile_id with a specific value. (Django/Heroku)


In my Django project, I need to avoid users creating a userprofile_id with a value of 3 (3, 13, 203...).

This is an interim workaround another issue which will probably take longer to resolve.

I posted the code below.

But my question relates more to the drawbacks of this interim fix.

What sort of problem could this generate. This is the list I came up with and my answers to each of them:

  1. User_id will be different than userprofile_id - cant see any problems here.
  2. Affect to database integrity when custom method is removed - I cannot see how the remove of the method would affect the database. Hopefully I am right.;
  3. Some performance issue - The save is at the database level which might cause some delays I suppose. This is for a pilot+. I dont expect 1000s of new users a day. Could two users who signup at the same time end up with the same userprofile_id? Would there be a way to limit potential of deadlocks?

the code for those who are curious:

class UserProfile(models.Model):
    user = models.OneToOneField(User, null=True, on_delete=models.CASCADE)

    def __str__(self):
        return str(self.user)
    
    def save(self, *args, **kwargs):
        if not self.pk:
            # Logic for generating userprofile_id without '3'
            last_userprofile = UserProfile.objects.all().order_by('-id').first()
            print(f'last_userprofile: {last_userprofile.id}')
            if last_userprofile:
                last_id = last_userprofile.id
                print(f'last_id: {last_id}')
                next_id = last_id + 1
                print(f'next_id: {next_id}')
                while '3' in str(next_id):
                    next_id += 1
                self.id = next_id
            else:
                self.id = 1 

Solution

  • With your solution it is possible for multiple users to end up having the same profile when you create users concurrently.

    This happens because of how the save method decides whether to make an update or an insert query. From the documentation the following is what Django does:

    • If the object’s primary key attribute is set to a value that evaluates to True (i.e., a value other than None or the empty string), Django executes an UPDATE.
    • If the object’s primary key attribute is not set or if the UPDATE didn’t update anything (e.g. if primary key is set to a value that doesn’t exist in the database), Django executes an INSERT.

    Given you've set the primary key yourself it is possible that:

    1. Request 1 to create user profile for a user starts.
    2. Request 2 to create user profile for another user starts.
    3. Both get the same calculated id.
    4. One of them finishes the insert first.
    5. The other ends up updating the newly inserted row.

    This results in two users having the same profile. You can work around this problem by forcing Django to perform an insert by setting the force_insert keyword argument to True:

    def save(self, *args, **kwargs):
        if not self.pk:
            # Logic for generating userprofile_id without '3'
            last_userprofile = UserProfile.objects.all().order_by('-id').first()
            print(f'last_userprofile: {last_userprofile.id}')
            if last_userprofile:
                last_id = last_userprofile.id
                print(f'last_id: {last_id}')
                next_id = last_id + 1
                print(f'next_id: {next_id}')
                while '3' in str(next_id):
                    next_id += 1
                self.id = next_id
            else:
                self.id = 1
            # Force the save to be an insert, an error is better than two users having same profile
            kwargs["force_insert"] = True
    

    This will result in one of the saves erroring out but that's better than two users having the same profile. If needed you can gracefully handle this by catching the exception and handling it.

    Another option is to actually solve this by correctly utilizing the database and using transactions. Django's default behaviour is to autocommit each query. You can instead either explicitly create a transaction by using transaction.atomic or tie each HTTP request to a transaction by setting ATOMIC_REQUESTS to True. Along with this to ensure that appropriate locks are created in the database use select_for_update on your query to get the last UserProfile object:

    UserProfile.objects.select_for_update().order_by('-id').first()
    

    Note: I don't recommend doing what you are doing, it is best to leave the primary generation up to the database. Treat this only as an interim workaround like you have mentioned and remove it as soon as possible.

    Note: Once you work on removing this workaround the sequence generating the IDs would end up producing invalid IDs. Check this question out to solve that problem.