Django running migrations on legacy database


I am working with a legacy database and have created a custom user model. I am now setting up registration and authentication. I have created the user manager, and in the user model I have added some fields that Django needs, such as is_staff, is_active and date_joined. When I run migrations, the legacy table still does not have the columns I added to the model. Should Django actually alter the legacy database?

class TbUser(AbstractBaseUser, PermissionsMixin):
    id = models.CharField(primary_key=True, max_length=40)
    usname = models.CharField(max_length=40, blank=True, null=True, unique=True)
    psword = models.CharField(max_length=255, blank=True, null=True)
  
    # added columns
    is_staff = models.BooleanField(default=False)
    is_active = models.BooleanField(default=True)
    date_joined = models.DateTimeField(default=timezone.now)

    objects = TbUserManager()

    USERNAME_FIELD = 'usname'
    REQUIRED_FIELDS = []

    class Meta:
        managed = False
        db_table = 'tb_user'

Also, when I create the superuser, I get the following error:

django.db.utils.OperationalError: (1054, "Unknown column 'tb_user.password' in 'field list'")

This happens even though the user manager looks like this:

class TbUserManager(BaseUserManager):
    
    def create_user(self, email, psword=None, **kwargs):
        if not email:
            raise ValueError('Users must have a valid email address.')
        if not kwargs.get('usname'):
            raise ValueError('Users must have a valid username.')
        user = self.model(
            email=self.normalize_email(email), usname=kwargs.get('usname')
        )
        user.psword(psword)
        user.save()
        return user

    def create_superuser(self, email, psword, **kwargs):
        user = self.create_user(email, psword, **kwargs)

        user.is_superuser = True
        user.save()

        return user

I really don't know where the error gets tb_user.password from, because I have renamed everything to psword.

If you need some details feel free to ask.

EDIT:

I found that the password error is caused by the model naming the field psword. Is there a way to tell Django that this is the password field, e.g. USER_PASSWORD='psword'?


Solution

  • The error occurs because AbstractBaseUser already defines a password field (as well as a last_login field), so Django queries a password column that your table does not have. If you want to use Django's authentication, the model has to provide that field for it to work properly.

    You'll be better off converting the legacy database into a form compatible with the Django auth system.

    The second issue you'll probably face is how Django stores passwords. As you already have existing data, user passwords are probably already stored in the database in some form. You can handle this in three different ways.

    1. Convert your passwords into a form Django supports.

    If your legacy system stored all passwords in plain text, or used a method already supported by Django, you can convert them directly into one of Django's formats.

    I recommend using only one of the hashers enabled by default, or choosing another solution from the ones described below.

    2. Create your own password hasher that will encapsulate your existing passwords inside one enabled by default in Django.

    For example, if your legacy method uses md5 to hash the passwords, you can write your own password hasher that takes the existing encoded password and wraps it inside one of the existing password hashers (like django.contrib.auth.hashers.PBKDF2PasswordHasher).

    Then, your settings may look like:

    PASSWORD_HASHERS = [
        'django.contrib.auth.hashers.PBKDF2PasswordHasher',
        'django.contrib.auth.hashers.PBKDF2SHA1PasswordHasher',
        'django.contrib.auth.hashers.Argon2PasswordHasher',
        'django.contrib.auth.hashers.BCryptSHA256PasswordHasher',
        'your_auth_app.hashers.PBKDF2LegacyPasswordHasher',
    ]
    

    With that configuration, Django will know how to validate the password provided by the user, and it will also automatically re-encode the password with the first hasher in the list when the user logs in (at that point Django has access to the plaintext password, so it can encode it on its own).

    3. Force every user to reset the password before accessing the new system.

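    Simply add a ! character in front of the old password hash, so Django treats it as unusable and it cannot be used to log in. Note that Django's built-in password reset form does not send reset emails to users whose password is marked unusable, so for this option you would need to customize the reset flow (for example by overriding PasswordResetForm.get_users()) or hand out new passwords some other way.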


    As you've provided an example of the existing password hash from your old system, I'm adding an example more relevant to your case.

    $2a at the beginning of the password hash indicates bcrypt. How exactly the hash is computed in your old system can't be determined from the hash alone and has to be looked up in the code of that system, because bcrypt may be applied on top of another algorithm or some other password manipulation scheme.

    As bcrypt itself is secure enough, you don't need to wrap this type of hash in another layer of hashing. You may still have to implement your own password hasher, though, if your previous system does anything other than pass the password directly through bcrypt, or first through sha256 and then through bcrypt.

    If it is hard to determine this by reading the code of the old system, or you can't do it at all, you can check it by trial and error. To check for the two variants that are built into Django, first install the required libraries with pip install "django[bcrypt]".

    Next, make sure both hashers you're about to test are enabled in your Django settings, preferably by setting PASSWORD_HASHERS to:

    PASSWORD_HASHERS = [
        'django.contrib.auth.hashers.PBKDF2PasswordHasher',
        'django.contrib.auth.hashers.PBKDF2SHA1PasswordHasher',
        'django.contrib.auth.hashers.Argon2PasswordHasher',
        'django.contrib.auth.hashers.BCryptSHA256PasswordHasher',
        'django.contrib.auth.hashers.BCryptPasswordHasher',
    ]
    

    Next, create a new user account in your old system (or take any existing one for which you know the password), copy its password hash and add a bcrypt$ prefix in front of it. After this modification the hash should start with bcrypt$$2a$ (note the double $ symbol). With that prepared hash, open the Django shell using ./manage.py shell and execute:

    from django.contrib.auth.hashers import check_password
    
    check_password('your known password', 'your_modified_password_hash')
    

    If this function returns True, your old system used plain bcrypt. If it returns False, replace the bcrypt prefix with bcrypt_sha256 and run check_password again. If that succeeds, your old system passes the password through sha256 before passing it to bcrypt.

    If either test succeeded, all you need to do is add the bcrypt$ or bcrypt_sha256$ prefix, depending on your test results, to all password hashes from the old system. If neither worked, you need to look at the code of your old system to determine the exact hashing method.

    If your old system was using bcrypt with sha256 (and all you need to do with the old data is add the bcrypt_sha256$ prefix), I recommend reverting PASSWORD_HASHERS to the default value. If you want to keep using the old hashing method, you can move it to the top of the PASSWORD_HASHERS list, so Django will create any new account using this method and won't convert the old password hashes to another method when users log in.
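Adding the prefix is plain string manipulation, so it can be done in a data migration, a one-off script or an SQL UPDATE. As a rough sketch, with `to_django_hash` being a made-up helper name and the sample hash a dummy placeholder rather than real data:

```python
def to_django_hash(legacy_hash: str, algorithm: str = "bcrypt") -> str:
    """Prefix a raw bcrypt hash with the Django algorithm identifier."""
    if algorithm not in {"bcrypt", "bcrypt_sha256"}:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    if not legacy_hash.startswith("$2"):
        # Leave anything that does not look like bcrypt for manual review.
        raise ValueError("not a bcrypt hash")
    return f"{algorithm}${legacy_hash}"


# Dummy value with the shape of a bcrypt hash (prefix, cost, 53 characters).
legacy = "$2a$10$" + "x" * 53
converted = to_django_hash(legacy)
```

The result starts with `bcrypt$$2a$`, which is the double `$` form described above.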
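If you go with the second choice and keep bcrypt with sha256 as the preferred method, the setting might look like the fragment below. The order of the remaining entries is an assumption here. Only the first entry is used to encode new passwords, the others are kept so that existing hashes in those formats can still be verified.

```python
# settings.py fragment: the first hasher encodes new and upgraded passwords.
PASSWORD_HASHERS = [
    "django.contrib.auth.hashers.BCryptSHA256PasswordHasher",
    "django.contrib.auth.hashers.PBKDF2PasswordHasher",
    "django.contrib.auth.hashers.PBKDF2SHA1PasswordHasher",
    "django.contrib.auth.hashers.Argon2PasswordHasher",
]
```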


    A side note about the BCryptSHA256PasswordHasher. Django includes it in the default password hashers because plain bcrypt has a maximum input length of 72 bytes and ignores anything beyond that. Passing the password through SHA256 first ensures that the input to the bcrypt algorithm never exceeds that limit, regardless of the length of the password provided by the user.
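The effect of the pre-hash can be illustrated with the standard library alone. This is a sketch of the idea, not Django's exact code: `prehash` is a made-up helper, and the hex encoding of the digest is an assumption about how the value is fed to bcrypt.

```python
import hashlib

BCRYPT_MAX_BYTES = 72  # bcrypt ignores input beyond this length


def prehash(password: str) -> bytes:
    """Reduce a password of any length to a fixed-size SHA-256 hex digest."""
    return hashlib.sha256(password.encode()).hexdigest().encode()


long_password = "correct horse battery staple " * 10  # 290 bytes
digest = prehash(long_password)

print(len(long_password.encode()) > BCRYPT_MAX_BYTES)  # the raw password is too long
print(len(digest))  # the digest has the same length for any input
```

Since the hex digest is always 64 bytes long, it fits within the bcrypt limit, and two long passwords that differ only after byte 72 still end up with different bcrypt inputs.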