How does CryptContext hashing know what secret to use?


I have the following code snippet:

from passlib.context import CryptContext

pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
pwd_context.hash(password)

Which is described here.

What I don't understand is how this can be secure if it returns the same hashed password every time, without taking some other value (a secret_key, for example) into account when hashing the password.


Solution

  • Your assumption that it returns the same hashed password all the time without considering another "secret" (well, it's not really secret) is wrong; you'll see this if you run pwd_context.hash multiple times:

    >>> from passlib.context import CryptContext
    >>>
    >>> pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
    >>> pwd_context.hash("test")
    '$2b$12$0qdOrAMoK7dgySjmNbyRpOggbk.IM2vffMh8rFoITorRKabyFiElC'
    >>> pwd_context.hash("test")
    '$2b$12$gqaNzwTmjAQbGW/08zs4guq1xWD/g7JkWtKqE2BWo6nU1TyP37Feq'
    

    These two hashes are, as you can see, not the same - even when given the same password. So what's actually going on?

    When you don't give hash an explicit salt (the secret "key" you're talking about) one will be generated for you by passlib. It's worth pointing out that hashing is NOT the same as encryption, so there is no key to talk about. Instead you'll see salt mentioned, which is a clear text value that is used to make sure that the same password hashed twice will give different results (since you're effectively hashing salt + password instead).

    So why do we get two different values? Each call generates a new salt, and that salt is stored as part of the result. The fields are separated by $: 2b identifies the bcrypt variant, 12 means 12 rounds, and the final string is the value stored for the password (the salt followed by the resulting bcrypt hash). The first 22 characters of that final string are the salt in plain text.
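
To make the field layout concrete, here is a minimal sketch that splits one of the hashes above into its parts using only string operations. The helper name split_bcrypt_hash and the BcryptFields tuple are made up for this illustration and are not part of passlib.

```python
from typing import NamedTuple


class BcryptFields(NamedTuple):
    ident: str     # variant identifier, e.g. "2b"
    rounds: int    # the rounds (cost) setting, e.g. 12
    salt: str      # first 22 characters after the last "$"
    checksum: str  # remaining 31 characters: the actual hash output


def split_bcrypt_hash(hash_string: str) -> BcryptFields:
    """Hypothetical helper: split a bcrypt hash string into its fields."""
    # The string starts with "$", so the first element of the split is empty.
    _, ident, rounds, salt_and_checksum = hash_string.split("$")
    return BcryptFields(
        ident=ident,
        rounds=int(rounds),
        salt=salt_and_checksum[:22],
        checksum=salt_and_checksum[22:],
    )


fields = split_bcrypt_hash(
    "$2b$12$gqaNzwTmjAQbGW/08zs4guq1xWD/g7JkWtKqE2BWo6nU1TyP37Feq"
)
print(fields)
```

Everything needed to recompute the hash later (variant, rounds and salt) travels with the stored string, which is why no separate secret has to be supplied.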

    You can see this if you give bcrypt a salt instead of letting it generate one. The last character has to be one of [.Oeu] to match the expected bit padding of some bcrypt implementations (passlib will otherwise throw an error or a warning), and the other characters have to match the regex character class [./A-Za-z0-9]:

    >>> pwd_context.hash("test", salt="a"*21 + "e")
    '$2b$12$aaaaaaaaaaaaaaaaaaaaaehsFuAEeaAnjmdgkAxYfzHEipCaNQ0ES'
            ^--------------------^
    

    If we explicitly give the same salt, the result should be the same (and this is how you can verify the password later):

    >>> pwd_context.hash("test", salt="a"*21 + "e")
    '$2b$12$aaaaaaaaaaaaaaaaaaaaaehsFuAEeaAnjmdgkAxYfzHEipCaNQ0ES'
    >>> pwd_context.hash("test", salt="a"*21 + "e")
    '$2b$12$aaaaaaaaaaaaaaaaaaaaaehsFuAEeaAnjmdgkAxYfzHEipCaNQ0ES'
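
The verification step can be sketched with the standard library alone. Note the substitution: this uses PBKDF2-HMAC-SHA256 from hashlib as a stand-in for bcrypt, because the mechanics are the same (the salt is stored next to the hash and reused when checking). The function names, the "salt$hash" record format and the iteration count are assumptions made for this illustration; they are not how passlib stores or verifies bcrypt hashes internally.

```python
import hashlib
import hmac
import os
from typing import Optional

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: Optional[bytes] = None) -> str:
    """Return 'salt_hex$digest_hex'; a fresh random salt is used if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-hash the candidate with the stored salt and compare the results."""
    salt_hex, _ = stored.split("$")
    recomputed = hash_password(password, bytes.fromhex(salt_hex))
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(recomputed, stored)


stored = hash_password("test")            # different on every call (random salt)
print(verify_password("test", stored))    # same salt + same password -> match
print(verify_password("wrong", stored))   # same salt + other password -> no match
```

With passlib you would normally not write this yourself; the context object exposes a verify method that does the equivalent for its own hash formats.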
    

    The same is the case for the previous hashes:

    >>> pwd_context.hash("test")
    '$2b$12$gqaNzwTmjAQbGW/08zs4guq1xWD/g7JkWtKqE2BWo6nU1TyP37Feq'
            ^--------------------^
    

    This is the actual generated salt, which is then used together with test to create the actual hash:

    >>> pwd_context.hash("test")
    '$2b$12$gqaNzwTmjAQbGW/08zs4guq1xWD/g7JkWtKqE2BWo6nU1TyP37Feq'
                                  ^-----------------------------^
    

    So why do we use this salt when it's clearly visible to everyone? It makes it impossible to just scan through a list of hashes for known hashes: since test in your list will have a different value than test in the list you're comparing it to (because of the different salts), you have to actually run each guessed password together with the stored salt through the hashing algorithm. bcrypt is explicitly designed to make that process take time, so you'll spend far longer trying to crack a password than you would scanning a list of 200 million passwords and searching for a known hash in a database.

    It'll also make sure that two users with the same password won't receive the same password hash, so you can't quickly determine weak passwords by looking for password hashes that repeat among multiple users (or try to determine whether two users are the same individual because they have the same password).
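
The effect of the salt on lookup attacks can be shown with a small, self-contained sketch. It deliberately uses a toy hash (plain SHA-256 over salt + password) instead of bcrypt so that it runs with the standard library only; plain SHA-256 is far too fast for real password storage. All names and the password list are invented for the example.

```python
import hashlib
import os


def toy_hash(password: str, salt: bytes = b"") -> str:
    # Illustration only: a fast stand-in for bcrypt, NOT suitable for real use.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


common_passwords = ["123456", "password", "test", "qwerty"]

# An attacker's precomputed table: unsalted hash -> password.
lookup_table = {toy_hash(p): p for p in common_passwords}

# Without a salt, identical passwords give identical hashes,
# and a single dictionary lookup recovers the password.
unsalted_alice = toy_hash("test")
unsalted_bob = toy_hash("test")
cracked_by_lookup = lookup_table.get(unsalted_alice)

# With a random per-user salt, the same password hashes differently
# for each user, and the precomputed table no longer contains it.
salt_alice, salt_bob = os.urandom(16), os.urandom(16)
salted_alice = toy_hash("test", salt_alice)
salted_bob = toy_hash("test", salt_bob)
found_in_table = salted_alice in lookup_table


def guess_with_salt(stored: str, salt: bytes, candidates: list) -> object:
    # The attacker must redo the hashing for every guess and every salt.
    for candidate in candidates:
        if toy_hash(candidate, salt) == stored:
            return candidate
    return None


recovered = guess_with_salt(salted_alice, salt_alice, common_passwords)
print(cracked_by_lookup, found_in_table, recovered)
```

The salted password can still be guessed, but only by hashing each candidate again for that specific salt, and that per-guess work is exactly what bcrypt makes expensive.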

    So what do you do when computers get even faster? You increase the 12 parameter (the rounds). This increases the runtime of the hashing algorithm, hopefully keeping the hashes safe for even longer (you can experiment with the rounds parameter passed to hash).
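
As a rough guide to what changing that number does: bcrypt's rounds value is logarithmic, so the underlying work is proportional to 2 raised to the rounds setting. The arithmetic below is a sketch with made-up helper names; actual timings depend on your hardware and bcrypt backend.

```python
def bcrypt_iterations(rounds: int) -> int:
    # The rounds (cost) value is an exponent: work scales with 2 ** rounds.
    return 2 ** rounds


def relative_cost(old_rounds: int, new_rounds: int) -> float:
    # How many times more work the new setting needs compared to the old one.
    return bcrypt_iterations(new_rounds) / bcrypt_iterations(old_rounds)


print(bcrypt_iterations(12))   # 4096
print(relative_cost(12, 13))   # 2.0: one extra round doubles the work
print(relative_cost(12, 14))   # 4.0
```

Each step up therefore roughly doubles the time per hash, for legitimate logins and for an attacker's guesses alike.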