python bcrypt

Why would "$2a" vs "$2b" matter?


I was trying to write a script to add users to a MySQL database. I finally got all of the users into the database, but then the login of my app wouldn't authorize them. I tried a bunch of different solutions and then noticed that the older users' passwords start with "$2a" while the ones I added start with "$2b". So I inserted the code below.

import bcrypt

# Hash with cost factor 11; hashpw returns bytes, so decode to str for storage
password = bcrypt.hashpw(password.encode("UTF-8"), bcrypt.gensalt(11))
password = password.decode("UTF-8")
password = password[:2] + "a" + password[3:]  # Why does this work??

Suddenly I can log in with no problem. So why would "$2a" work and not "$2b"? The web app isn't mine and I can't find the code where it checks the password. If it helps, the web app was written in Java and uses Spring for validation.


Solution

  • Here's Wikipedia on Bcrypt:

    $2$ (1999)

    The original Bcrypt specification defined a prefix of $2$. This follows the Modular Crypt Format [...]

    $2a$

    The original specification did not define how to handle non-ASCII characters, nor how to handle a null terminator. The specification was revised to specify that when hashing strings:

    • the string must be UTF-8 encoded
    • the null terminator must be included

    With this change, the version was changed to $2a$

    $2b$ (February 2014)

    A bug was discovered in the OpenBSD implementation of bcrypt. They were storing the length of their strings in an unsigned char (i.e. 8-bit byte). If a password was longer than 255 characters, it would overflow and wrap at 255.
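
To make the overflow concrete, here is a small sketch of what storing a length in an 8-bit unsigned field does. The helper name `length_as_uint8` is made up for illustration; it only mimics the arithmetic and is not the OpenBSD code.

```python
def length_as_uint8(password: bytes) -> int:
    """Return the length as an 8-bit unsigned field would hold it (hypothetical helper)."""
    # An unsigned 8-bit value keeps only the low 8 bits, i.e. the length modulo 256.
    return len(password) & 0xFF


short_pw = b"x" * 72
long_pw = b"x" * 260

print(length_as_uint8(short_pw))  # 72: fits, nothing is lost
print(length_as_uint8(long_pw))   # 4: 260 wraps around to 260 - 256
```

Any length up to 255 survives unchanged, which is why ordinary passwords were never affected by the bug.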

    You are adding hashes in the new format ($2b), while the program only supports validating the old format ($2a).

    Since the old and new formats produce the same hash for passwords shorter than 256 characters, switching the header works. However, for a password of 256 characters or more the two formats can disagree, so a hash relabeled this way may fail to validate.
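
The slicing trick from the question can be made a little safer by checking the string before rewriting it. The sketch below uses only the standard library; `to_2a_prefix` is a hypothetical name, and it assumes the hash came from `bcrypt.hashpw(...)` and was decoded to `str` as in the question.

```python
import re

# Layout of a bcrypt hash string: $<version>$<2-digit cost>$<22-char salt><31-char digest>
_BCRYPT_RE = re.compile(r"^\$(2[abxy]?)\$(\d{2})\$([./A-Za-z0-9]{53})$")


def to_2a_prefix(hash_str: str) -> str:
    """Relabel a $2b$ hash as $2a$, leaving cost, salt and digest untouched (hypothetical helper)."""
    match = _BCRYPT_RE.match(hash_str)
    if match is None:
        raise ValueError("not a bcrypt hash string")
    version, cost, salt_and_digest = match.groups()
    # Only relabel the versions discussed above; refuse anything else.
    if version not in ("2a", "2b"):
        raise ValueError(f"unexpected bcrypt version: ${version}$")
    return f"$2a${cost}${salt_and_digest}"
```

In the question's script this would replace the slicing line, e.g. `password = to_2a_prefix(password)`. The helper cannot see the original password, so the script would still need to check the password length itself before hashing.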