PBKDF2 is not returning plaintext and hashed expected value in C#


I'm trying to use PBKDF2 in C# to encode a master password, and then to recover the original password from the encoded value.

var masterPwd = "masterPassword";
var service = "www.google.com";
byte[] salt = CreateSalt(16);
var encodedPwd = CreateMasterPassword(masterPwd, salt);
var decoded = CreateMasterPassword(encodedPwd, salt);

With the following functions defined:

        public static byte[] CreateSalt(int size)
        {
            var salt = new byte[size];
            using (var random = new RNGCryptoServiceProvider())
            {
                random.GetNonZeroBytes(salt);
            }

            return salt;
        }

        public static string CreateMasterPassword(string password, byte[] salt)
        {
            string PassHash = Convert.ToBase64String(KeyDerivation.Pbkdf2(
                password: password,
                salt: salt,
                prf: KeyDerivationPrf.HMACSHA256,
                iterationCount: 10000,
                numBytesRequested: 256 / 8));
            return PassHash;
        }

In this case, shouldn't decoded be the same as masterPwd?


Solution

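  • I think you have a bit of a misunderstanding about what PBKDF2 does. It is not an encryption function from which you can ever recover the plaintext data (let's put brute force aside as it is not an 'intended use'). Rather, it is a "slow" hashing mechanism, often described as "one way". In your code, the second call to CreateMasterPassword does not reverse the first one; it hashes the Base64 string in encodedPwd, so decoded is simply a second, unrelated hash.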

    PBKDF2 is a key derivation function, but is also used for storing passwords. Here's a typical flow for PBKDF2 when used for password storage.

    1. A user creates an account on a website and chooses a password. The site generates a random salt, applies PBKDF2 to the password with that salt, and stores both the result and the salt. The salt is stored in plain text.
    2. When the user needs to log in again, the site asks for the username and password. It looks up the salt for that user, then re-applies PBKDF2 to the password the user entered.
    3. It compares the stored hash with the hash of what the user entered. If the hashes are equal, the site knows the user typed the password correctly.

    This approach means the site never stores the password itself, nor anything from which it could recover the password. This allows the site to disavow knowledge of the password.

    If that is what you want to do, then that is how you should use it.
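
    As a rough sketch of that flow, and not the only way to write it: the code below swaps the KeyDerivation.Pbkdf2 call from the question for the built-in Rfc2898DeriveBytes.Pbkdf2 so that it needs no extra package, and assumes .NET 6 or later. The PasswordStore class and its method names are hypothetical, made up for this illustration; the iteration count and output size are copied from the question.

    ```csharp
    using System;
    using System.Security.Cryptography;

    // Hypothetical helper illustrating the store-then-verify flow.
    public static class PasswordStore
    {
        private const int SaltSize = 16;       // bytes, as in the question
        private const int HashSize = 256 / 8;  // bytes, as in the question
        private const int Iterations = 10000;  // as in the question; tune for your hardware

        // Step 1: derive a hash from the new password and a fresh random salt.
        // Both values are stored; the password itself is not.
        public static (byte[] Salt, byte[] Hash) HashPassword(string password)
        {
            byte[] salt = RandomNumberGenerator.GetBytes(SaltSize);
            byte[] hash = Rfc2898DeriveBytes.Pbkdf2(
                password, salt, Iterations, HashAlgorithmName.SHA256, HashSize);
            return (salt, hash);
        }

        // Steps 2 and 3: re-apply PBKDF2 to the login attempt with the stored salt,
        // then compare the two hashes in constant time.
        public static bool VerifyPassword(string attempt, byte[] salt, byte[] storedHash)
        {
            byte[] attemptHash = Rfc2898DeriveBytes.Pbkdf2(
                attempt, salt, Iterations, HashAlgorithmName.SHA256, HashSize);
            return CryptographicOperations.FixedTimeEquals(attemptHash, storedHash);
        }
    }
    ```

    With this, PasswordStore.VerifyPassword("masterPassword", salt, hash) returns true for the salt and hash produced by PasswordStore.HashPassword("masterPassword"), and false for any other password. Nothing in it ever turns the hash back into the password.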

    If you do need a "two way" algorithm, then you are moving from hashing to encryption. A symmetric algorithm would be used here, with all of the troublesome issues of key and IV management. You would most likely want to look at a high-level abstraction built on top of symmetric ciphers, such as libsodium.

    libsodium is a nice abstraction built on top of primitives that takes the guesswork out of how to use them. It offers simple APIs such as "encrypt this thing with this password"; it correctly derives an encryption key from the password, performs some form of authentication on the encryption, and is well regarded by information security experts.
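
    To make "derive a key from the password, then encrypt with authentication" a little more concrete, here is a minimal sketch using only the .NET base class library. It is not libsodium's API and not a substitute for a vetted library. It assumes .NET 8 or later (for the AesGcm constructor that takes a tag size) and a platform where AesGcm is supported. The PasswordEncryption class, its method names, and the salt || nonce || tag || ciphertext layout are all choices made up for this illustration.

    ```csharp
    using System;
    using System.Security.Cryptography;
    using System.Text;

    // Hypothetical helper: password-based authenticated encryption with PBKDF2 + AES-GCM.
    public static class PasswordEncryption
    {
        private const int SaltSize = 16;      // bytes
        private const int NonceSize = 12;     // bytes, the usual AES-GCM nonce size
        private const int TagSize = 16;       // bytes
        private const int KeySize = 32;       // bytes, AES-256
        private const int Iterations = 10000; // as in the question; tune for your hardware

        // PBKDF2 in its key-derivation role: password + salt -> encryption key.
        private static byte[] DeriveKey(string password, byte[] salt) =>
            Rfc2898DeriveBytes.Pbkdf2(
                password, salt, Iterations, HashAlgorithmName.SHA256, KeySize);

        // Returns salt || nonce || tag || ciphertext (an assumed layout).
        public static byte[] Encrypt(string plaintext, string password)
        {
            byte[] salt = RandomNumberGenerator.GetBytes(SaltSize);
            byte[] nonce = RandomNumberGenerator.GetBytes(NonceSize);
            byte[] plain = Encoding.UTF8.GetBytes(plaintext);
            byte[] cipher = new byte[plain.Length];
            byte[] tag = new byte[TagSize];

            using (var aes = new AesGcm(DeriveKey(password, salt), TagSize))
            {
                aes.Encrypt(nonce, plain, cipher, tag);
            }

            byte[] output = new byte[SaltSize + NonceSize + TagSize + cipher.Length];
            salt.CopyTo(output, 0);
            nonce.CopyTo(output, SaltSize);
            tag.CopyTo(output, SaltSize + NonceSize);
            cipher.CopyTo(output, SaltSize + NonceSize + TagSize);
            return output;
        }

        // Throws CryptographicException if the password is wrong or the data was altered.
        public static string Decrypt(byte[] data, string password)
        {
            byte[] salt = data[..SaltSize];
            byte[] nonce = data[SaltSize..(SaltSize + NonceSize)];
            byte[] tag = data[(SaltSize + NonceSize)..(SaltSize + NonceSize + TagSize)];
            byte[] cipher = data[(SaltSize + NonceSize + TagSize)..];
            byte[] plain = new byte[cipher.Length];

            using (var aes = new AesGcm(DeriveKey(password, salt), TagSize))
            {
                aes.Decrypt(nonce, cipher, tag, plain);
            }

            return Encoding.UTF8.GetString(plain);
        }
    }
    ```

    Unlike the hashing flow, this one is reversible: PasswordEncryption.Decrypt returns the original text when given the same password, and fails the authentication check otherwise. The amount of bookkeeping even in this small sketch (salt, nonce, tag, layout) is a fair picture of why a higher-level library is usually the better choice.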