encryption, aes, pbkdf2

Why brute force the key instead of doing it on the pbkdf2 passphrase itself?


I have created a binary executable from a C++ program that takes a random 32-character password as a command-line argument, converts it to raw bytes, and passes it to an AES 256-bit encryption function to encrypt a few files. But I received a review on an online forum saying that this is bad and that I should consider using PBKDF2 to generate a key from my password and then encrypt the files with that key.

What I couldn't wrap my head around is the purpose of using PBKDF2 if all it does is generate another 32-byte random string, which I have to convert to raw bytes anyway and pass to the function. Everywhere I look on the internet, people advise hashing the password with PBKDF2 and then using the key it generates for cipher operations.

I understand that PBKDF2 generates this kind of pseudorandom key, which is more complex than passwords created by humans. But at the end of the day, suppose the attacker knows how I am encrypting: say he/she knows I am using AES-256 CBC, PBKDF2 with a particular number of iterations, and that the salt and IV are stored along with the ciphertext. If all of these parameters are known, then what's stopping him/her from directly brute forcing the weak password that I pass to the hashing function?

If my PBKDF2 password itself is weak, say "apple123", and it generates a key such as "HfhagG28@₹&(@(/#+₹+", what good does it do? The attacker could simply brute force "apple123", take whichever key that generates, and see if it decrypts.


Solution

  • On the other hand, if I use a strong and proper random Password, I would not need something like PBKDF2 before I encrypt my files isn't it?

    Correct, provided your password were 64 characters long and made up of completely random hexadecimal digits, or about 39 characters long and made up of completely random characters that can be typed on a common keyboard. Most people's "strong" passwords are nowhere near these lengths, and "strong" passwords are almost never random over all reasonably typeable characters.
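    The two lengths above follow from asking how many uniformly random characters are needed to reach 256 bits. A minimal sketch of that arithmetic, assuming "typeable on a common keyboard" means the 95 printable ASCII characters (that alphabet size, and the helper names, are assumptions for illustration):

```python
import math

def bits_of_entropy(length: int, alphabet_size: int) -> float:
    """Entropy in bits of a password whose characters are chosen uniformly at random."""
    return length * math.log2(alphabet_size)

def min_length_for(bits: float, alphabet_size: int) -> int:
    """Smallest length at which a uniformly random password reaches `bits` of entropy."""
    return math.ceil(bits / math.log2(alphabet_size))

HEX_DIGITS = 16        # 0-9 and a-f
PRINTABLE_ASCII = 95   # assumption: printable ASCII, including space

print(bits_of_entropy(64, HEX_DIGITS))       # 256.0
print(min_length_for(256, HEX_DIGITS))       # 64
print(min_length_for(256, PRINTABLE_ASCII))  # 39

# An 8-character password over lowercase letters and digits tops out near
# 41 bits even if it were fully random, which "apple123" is not.
print(round(bits_of_entropy(8, 36), 1))      # 41.4
```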

    The output of PBKDF2("apple123") is nowhere near as trivial as HfhagG28@₹&(@(/#+₹+. You're correct that if it were that trivial (~23 non-random bytes), it wouldn't be particularly useful. But it isn't. It outputs 32 bytes that are indistinguishable from random. 2^256 is much, much, much larger than 2^184, even if the output you're describing were fully random. The fact that ₹, +, (, and @ all appear twice in fewer than two dozen characters suggests this isn't very random. And the fact that it's printable in UTF-8 at all means it's drawn from a tiny space compared to a full AES key. The overwhelming majority of 32-byte sequences are not valid UTF-8.
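    To see what the derived key actually looks like, here is a small sketch using Python's standard `hashlib.pbkdf2_hmac`. The hash (SHA-256), the 16-byte salt, and the iteration count are assumptions chosen for illustration, not values taken from the question:

```python
import hashlib
import os

def derive_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte (AES-256 sized) key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations, dklen=32)

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes happen to decode as UTF-8 text."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# The example "key" from the question is 19 characters, 23 bytes in UTF-8,
# so at most 184 bits even if it were random.
example = "HfhagG28@₹&(@(/#+₹+"
print(len(example.encode("utf-8")) * 8, "bits at most")

salt = os.urandom(16)  # assumption: a random 16-byte salt stored with the ciphertext
key = derive_key("apple123", salt)
print(len(key) * 8, "bit key:", key.hex())
print("decodes as UTF-8:", is_valid_utf8(key))
```

    The derived key is raw bytes, which is why it is shown in hex here; it is passed to the cipher as bytes, not as a printable string.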

    Brute forcing the PBKDF2 space is more expensive than brute forcing the underlying space by orders of magnitude. Against strong passwords, you're better off brute forcing the key rather than the password. (Not that brute forcing the key is possible.)

    But you are correct that for a very weak password, it is certainly possible to guess it in reasonable time, even with PBKDF2. PBKDF2 can't protect the weakest of passwords. But it can make reasonably strong passwords effectively as good as truly random passwords, and that's the point.
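    The attack described in the question can be sketched as a dictionary search in which every guess pays the full PBKDF2 cost. This is only an illustration: the word list, the fixed salt, and the iteration count are made up, and comparing derived keys stands in for the real check an attacker would do (attempting decryption and testing whether the plaintext makes sense).

```python
import hashlib
import time

ITERATIONS = 100_000  # assumption: known to the attacker, as in the question
SALT = bytes(16)      # fixed only to keep the sketch reproducible; use a random salt in practice

def derive_key(password: str) -> bytes:
    """PBKDF2-HMAC-SHA256, 32-byte output, using the parameters above."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), SALT, ITERATIONS, dklen=32)

def dictionary_attack(target_key: bytes, candidates):
    """Return the first candidate password that derives target_key, else None."""
    for guess in candidates:
        if derive_key(guess) == target_key:
            return guess
    return None

wordlist = ["password", "letmein", "qwerty", "apple123", "dragon"]  # hypothetical guesses
target = derive_key("apple123")

start = time.perf_counter()
found = dictionary_attack(target, wordlist)
elapsed = time.perf_counter() - start

print("recovered:", found)
print(f"time per guess: {elapsed / (wordlist.index(found) + 1):.4f} s")
```

    A weak password still falls, because it sits in a short list of likely guesses. The per-guess time is what PBKDF2 adds: multiplied across the far larger number of guesses needed for a reasonably strong password, it is what pushes the search out of reach.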