Search code examples
securityhashtheory

Security: longer keys versus more available characters


I apologize if this has been answered before, but I was not able to find anything. This question was inspired by a comment on another security-related question here on SO:

How to generate a random, long salt for use in hashing?

The specific comment is as follows (sixth comment of accepted answer):

...Second, and more importantly, this will only return hexadecimal characters - i.e. 0-9 and A-F. It will never return a letter higher than an F. You're reducing your output to just 16 possible characters when there could be - and almost certainly are - many other valid characters.

– AgentConundrum Oct 14 '12 at 17:19

This got me thinking. Say I had some arbitrary series of bytes, with each byte being randomly distributed over 2^(8). Let this key be A. Now suppose I transformed A into its hexadecimal string representation, key B (ex. 0xde 0xad 0xbe 0xef => "d e a d b e e f").

Some things are readily apparent:

  • len(B) = 2 len(A)
  • The symbols in B are limited to 2^(4) discrete values while the symbols in A range over 2^(8)
  • A and B represent the same 'quantities', just using different encoding.

My suspicion is that, in this example, the two keys will end up being equally as secure (otherwise every password cracking tool would just convert one representation to another for quicker attacks). External to this contrived example, however, I suspect there is an important security moral to take away from this; especially when selecting a source of randomness.

So, in short, which is more desirable from a security stand point: longer keys or keys whose values cover more discrete symbols?

I am really interested in the theory behind this, so an extra bonus gold star (or at least my undying admiration) to anyone who can also provide the math / proof behind their conclusion.


Solution

  • Let's start with a binary string of length 8. The possible combinations are all permutations from 00000000 and 11111111. This gives us a keyspace of 2^8, or 256 possible keys. Now let's look at option A:

    A: Adding one additional bit. We now have a 9-bit string, so the possible values are between 000000000 and 111111111, which gives us a keyspace size of 2^9, or 512 keys. We also have option B, however.

    B: Adding an additional value to the keyspace (NOT the keyspace size!): Now let's pretend we have a trinary system, where the accepted numbers are 0, 1, and 2. Still assuming a string of length 8, we have 3^8, or 6561 keys...clearly much higher.

    However! Trinary does not exist!

    Let's look at your example. Please be aware I will be clarifying some of it, which you may have been confused about. Begin with a 4-BYTE (or 32-bit) bitstring: 11011110 10101101 10111110 11101111 (this is, btw, the bitstring equivalent to 0xDEADBEEF)

    Since our possible values for each digit are 0 or 1, the base of our exponent is 2. Since there are 32 bits, we have 2^32 as the strength of this key. Now let's look at your second key, DEADBEEF. Each "digit" can be a value from 0-9, or A-F. This gives us 16 values. We have 8 "digits", so our exponent is 16^8...which also equals 2^32! So those keys are equal in strength (also, because they are the same thing).

    But we're talking about REAL passwords, not just those silly little binary things. Consider an alphabetical password with only lowercase letters of length 8: we have 26 possible characters, and 8 of them, so the strength is 26^8, or 208.8 billion (takes about a minute to brute force). Adding one character to the length yields 26^9, or 5.4 trillion combinations: 20 minutes or so. Let's go back to our 8-char string, but add a character: the space character. now we have 27^8, which is 282 billion....FAR LESS than adding an additional character!

    The proper solution, of course, is to do both: for instance, 27^9 is 7.6 trillion combinations, or about half an hour of cracking. An 8-character password using upper case, lower case, numbers, special symbols, and the space character would take around 20 days to crack....still not nearly strong enough. Add another character, and it's 5 years.

    As a reference, I usually make my passwords upwards of 16 characters, and they have at least one Cap, one space, one number, and one special character. Such a password at 16 characters would take several (hundred) trillion years to brute force.