
Cryptographically random unique strings


In this answer, the code below was posted for creating unique random alphanumeric strings. Could someone clarify how exactly the strings are ensured to be unique in this code, and to what extent they are unique? If I rerun this method on different occasions, would I still get unique strings?

Or did I just misunderstand the reply and these are not generating unique keys at all, only random?

I already asked this in a comment to that answer but the user seems to be inactive.

    public static string GetUniqueKey()
    {
        int maxSize = 8;
        char[] chars = new char[62];
        string a;
        a = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890";
        chars = a.ToCharArray();
        int size = maxSize;
        byte[] data = new byte[1];
        RNGCryptoServiceProvider crypto = new RNGCryptoServiceProvider();
        crypto.GetNonZeroBytes(data);
        size = maxSize;
        data = new byte[size];
        crypto.GetNonZeroBytes(data);
        StringBuilder result = new StringBuilder(size);
        foreach (byte b in data)
        { result.Append(chars[b % (chars.Length - 1)]); }
        return result.ToString();
    }   

Solution

  • Nothing in the code guarantees that the result is unique. To get a unique value you either have to keep all previous values so that you can check for duplicates, or use much longer codes so that duplicates are practically impossible (e.g. a GUID). The string carries fewer than 48 bits of information (8 characters drawn from only 61 distinct symbols), which is far less than the 128 bits of a GUID.
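    As a rough check on those figures, assume each of the 8 characters were drawn uniformly from the 61 symbols that can actually occur (a slight overestimate, since the real distribution is skewed). The information content, the number of possible strings, and the birthday-bound estimate of how many keys give a 50% chance of at least one duplicate then work out as:

```latex
\begin{align*}
H &= 8 \log_2 61 \approx 8 \times 5.93 \approx 47.4 \text{ bits} \\
N &= 61^{8} \approx 1.9 \times 10^{14} \text{ possible strings} \\
n_{50\%} &\approx 1.18 \sqrt{N} \approx 1.6 \times 10^{7} \text{ keys}
\end{align*}
```

    So roughly 16 million generated keys already give an even chance of a collision. The same estimate for a 128-bit value gives about 2.2 × 10^19 keys.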

    The string is just random, and although a cryptographic-strength random generator is used, its quality is undermined by how the string is built from the random data. There are several issues in the code:

    • A char array is created and then immediately thrown away and replaced with another.
    • A one-byte array of random data is created for no apparent reason, as it is not used for anything.
    • The GetNonZeroBytes method is used instead of the GetBytes method, which skews the distribution of characters because the code does nothing to compensate for the missing zero values.
    • The modulo (%) operator is used to map each random byte onto a character index, but the number of possible byte values is not a multiple of the number of characters, which also skews the distribution of characters.
    • chars.Length - 1 is used instead of chars.Length when the number is reduced, which means that only 61 of the predefined 62 characters can occur in the string.

    Although those issues are minor, they are important when you are dealing with crypto-strength randomness.
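    The combined effect of the last three issues can be made concrete by counting, for every byte value that GetNonZeroBytes can return (1 to 255), which character index the expression b % (chars.Length - 1) selects. The following is a minimal sketch; the class and method names (BiasDemo, CountIndexHits) are made up for illustration and are not part of the original code.

```csharp
using System;

public static class BiasDemo
{
    // Counts how many of the byte values 1..255 map to each index of the
    // 62-character array when reduced with "% (length - 1)", as the
    // questioned code does.
    public static int[] CountIndexHits()
    {
        const int alphabetLength = 62;
        int[] hits = new int[alphabetLength];

        // GetNonZeroBytes never yields 0, so only 1..255 are possible.
        for (int b = 1; b <= 255; b++)
        {
            hits[b % (alphabetLength - 1)]++;
        }

        return hits;
    }
}
```

    Since 255 = 4 × 61 + 11, indices 1 to 11 (the characters 'b' to 'l') are each selected by 5 byte values, the remaining 50 reachable indices by 4 each, and index 61 (the final '0') by none. The favoured characters are therefore 25% more likely than the others.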

    Here is a version of the code that avoids those issues and returns a value with enough information (128 bits) to be considered practically unique:

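    // Requires: using System; using System.Security.Cryptography;
    public static string GetUniqueKey() {
      // 16 random bytes = 128 bits, the same size as a GUID.
      byte[] data = new byte[16];
      // Dispose the provider once the bytes have been read.
      using (RNGCryptoServiceProvider crypto = new RNGCryptoServiceProvider()) {
        crypto.GetBytes(data);
      }
      // Hex-encode ("3F-A2-..." becomes "3FA2..."), giving 32 characters.
      return BitConverter.ToString(data).Replace("-", String.Empty);
    }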
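    If an alphanumeric string is preferred over hexadecimal output, one possible approach (not part of the code above) is rejection sampling: discard any random byte that falls outside the largest multiple of 62 below 256, so that the modulo step no longer favours some characters. The sketch below is only an illustration under that assumption. The names KeyGenerator and GetAlphanumericKey and the default length of 22 characters (about 131 bits) are hypothetical choices.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class KeyGenerator
{
    // 62 symbols: lower case, upper case, digits.
    private const string Alphabet =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    public static string GetAlphanumericKey(int length = 22)
    {
        if (length <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(length));
        }

        // Largest multiple of 62 that fits in a byte's range: 248.
        // Bytes >= 248 are rejected so every symbol is equally likely.
        int limit = 256 - (256 % Alphabet.Length);

        StringBuilder result = new StringBuilder(length);
        byte[] buffer = new byte[length * 2];

        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            while (result.Length < length)
            {
                rng.GetBytes(buffer);
                foreach (byte b in buffer)
                {
                    if (b >= limit)
                    {
                        continue; // rejected: would introduce modulo bias
                    }

                    result.Append(Alphabet[b % Alphabet.Length]);
                    if (result.Length == length)
                    {
                        break;
                    }
                }
            }
        }

        return result.ToString();
    }
}
```

    Calling KeyGenerator.GetAlphanumericKey() returns a 22-character string. As with the hexadecimal version, uniqueness is only a matter of overwhelming probability, not a guarantee.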