Search code examples
mathpasswordsprobability

Probability of 3-character string appearing in a randomly generated password


If you have a randomly generated password, consisting of only alphanumeric characters, of length 12, and the comparison is case insensitive (i.e. 'A' == 'a'), what is the probability that one specific string of length 3 (e.g. 'ABC') will appear in that password?

I know the number of total possible combinations is (26+10)^12, but beyond that, I'm a little lost. An explanation of the math would also be most helpful.


Solution

  • The string "abc" can appear in the first position, making the string look like this:

    abcXXXXXXXXX
    

    ...where the X's can be any letter or number. There are (26 + 10)^9 such strings.

    It can appear in the second position, making the string look like:

    XabcXXXXXXXX
    

    And there are (26 + 10)^9 such strings also.

    Since "abc" can appear at anywhere from the first through 10th positions, there are 10*36^9 such strings.

    But this overcounts, because it counts (for instance) strings like this twice:

    abcXXXabcXXX
    

    So we need to count all of the strings like this and subtract them off of our total.

    Since there are 6 X's in this pattern, there are 36^6 strings that match this pattern.

    I get 7+6+5+4+3+2+1 = 28 patterns like this. (If the first "abc" is at the beginning, the second can be in any of 7 places. If the first "abc" is in the second place, the second can be in any of 6 places. And so on.)

    So subtract off 28*36^6.

    ...but that subtracts off too much, because it subtracted off strings like this three times instead of just once:

    abcXabcXabcX
    

    So we have to add back in the strings like this, twice. I get 4+3+2+1 + 3+2+1 + 2+1 + 1 = 20 of these patterns, meaning we have to add back in 2*20*(36^3).

    But that math counted this string four times:

    abcabcabcabc
    

    ...so we have to subtract off 3.

    Final answer:

    10*36^9 - 28*36^6 + 2*20*(36^3) - 3
    

    Divide that by 36^12 to get your probability.

    See also the Inclusion-Exclusion Principle. And let me know if I made an error in my counting.