python puzzle combinatorics discrete-mathematics dice

Combinatorics Counting Puzzle: Roll 20, 8-sided dice, what is the probability of getting at least 5 dice of the same value

Assume a game in which one rolls 20, 8-sided die, for a total number of 8^20 possible outcomes. To calculate the probability of a particular event occurring, we divide the number of ways that event can occur by 8^20.

One can calculate the number of ways to get exactly 5 dice of the value 3. (20 choose 5) gives us the number of orders of 3. 7^15 gives us the number of ways we can not get the value 3 for 15 rolls.

number of ways to get exactly 5, 3's = (20 choose 5)*7^15.

The answer can also be viewed as how many ways can I rearrange the string 3,3,3,3,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 (20 choose 5) times the total number of values we the zero's (assuming 7 legal values) 7^15 (is this correct).

Question 1: How can I calculate the number of ways to get exactly 5 dice of the same value(That is, for all die values). Note: if I just naively use my first answer above and multiply bt 8, I get an enormous amount of double counting?

I understand that I could solve for each of the cases (5 1's), (5, 2's), (5, 3's), ... (5's, 8) sum them (more simply 8*(5 1's) ). Then subtract the sum of number of overlaps (5 1's) and (5 2's), (5 1's) and (5 3's)... (5 1's) and (5, 2's) and ... and (5, 8's) but this seems exceedingly messy. I would a generalization of this in a way that scales up to large numbers of samples and large numbers of classes.
How can I calculate the number of ways to get at least 5 dice of the same value?

So 111110000000000000000 or 11110100000000000002 or 11111100000001110000 or 11011211222222223333, but not 00001111222233334444 or 000511512252363347744.

I'm looking for answers which either explain the math or point to a library which supports this (esp python modules). Extra points for detail and examples.

Solution

Double counting can be solved by use of the Inclusion/Exclusion Principle

I suspect it comes out to:

Choose(8,1)*P(one set of 5 Xs) 
- Choose(8,2)*P(a set of 5 Xs and a set of 5 Ys) 
+ Choose(8,3)*P(5 Xs, 5 Ys, 5 Zs) 
- Choose(8,4)*P(5 Xs, 5 Ys, 5 Zs, 5 As)

P(set of 5 Xs) = 20 Choose 5 * 7^15 / 8^20
P(5 Xs, 5 Ys) = 20 Choose 5,5 * 6^10 / 8^20

And so on. This doesn't solve the problem directly of 'more then 5 of the same', as if you simply summed the results of this applied to 5,6,7..20; you would over count the cases where you have, say, 10 1's and 5 8's.

You could probably apply inclusion exclusion again to come up with that second answer; so, P(of at least 5)=P(one set of 20)+ ... + (P(one set of 15) - 7*P(set of 5 from 5 dice)) + ((P(one set of 14) - 7*P(one set of 5 from 6) - 7*P(one set of 6 from 6)). Coming up with the source code for that is proving itself more difficult.