Search code examples
pythonrandomintegerprobabilitydice

Expand random range of integers using probability distribution


I tried to solve the classic problem of generating a random integer between 1 and 7, given a function that generates a random integer between 1 and 5. My approach was to add the result of 2 calls to rand5(), effectively turning this into a "sum of dice rolls" problem. The probability of the sum dice rolls is fairly easy to calculate, so I used it here. An explanation of this is after the code

My question is: How can I calculate what the values of counter should be? The current values are incorrect, as verified by experiment. Do integer values exist that satisfy the probability? And is there a better way to solve this problem with this approach?

def rand5():
    return random.randint(1,5)

def rand7():
    counter = [1,2,3,4,5,4,3]
    while 0 not in counter:
        sum = rand5() + rand5() - 2
        if sum <= 6:
            counter[sum] -= 1
    return counter.index(0) + 1

For reference, the following code appears to create a random distribution.

test_counter = [0,0,0,0,0,0,0,0,0,0,0,0]
for i in range(500000):
    test_counter[rand5() + rand5() - 2] += 1

test_counter[0] *= 60
test_counter[1] *= 30
test_counter[2] *= 20
test_counter[3] *= 15
test_counter[4] *= 12
test_counter[5] *= 15
test_counter[6] *= 20
test_counter[7] *= 0
test_counter[8] *= 0

print(test_counter)

Explanation of probability: The probability of the dice rolls can be calculated by listing the possible combinations of dice. For this problem, the numbers generated by each die (the rand5 function) would be:

{(1,1), (1,2), (1,3), (1,4), (1,5), (2,1), (2,2), ..., (5,5)}

The probability of each sum is the number of ways the sum appears in the list, divided by the total number of items in the list. The list has 5^2 = 25 total elements. For example, a sum of 4 can be achieved by the following combinations {(1,3), (2,2), (3,1)}, so the probability of a sum of 4 is 3/25.

The probability of each result is then:

  1. 1/25
  2. 2/25
  3. 3/25
  4. 4/25
  5. 5/25
  6. 4/25
  7. 3/25
  8. 2/25
  9. 1/25

I tried to use this distribution to generate a uniform distribution by having the more common ones have to be generated multiple times, and this is stored in counter.


Solution

  • Not sure going via dice distribution is good idea. In general, if you have random bits source, but short sequences, better combine and chop bits to make up longer random bits sequence. Along the lines

    import random
    
    def rand5():
        return random.randint(1, 5)
    
    def twobits():
        q = rand5() - 1 # [0...5) range
        while q == 4: # dropping high bit
            q = rand5() - 1
        return q # [0...3) range, two random bits
    
    def onebit():
        return twobits() & 1
    
    def rand7():
        q = onebit() << 2 | twobits() # here we have [0...8) range
        while q == 0:                 # and dropping 0
            q = onebit() << 2 | twobits()
        return q
    
    counter = 8*[0]
    
    for i in range(500000):
        counter[rand7()] += 1
    
    print(counter)
    

    produced uniform in [1...8) sampling

    [0, 71592, 71352, 71071, 71543, 71600, 71388, 71454]
    

    Take two bits from one sample, one bit from another sample, combine them, some rejection and voila!