python numpy random scipy probability

How can I generate a random number in a range in Python, biased toward specific numbers?


I would like to choose a range, for example 60 to 80, and generate a random number from it. However, I'd like values between 65 and 72 to have a higher probability, and the remaining ranges (60-64 and 73-80) to have a lower one.

An example:

There is a 35% chance of the number being chosen from 60-64 or 73-80, and a 65% chance of it being chosen from 65-72.

The elements in the subranges are equally likely. I'm generating integers.

A scalable solution would also be interesting, so that it could be extended to larger ranges, for example 1000-2000 biased toward 1400-1600.

Could anyone help with some ideas?

Thanks in advance to anyone willing to contribute!


Solution

  • For equally likely outcomes in the subranges, the following will do the trick:

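    import random
    
    # Cumulative cut points on [0, 1): 65% for 65-72, then 5/13 of the
    # remaining 35% for 60-64; what is left (8/13 of 35%) goes to 73-80.
    THRESHOLD = [0.65, 0.65 + 0.35 * 5 / 13]
    
    def my_distribution():
        u = random.random()  # uniform on [0, 1)
        if u <= THRESHOLD[0]:
            return random.randint(65, 72)  # randint includes both endpoints
        elif u <= THRESHOLD[1]:
            return random.randint(60, 64)
        else:
            return random.randint(73, 80)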
    

    This uses a uniform random number to decide which subrange you're in, then draws a value uniformly within that subrange.

    The THRESHOLD values act like a cumulative distribution function, but ordered so that the most likely outcome is checked first. A Uniform(0,1) value u falls at or below the first threshold 65% of the time, which selects the range [65, 72]. The remaining 35% is split among the 13 outer values in proportion to how many each range holds: 5/13 of it goes to [60, 64] and 8/13 to [73, 80]. Each outer value therefore has probability 0.35/13 ≈ 0.027, and each value in [65, 72] has probability 0.65/8 ≈ 0.081.

    The results look like this:

    [Figure: Histogram of podium]
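
For the scalable case raised in the question (for example 1000-2000 biased toward 1400-1600), the same two-step idea can be parameterised. The following is a minimal sketch rather than part of the original answer: the function name `biased_randint`, its parameters, and the optional `rng` argument are hypothetical, and it assumes a single favoured block with all values outside that block equally likely.

```python
import random


def biased_randint(low, high, fav_low, fav_high, p_fav, rng=random):
    """Return an integer in [low, high] (hypothetical helper).

    With probability p_fav the value is uniform on [fav_low, fav_high];
    otherwise it is uniform over the values outside that block.
    """
    if not (low <= fav_low <= fav_high <= high):
        raise ValueError("need low <= fav_low <= fav_high <= high")
    if not 0.0 <= p_fav <= 1.0:
        raise ValueError("p_fav must be between 0 and 1")

    n_below = fav_low - low        # count of values under the favoured block
    n_above = high - fav_high      # count of values over the favoured block
    n_outside = n_below + n_above

    # Step 1: choose the favoured block with probability p_fav
    # (or always, if there is nothing outside it).
    if n_outside == 0 or rng.random() < p_fav:
        return rng.randint(fav_low, fav_high)

    # Step 2: choose uniformly among the outside values by indexing them
    # 0 .. n_outside - 1 and mapping the index back to an integer.
    k = rng.randrange(n_outside)
    if k < n_below:
        return low + k
    return fav_high + 1 + (k - n_below)


# Demo on the larger range from the question; the seed is only there
# to make the run repeatable.
rng = random.Random(0)
samples = [biased_randint(1000, 2000, 1400, 1600, 0.65, rng) for _ in range(10_000)]
share = sum(1400 <= s <= 1600 for s in samples) / len(samples)
print(f"share in 1400-1600: {share:.3f}")
```

The printed share should come out close to 0.65. Calling `biased_randint(60, 80, 65, 72, 0.65)` gives the same distribution as `my_distribution()` above, because drawing an index uniformly over the 13 outer values is equivalent to the 5/13 versus 8/13 split.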