
How to generate a list of unique random floats in Python


I know that there are easy ways to generate lists of unique random integers (e.g. random.sample(range(1, 100), 10)).

I wonder whether there is a better way of generating a list of unique random floats than writing a function that acts like range() but accepts floats, like this:

import random

def float_range(start, stop, step):
    vals = []
    i = 0
    current_val = start
    while current_val < stop:
        vals.append(current_val)
        i += 1
        current_val = start + i * step
    return vals

unique_floats = random.sample(float_range(0, 2, 0.2), 3)

Is there a better way to do this?


Solution

  • Answer

    One easy way is to keep a set of all random values seen so far and reselect if there is a repeat:

    import random
    
    def sample_floats(low, high, k=1):
        """ Return a k-length list of unique random floats
            in the range of low <= x <= high
        """
        result = []
        seen = set()          # values already chosen; membership tests are O(1)
        for _ in range(k):
            x = random.uniform(low, high)
            while x in seen:  # redraw on the (rare) duplicate
                x = random.uniform(low, high)
            seen.add(x)
            result.append(x)
        return result
    

    Notes

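    • This is the same technique that Python's own random.sample() uses when the sample is small relative to the population: it records the selections made so far in a set and redraws on a repeat.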

    • The function uses a set to track previous selections because searching a set is O(1) while searching a list is O(n).

    • Computing the probability of a duplicate selection is equivalent to the famous Birthday Problem.

    • Given the 2**53 distinct possible values from random(), duplicates are infrequent. By the Birthday Problem estimate, you can expect the first duplicate float after about 120,000,000 samples.
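
    The 120,000,000 figure in the notes above can be checked with the usual Birthday Problem approximations. This is only a rough sketch: it assumes every draw is uniform over 2**53 equally likely values, which holds for random() but only approximately for random.uniform(low, high), since scaling to a range can change the number of distinct results. The name duplicate_probability is a hypothetical helper, not part of the random module.

```python
import math

# Assumption: 2**53 equally likely outcomes, as for random.random().
n = 2 ** 53

# Birthday Problem approximation: the expected number of draws until
# the first repeat is about sqrt(pi * n / 2).
expected_draws = math.sqrt(math.pi * n / 2)

def duplicate_probability(k, n=n):
    """Approximate chance of at least one duplicate among k draws."""
    # 1 - exp(-k*(k-1)/(2n)), computed with expm1 for accuracy when tiny.
    return -math.expm1(-k * (k - 1) / (2 * n))

print(f"expected draws until first duplicate: {expected_draws:,.0f}")
print(f"P(duplicate in 1,000,000 draws): {duplicate_probability(1_000_000):.2e}")
```

    Under these assumptions the expected count comes out a little under 120 million, and the chance of any duplicate in a million draws is well below one in ten thousand, so the retry loop in sample_floats() almost never runs.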

    Variant: Limited float range

    If the population is limited to just a range of evenly spaced floats, then it is possible to use random.sample() directly. The only requirement is that the population be a Sequence:

    # Python 3: the Sequence ABC lives in collections.abc
    # (the old collections.Sequence alias was removed in Python 3.10).
    from collections.abc import Sequence
    
    class FRange(Sequence):
        """ Lazily evaluated floating point range of evenly spaced floats
            (inclusive at both ends)
    
            >>> list(FRange(low=10, high=20, num_points=5))
            [10.0, 12.5, 15.0, 17.5, 20.0]
    
        """
        def __init__(self, low, high, num_points):
            self.low = low
            self.high = high
            self.num_points = num_points
    
        def __len__(self):
            return self.num_points
    
        def __getitem__(self, index):
            if index < 0:
                index += len(self)
            if index < 0 or index >= len(self):
                raise IndexError('Out of range')
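            if self.num_points == 1:
                return float(self.low)  # single point: avoid dividing by zero
            p = index / (self.num_points - 1)  # fraction of the way from low to high
            return self.low * (1.0 - p) + self.high * p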
    

    Here is an example of choosing ten random samples without replacement from a range of 41 evenly spaced floats from 10.0 to 20.0.

    >>> import random
    >>> random.sample(FRange(low=10.0, high=20.0, num_points=41), k=10)
    [13.25, 12.0, 15.25, 18.5, 19.75, 12.25, 15.75, 18.75, 13.0, 17.75]
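
    If a full Sequence class feels heavy, a shorter sketch of the same idea is to sample the grid indices from a plain range() and map each index to its float afterwards. The function name sample_evenly_spaced is hypothetical, and the interpolation formula is the one used in FRange.__getitem__ above; this is an illustration under those assumptions rather than a replacement for the class.

```python
import random

def sample_evenly_spaced(low, high, num_points, k):
    """Hypothetical helper: pick k distinct floats from num_points evenly
    spaced values between low and high (inclusive at both ends)."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    # Sampling indices without replacement guarantees distinct grid points.
    indices = random.sample(range(num_points), k)
    result = []
    for i in indices:
        p = i / (num_points - 1)  # fraction of the way from low to high
        result.append(low * (1.0 - p) + high * p)
    return result

values = sample_evenly_spaced(10.0, 20.0, num_points=41, k=10)
print(values)
```

    Because range() is itself a lazily evaluated Sequence, no list of candidate values is built up front, which keeps memory use independent of num_points.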