python, python-multithreading

Threading a random number generator function in python


I have the following code in Python:

import random

results = []
for i in range(1, 7000000):
    a = random.sample(range(1, 45), 6)
    results.append(a)

Is there a way I can use threading or any other method to make this code run faster? At the moment it takes forever: over 20 minutes.


Solution

  • There's not a lot to be gained from threading here, thanks to the GIL, but this is a problem that can be solved with numpy, which performs the work entirely at the C layer, saving a ton of time and memory to boot. A 7M by 6 2D array with values in the given range can be created in less than a second with:

    import numpy as np
    
    results = np.random.randint(1, 45, (7000000, 6), np.uint8)
    

    One caveat: np.random.randint draws with replacement, so a row can contain duplicate values, where random.sample never repeats a value within a single draw. If that's acceptable, this is going to be faster in general, and much more memory efficient: a 7M long list of six-tuples will (on a 64-bit build of Python) occupy an absolute minimum of around 700 MB (likely more, given allocator overhead), while the numpy array occupies about 40 MB. It's also easy to show that building the list with all the inner tuples has unavoidable costs. Microbenchmarking the numpy array alone shows that the random number generation itself only takes around 420 ms, but converting the numpy array to a list of six-tuples in the most efficient way brings the cost up to 12.5 seconds. If your machine is similar to mine, that's essentially a floor on the runtime of any pure Python solution, because it's the raw cost Python pays just to create the tuples and populate the list:

    >>> %timeit -r5 arr = np.random.randint(1, 45, (7000000, 6), np.uint8)
    420 ms ± 875 µs per loop (mean ± std. dev. of 5 runs, 1 loop each)
    
    >>> %timeit -r5 arr = list(map(tuple, np.random.randint(1, 45, (7000000, 6), np.uint8)))
    12.5 s ± 254 ms per loop (mean ± std. dev. of 5 runs, 1 loop each)
    

    np.random.randint(1, 45, (7000000, 6), np.uint8).tolist() runs faster than the list(map(tuple, ...)) version (around 2.5 seconds), but again, that's only possible because of C level accelerator assistance (and it uses even more memory, since lists are somewhat less memory efficient than tuples).
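    If the without-replacement behavior of random.sample matters (each row must contain six distinct numbers), that can still be done in vectorized numpy. Here's one sketch, using the argpartition trick on a matrix of uniform randoms (my own illustration, not part of the original answer; the n-by-44 float intermediate means you'd want to generate 7M rows in chunks to keep memory in check):

    ```python
    import numpy as np

    rng = np.random.default_rng()

    def sample_without_replacement(n_rows, pool_size=44, k=6):
        # One uniform random number per (row, candidate value); the indices of
        # the k smallest values in each row are k distinct, uniformly chosen
        # candidates -- equivalent to drawing k values without replacement.
        u = rng.random((n_rows, pool_size))
        picks = u.argpartition(k, axis=1)[:, :k]
        return picks + 1  # shift from 0..43 to 1..44

    results = sample_without_replacement(100_000)
    ```

    argpartition only guarantees the k smallest values land in the first k slots, in arbitrary order, which is fine here since the order within a sample doesn't matter.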

    Without numpy, the best I can suggest is to avoid recreating the range over and over: create it once outside the loop and reuse it, e.g.:

    choices = tuple(range(1, 45))  # tuple is generally the fastest structure to index
    results = []
    for i in range(1, 7000000):
        a = random.sample(choices, 6)
        results.append(a)
    
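    A couple of further pure Python micro-optimizations are possible on top of that: binding random.sample to a local name and replacing the append loop with a list comprehension, which removes two attribute lookups per iteration. A sketch (the helper name draw_samples is mine; expect modest gains, not orders of magnitude):

    ```python
    import random

    def draw_samples(n, sample=random.sample, choices=tuple(range(1, 45))):
        # Default-argument binding turns sample/choices into fast local
        # lookups, and the comprehension drops the per-iteration .append call.
        return [sample(choices, 6) for _ in range(n)]

    results = draw_samples(100_000)
    ```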

    That's not likely to save a whole lot though; the random module does an awful lot of Python level work wrapping 1-2 C level randomness generators, and that Python level work is going to be much slower than anything a fully accelerated C module could do.
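    For completeness, here's roughly what a threaded version would look like (a sketch using concurrent.futures; on CPython the GIL serializes the Python level work inside sample, so don't expect a meaningful speedup):

    ```python
    import random
    from concurrent.futures import ThreadPoolExecutor

    def chunk(n, choices=tuple(range(1, 45))):
        rnd = random.Random()  # per-thread generator avoids lock contention
        return [rnd.sample(choices, 6) for _ in range(n)]

    n_total, n_threads = 100_000, 4
    with ThreadPoolExecutor(n_threads) as pool:
        parts = pool.map(chunk, [n_total // n_threads] * n_threads)
    results = [row for part in parts for row in part]
    ```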