python numpy dask numba

Fast way to generate large-scale random ndarray


I want to generate a random matrix of shape (1e7, 800). But I find numpy.random.rand() becomes very slow at this scale. Is there a quicker way?


Solution

  • A simple way to do that is to write a multi-threaded implementation using Numba:

    import numba as nb
    import numpy as np

    @nb.njit('float64[:,:](int_, int_)', parallel=True)
    def genRandom(n, m):
        # Allocate the output without initializing it
        res = np.empty((n, m))

        # Parallel loop: rows are distributed across threads
        for i in nb.prange(n):
            for j in range(m):
                res[i, j] = np.random.rand()

        return res
    

    This is 6.4 times faster than np.random.rand() on my 6-core machine.

    Note that using 32-bit floats may speed up the computation a bit, although the precision will be lower.
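
To illustrate the 32-bit remark, here is a minimal sketch that uses plain NumPy's `Generator.random` with `dtype=np.float32` rather than the Numba function above. The helper names `gen_random_f32` and `array_gb` are made up for this example, and no speedup is measured or claimed here. The arithmetic shows why the dtype matters at this scale: a (1e7, 800) array takes 64 GB as float64 and 32 GB as float32.

```python
import numpy as np

def gen_random_f32(n, m, seed=None):
    """Return an (n, m) array of uniform float32 samples in [0, 1).

    Hypothetical helper: swaps in NumPy's Generator API instead of the
    Numba loop, so it runs single-threaded.
    """
    rng = np.random.default_rng(seed)
    # Generator.random can emit float32 directly, so no float64
    # temporary is created and then cast down.
    return rng.random((n, m), dtype=np.float32)

def array_gb(n, m, dtype):
    """Memory footprint of an (n, m) array of `dtype`, in GB (1e9 bytes)."""
    return n * m * np.dtype(dtype).itemsize / 1e9

# Small demo size; the full (10**7, 800) case needs tens of GB of RAM.
small = gen_random_f32(1000, 800, seed=0)
print(small.shape, small.dtype)
print(array_gb(10**7, 800, np.float64), "GB as float64")
print(array_gb(10**7, 800, np.float32), "GB as float32")
```

If the full array does not fit in memory, generating and processing it in row blocks with the same function is one option.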