Search code examples
pythonperformancenumpyziplist-comprehension

Multiple ranges / np.arange


I have two end-point arrays that look like this:

t1 = np.array([0,13,22,...,99994])
t2 = np.array([4,14,25,...,99998])

I am looking for the most efficient way to generate an output that looks like this:

np.array([0,1,2,3,4,13,14,22,23,24,25,...,99994,99995,99996,99997,99998])

one way to do it is this:

np.array([i for a, b in zip(t1, t2) for i in range(a, b + 1)])

This solution is slow and I am certain that it can still be vastly improved by entirely replacing the zip and list comprehension combo with some functions entirely in Numpy, it is just that I don't know how. Can you guys show me the most efficient way to do it?

Thank you guys in advance


Code to generate these two arrays:

import numpy as np

m =10000
Z = np.arange(0,10*m,10)

t1 = np.random.randint(5, size =m ) + Z
t2 =np.random.randint(5,size = m) + 5 + Z

Solution

  • Here's a performant approach using numba:

    from numba import njit
    
    @njit
    def n_ranges_nb(t1, t2):
        a = np.arange(np.max(t2)+1)
        n = (t2 - t1).sum()
        out = np.zeros(n)
        l, l_old = 0, 0
        for i,j in zip(t1, t2):
            l += j-i
            out[l_old:l] = a[i:j]
            l_old = l
        return out
    

    Checking with the same values as above:

    t1 = np.array([0,13,22])
    t2 = np.array([4,14,25])
    
    n_ranges_nb(t1, t2+1)
    # array([ 0.,  1.,  2.,  3.,  4., 13., 14., 22., 23., 24., 25.])
    

    Lets check the timings:

    d = 100
    perfplot.show(
        setup=lambda n: np.cumsum(np.random.randint(0, 50, n)),
    
        kernels=[
            lambda x: np.array([i for a, b in zip(x,x+d) for i in range(a,b+1)]),
            lambda x: n_ranges_nb(x, x+d+1),
            lambda x: create_ranges(x, x+d+1) # (from the dupe)
        ],
    
        labels=['nested-list-comp', 'n_ranges_nb', 'create_ranges'],
        n_range=[2**k for k in range(0, 18)],
        xlabel='N'
    )
    

    enter image description here