Tags: python, performance, time, max, list-comprehension

Is there any way to further improve the time efficiency of this data normalization?


I have a matrix named tArray with shape (11, 512) and want to normalize the values in it. I see that np.max() takes a lot of time, but I haven't found a way to speed it up. Can the time efficiency of the following line of code be improved?

tArray = np.array([[val/tArray[i][sqLen-1] for val in tArray[i]] if i not in [1,2] else [val/np.max(tArray[i][:sqLen-1]) for val in tArray[i]] for i in range(len(tArray))])

To reproduce:

import numpy as np

tArray = np.random.randint(1, 100, size=(11, 512))
tArray = np.array([[val/tArray[i][512-1] for val in tArray[i]] if i not in [1,2] else [val/np.max(tArray[i][:512-1]) for val in tArray[i]] for i in range(len(tArray))])

Solution

  • Here is an approach that gives a ~180x speedup:

    Note that, for an input array with 512 columns, [512-1] is the same as [-1] (the last column) and [:512-1] is the same as [:-1] (every column except the last).

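    The condition in your loop, if i not in [1,2] else, means that the calculation splits into exactly 3 row slices: [0] (the first row), [1:3] (rows 1 and 2) and [3:] (the remaining rows). Rows 1 and 2 are divided by the maximum of all their values except the last one; every other row is divided by its own last value.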

    So instead of iterating over each row and dividing value by value, we can apply the needed operation to each of the 3 slices at once, in a vectorized manner, and then concatenate the results with np.vstack:

    np.vstack((tArray[0]/tArray[0,-1], tArray[1:3]/tArray[1:3,:-1].max(1)[:,None], tArray[3:]/tArray[3:,-1][:,None]))
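As a variation on the same idea (a sketch, not part of the measured solution), the three slices can be collapsed into a single broadcasted division by first building one divisor per row. The names rng, divisors and normalized are illustrative only, and the fixed seed is an assumption made purely for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only so the run is repeatable
tArray = rng.integers(1, 100, size=(11, 512))  # values 1..99, so no division by zero

# One divisor per row: by default, the row's last value...
divisors = tArray[:, -1].astype(float)
# ...except for rows 1 and 2, which use the max over every column but the last.
divisors[1:3] = tArray[1:3, :-1].max(axis=1)

# [:, None] reshapes the (11,) vector to (11, 1) so that each row
# is divided by its own divisor via broadcasting.
normalized = tArray / divisors[:, None]

print(normalized.shape)
```

This avoids the concatenation step, at the cost of one small temporary array. Whether it is faster than the np.vstack version for an (11, 512) input would have to be measured.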
    

    Let's compare the timings:

    tArray = np.random.randint(1, 100, size=(11, 512)) # input array
    

    In [165]: %timeit tArray1 = np.array([[val/tArray[i][512-1] for val in tArray[i]] if i not in [1,2] else [val/np.max(tArray[i][:512-1]) for val in tArray[i]] for i in range(len(tArray))])
    4.54 ms ± 23.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    
    In [171]: %timeit new_arr = np.vstack((tArray[0]/tArray[0,-1], tArray[1:3]/tArray[1:3,:-1].max(1)[:,None], tArray[3:]/tArray[3:,-1][:,None]))
    25.5 µs ± 264 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
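The %timeit magic is only available in IPython. To repeat the comparison from a plain Python script, something like the following sketch with the standard timeit module should work. The function names loop_version and vectorized_version and the repetition counts are assumptions chosen for illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

tArray = np.random.randint(1, 100, size=(11, 512))

def loop_version(a):
    # The original nested list comprehension ([-1] stands in for [512-1]).
    return np.array([[val / a[i][-1] for val in a[i]] if i not in [1, 2]
                     else [val / np.max(a[i][:-1]) for val in a[i]]
                     for i in range(len(a))])

def vectorized_version(a):
    # The three-slice np.vstack expression.
    return np.vstack((a[0] / a[0, -1],
                      a[1:3] / a[1:3, :-1].max(1)[:, None],
                      a[3:] / a[3:, -1][:, None]))

# Fewer repetitions for the slow version, more for the fast one.
n_loop, n_vec = 20, 2000
loop_time = timeit.timeit(lambda: loop_version(tArray), number=n_loop) / n_loop
vec_time = timeit.timeit(lambda: vectorized_version(tArray), number=n_vec) / n_vec

print(f"loop:       {loop_time * 1e3:.3f} ms per call")
print(f"vectorized: {vec_time * 1e6:.1f} us per call")
```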
    

    Of course, tArray1 and new_arr have the same content:

    In [173]: tArray1
    Out[173]: 
    array([[ 8.11111111,  2.33333333,  9.33333333, ...,  0.44444444,
             5.22222222,  1.        ],
           [ 0.76767677,  0.77777778,  0.72727273, ...,  0.58585859,
             0.29292929,  0.09090909],
           [ 0.36363636,  0.85858586,  0.35353535, ...,  0.06060606,
             0.48484848,  0.55555556],
           ...,
           [ 1.875     ,  2.04166667,  0.29166667, ...,  0.20833333,
             0.58333333,  1.        ],
           [ 0.28735632,  0.11494253,  0.37931034, ...,  0.50574713,
             0.74712644,  1.        ],
           [ 5.625     , 10.5       ,  0.5       , ...,  2.125     ,
             0.75      ,  1.        ]])
    
    In [174]: new_arr
    Out[174]: 
    array([[ 8.11111111,  2.33333333,  9.33333333, ...,  0.44444444,
             5.22222222,  1.        ],
           [ 0.76767677,  0.77777778,  0.72727273, ...,  0.58585859,
             0.29292929,  0.09090909],
           [ 0.36363636,  0.85858586,  0.35353535, ...,  0.06060606,
             0.48484848,  0.55555556],
           ...,
           [ 1.875     ,  2.04166667,  0.29166667, ...,  0.20833333,
             0.58333333,  1.        ],
           [ 0.28735632,  0.11494253,  0.37931034, ...,  0.50574713,
             0.74712644,  1.        ],
           [ 5.625     , 10.5       ,  0.5       , ...,  2.125     ,
             0.75      ,  1.        ]])
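
Because the printed arrays are truncated, comparing them by eye only covers a few values. A programmatic check along these lines (a minimal sketch that regenerates a random input, so the values differ from the session above) compares every element:

```python
import numpy as np

tArray = np.random.randint(1, 100, size=(11, 512))

# Original list-comprehension result.
tArray1 = np.array([[val/tArray[i][512-1] for val in tArray[i]] if i not in [1,2]
                    else [val/np.max(tArray[i][:512-1]) for val in tArray[i]]
                    for i in range(len(tArray))])

# Vectorized result.
new_arr = np.vstack((tArray[0]/tArray[0,-1],
                     tArray[1:3]/tArray[1:3,:-1].max(1)[:,None],
                     tArray[3:]/tArray[3:,-1][:,None]))

same_shape = tArray1.shape == new_arr.shape
same_values = np.allclose(tArray1, new_arr)  # element-wise, within floating-point tolerance
print(same_shape, same_values)
```

Both checks should print True for any input of this shape whose values are non-zero.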