Setting some 2d array labels to zero in python


My goal is to set some labels in a 2D array to zero without using a for loop. Is there a faster NumPy way to do this? Ideally I would write something like temp_arr[labeled_im not in labels] = 0, but that does not work the way I'd like.

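import numpy as np

labeled_array = np.array([[1,2,3],
                          [4,5,6],
                          [7,8,9]])

labels = [2,4,5,6,8]  # labels to keep; every other entry becomes 0
temp_arr = np.zeros((labeled_array.shape)).astype(int)
for label in labels:
    # copy each kept label into the output, one label per pass
    temp_arr[labeled_array == label] = label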

>> temp_arr
[[0 2 0]
 [4 5 6]
 [0 8 0]]

The for loop gets quite slow when there are many labels to iterate over, so I would like to reduce the execution time with NumPy.


Solution

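  • You can use temp_arr = np.where(np.isin(labeled_array, labels), labeled_array, 0). Keep labels as a list or array rather than a set: per the NumPy documentation, np.isin wraps a set in a one-element object array instead of iterating over its values, so the mask comes out all False. For an array this small, the timing difference does not seem to be significant.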

    import numpy as np
    import time
    
    labeled_array = np.array([[1,2,3],
                              [4,5,6],
                              [7,8,9]])
    
    labels = [2,4,5,6,8]
    
    start = time.time()
    temp_arr_0 = np.zeros((labeled_array.shape)).astype(int)
    for label in labels:
        temp_arr_0[labeled_array == label] = label
    end = time.time()
    
    print(f"Loop takes {end - start}")
    
    start = time.time()
    temp_arr_1 = np.where(np.isin(labeled_array, labels), labeled_array, 0)
    end = time.time()
    
    print(f"np.where takes {end - start}")
    
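    labels = {2,4,5,6,8}
    
    # Caution: np.isin does not iterate over a set. It wraps the set in a
    # one-element object array, so this mask is all False and temp_arr_2
    # is all zeros. Use np.isin(labeled_array, list(labels)) for correct results.
    start = time.time()
    temp_arr_2 = np.where(np.isin(labeled_array, labels), labeled_array, 0)
    end = time.time()
    
    print(f"np.where with set takes {end - start}")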
    

    Output (times in seconds):

    Loop takes 5.3882598876953125e-05
    np.where takes 0.00010514259338378906
    np.where with set takes 3.314018249511719e-05
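
As a minimal sketch, the mask-based form closest to the temp_arr[labeled_im not in labels] = 0 idea from the question could look like the following, assuming labels may arrive as a set. It converts the set with list(...) before calling np.isin and checks the result against the original loop. The names keep and expected are introduced here for illustration only.

```python
import numpy as np

labeled_array = np.array([[1, 2, 3],
                          [4, 5, 6],
                          [7, 8, 9]])
labels = {2, 4, 5, 6, 8}  # assumption: labels may be given as a set

# Convert the set to a list first; np.isin does not iterate over a set.
keep = np.isin(labeled_array, list(labels))

# Working equivalent of "temp_arr[labeled_im not in labels] = 0"
temp_arr = labeled_array.copy()
temp_arr[~keep] = 0

# Reference result built with the loop from the question
expected = np.zeros_like(labeled_array)
for label in labels:
    expected[labeled_array == label] = label

print(temp_arr)
# [[0 2 0]
#  [4 5 6]
#  [0 8 0]]
print(np.array_equal(temp_arr, expected))    # True
print(np.isin(labeled_array, labels).any())  # False: the raw set matches nothing
```

Copying the array and zeroing the complement of the mask gives the same result as np.where(keep, labeled_array, 0); which one reads better is mostly a matter of taste.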