
Deleting rows in np array inside a for loop


I am trying to delete all rows that contain one or fewer non-zero elements, in multiple 2D arrays contained within the list 'a'.

This method works when I run it outside the 'i' loop, but not when I run the code as a whole. I know that I cannot delete rows over which I am iterating, but I believe that I am not doing so in this case, because I am only deleting rows in arrays contained in a, not the arrays themselves.

for i in range(len(a)):
  del_idx=[]
  for j in range(len(a[i])):
    nonzero=np.nonzero(a[i][j])
    nonzero_len=len(nonzero[0]) #because np.nonzero outputs a tuple
    if nonzero_len<=1:
        del_idx.append(j)
    else:
        continue
  np.delete(a[i],(del_idx),axis=0)

Anyone know what's going on here? If this really does not work, how can I delete these elements without using a loop? This is Python 2.7

Thank you!


Solution

  • You should aim to avoid for loops with NumPy when vectorised operations are available. Here, for example, you can use Boolean indexing:

    import numpy as np
    
    np.random.seed(0)
    
    A = np.random.randint(0, 2, (10, 3))
    
    # keep only the rows with more than one non-zero element
    res = A[(A != 0).sum(1) > 1]
    
    array([[0, 1, 1],
           [0, 1, 1],
           [1, 1, 1],
           [1, 1, 0],
           [1, 1, 0],
           [0, 1, 1],
           [1, 1, 0]])
    

    The same logic can be applied to each array within your list of arrays.
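
As for why the loop in the question appears to do nothing: np.delete returns a new array and leaves its input unchanged, so the result has to be assigned back (for example a[i] = np.delete(a[i], del_idx, axis=0)). Below is a minimal sketch of both points, assuming 'a' is a list of 2D integer arrays. The sample data and the names 'original' and 'trimmed' are made up for illustration and are not from the question.

```python
import numpy as np

# np.delete does not modify its argument; it returns a new array.
original = np.array([[0, 0, 0], [1, 2, 3]])
trimmed = np.delete(original, [0], axis=0)
print(original.shape)  # (2, 3): unchanged
print(trimmed.shape)   # (1, 3): the row was removed only in the returned copy

# Hypothetical stand-in for the list 'a' from the question.
a = [
    np.array([[0, 0, 0], [1, 0, 2], [0, 3, 0], [4, 5, 6]]),
    np.array([[7, 0, 0], [0, 8, 9]]),
]

# Rebuild the list, keeping only rows with more than one non-zero element.
a = [arr[(arr != 0).sum(axis=1) > 1] for arr in a]

for arr in a:
    print(arr)
```

The list comprehension still loops over the arrays in the list, but the row filtering inside each array is vectorised, so there is no inner loop over rows.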