
Splitting a NumPy array according to a boolean mask


I have an array like

a = np.array([4, 9, 3, 1, 6, 4, 7, 4, 2])

and a boolean array (i.e. a mask) of the same size like

boo = np.array([True, True, False, False, True, True, True, False, True])

(boo can also start with a False as first entry...)

Now I want to split a into new arrays with 2 conditions:

  • a new sub-array contains only values with True in boo
  • a new sub-array always begins after a False and ends before a False.
    So the result would be [[4, 9], [6, 4, 7], [2]]

My idea is:
I know that I can use np.split as a basis.
In this case it would be b = np.split(a, [2, 4, 7, 8]), and afterwards I would take only every second element from b, starting with the first because my first element in boo is True.
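
As a quick check of that idea, here is a minimal sketch with the split points written out by hand (they are hard-coded here, not computed, which is exactly the missing piece):

```python
import numpy as np

a = np.array([4, 9, 3, 1, 6, 4, 7, 4, 2])

# Split points written out by hand: the positions where boo changes value.
b = np.split(a, [2, 4, 7, 8])
# b holds the pieces [4 9], [3 1], [6 4 7], [4], [2]

# boo starts with True, so the True runs are the pieces at even positions.
result = [piece.tolist() for piece in b[0::2]]
print(result)  # [[4, 9], [6, 4, 7], [2]]
```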

So my problem is: How do I get the array [2, 4, 7, 8]?

(Looping in Python is not an option because it is too slow.)


Solution

  • Maybe this is fast enough:

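    # Indices where boo differs from its predecessor (np.roll wraps around,
    # so index 0 compares boo[0] with boo[-1]).
    d = np.nonzero(boo != np.roll(boo, 1))[0]
    # Drop a spurious split at position 0; d is empty if boo is constant.
    if d.size and d[0] == 0:
        d = d[1:]
    b = np.split(a, d)
    b = b[0::2] if boo[0] else b[1::2]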
    

    Found a simpler and faster way:

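    # Positions where boo changes value relative to the previous entry.
    indices = np.nonzero(boo[1:] != boo[:-1])[0] + 1
    b = np.split(a, indices)
    # Pieces alternate between True runs and False runs.
    b = b[0::2] if boo[0] else b[1::2]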
    

    Comparing slices is at least twice as fast as np.roll() plus the if statement.
    Also, np.flatnonzero(...) would look nicer than np.nonzero(...)[0], but it is slightly slower.
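
For reference, the slice-comparison approach could be wrapped in a small function along these lines. This is only a sketch: the name split_by_mask is made up for illustration, and returning an empty list for an empty mask is an assumption (the snippets above would raise on boo[0] in that case).

```python
import numpy as np

def split_by_mask(a, boo):
    """Return the runs of `a` where `boo` is True (hypothetical helper name)."""
    a = np.asarray(a)
    boo = np.asarray(boo, dtype=bool)
    if boo.size == 0:
        # Assumption: an empty mask yields no sub-arrays (boo[0] would fail).
        return []
    # Positions where the mask value differs from the previous entry.
    indices = np.nonzero(boo[1:] != boo[:-1])[0] + 1
    pieces = np.split(a, indices)
    # Pieces alternate between True runs and False runs.
    return pieces[0::2] if boo[0] else pieces[1::2]

a = np.array([4, 9, 3, 1, 6, 4, 7, 4, 2])
boo = np.array([True, True, False, False, True, True, True, False, True])

print(np.nonzero(boo[1:] != boo[:-1])[0] + 1)        # [2 4 7 8]
print([p.tolist() for p in split_by_mask(a, boo)])   # [[4, 9], [6, 4, 7], [2]]
print([p.tolist() for p in split_by_mask(a, ~boo)])  # [[3, 1], [4]]
```

The last call inverts the mask so that it starts with False, which exercises the b[1::2] branch.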