Python/Numpy: Reduce two boolean arrays based on conditionals relating to both arrays


I have two NumPy arrays of boolean indicators:

                          v                          v              v
A =    np.array([0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1], dtype=bool)
B =    np.array([1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1], dtype=bool)
                                         ^                 ^        ^

Moving from left to right, I would like to isolate the first true A indicator, then the next true B indicator, then the next true A indicator, then the next true B indicator, etc. to end up with:

                          v                          v              v
>>>> A_result = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
     B_result = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1]
                                         ^                 ^        ^

I have a feeling I could create a betweenAB array indicating all the places where A==1 is followed by B==1:

                          v                          v              v
betweenAB =     [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1]
                                         ^                 ^        ^

then take the start and end indices of each run, but I am still somewhat of a beginner when it comes to Numpy and am not sure how I might do that.

I'm looking for a fully vectorized approach as there are thousands of these arrays in my application each containing thousands of elements. Any help would be much appreciated.
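
The run-extraction step mentioned above can be sketched in NumPy as follows. This is only a minimal sketch: it assumes `betweenAB` has already been built (the values are copied from the example), and `starts` and `ends` are illustrative names. It reads run edges only, so two runs that touch each other would be merged into one.

```python
import numpy as np

# betweenAB copied from the example above (assumed to be already available).
betweenAB = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1], dtype=bool)

# Pad with False on both sides so runs touching either end still produce an edge.
padded = np.concatenate(([False], betweenAB, [False])).astype(np.int8)
edges = np.diff(padded)

starts = np.flatnonzero(edges == 1)      # first index of each run
ends = np.flatnonzero(edges == -1) - 1   # last index of each run (inclusive)

A_result = np.zeros(betweenAB.size, dtype=bool)
B_result = np.zeros(betweenAB.size, dtype=bool)
A_result[starts] = True
B_result[ends] = True

print(starts.tolist())  # [3, 12, 17]
print(ends.tolist())    # [8, 14, 17]
```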


Solution

  • This is hard to do efficiently in pure NumPy (and probably not possible without loops), but it is easy and efficient with Numba's JIT compiler. The reason is that the operation is inherently sequential: which element is kept next depends on which element was kept before it.

    Here is an example in Numba:

    import numpy as np
    import numba as nb
    
    @nb.njit('UniTuple(bool_[::1],2)(bool_[::1],bool_[::1])')
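    def compute(A, B):
        assert len(A) == len(B)
        n = len(A)
        i = 0
        # Both outputs start all-False; only the selected positions are set.
        resA = np.zeros(n, dtype=bool)
        resB = np.zeros(n, dtype=bool)
        while i < n:
            # Skip forward to the next true A.
            while i < n and A[i] == 0:
                resA[i] = 0
                i += 1
            if i < n:
                resA[i] = 1
                # A true B at the same index counts as the matching B.
                if B[i] == 1:
                    resB[i] = 1
                    i += 1
                    continue
                i += 1
            # Skip forward to the next true B.
            while i < n and B[i] == 0:
                resB[i] = 0
                i += 1
            if i < n:
                resB[i] = 1
                i += 1
        return resA, resB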
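
The same alternation can also be written as a single pass with one state flag, which may be easier to check by hand. The sketch below is an alternative formulation, not the code above: `alternate` and `want_a` are names chosen for illustration, and the `try`/`except` fallback is an assumption added so the example still runs (slowly) when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # optional: compiles the loop to machine code
except ImportError:
    def njit(func):  # fallback so the sketch still runs without Numba
        return func


@njit
def alternate(A, B):
    # Keep the first true A, then the next true B, then the next true A, ...
    n = len(A)
    resA = np.zeros(n, dtype=np.bool_)
    resB = np.zeros(n, dtype=np.bool_)
    want_a = True  # True while searching for an A, False while searching for a B
    for i in range(n):
        if want_a and A[i]:
            resA[i] = True
            want_a = False
        # Deliberately not an elif: a true B at the same index as the A
        # that was just kept counts as the matching B.
        if not want_a and B[i]:
            resB[i] = True
            want_a = True
    return resA, resB


# Arrays from the question.
A = np.array([0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1], dtype=bool)
B = np.array([1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1], dtype=bool)

resA, resB = alternate(A, B)
print(np.flatnonzero(resA).tolist())  # [3, 12, 17]
print(np.flatnonzero(resB).tolist())  # [8, 14, 17]
```

The kept indices match the `A_result` and `B_result` arrays shown in the question.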