Tags: python, arrays, numpy, masked-array

Comparing two numpy arrays for compliance with two conditions


Consider two numpy arrays having the same shape, A and B, composed of 1s and 0s. A small example is shown:

A = [[1 0 0 1]         B = [[0 0 0 0]
     [0 0 1 0]              [0 0 0 0]
     [0 0 0 0]              [1 1 0 0]
     [0 0 0 0]              [0 0 1 0]
     [0 0 1 1]]             [0 1 0 1]]

I now want to assign values to the two Boolean variables test1 and test2 as follows:

test1: Is there at least one instance where a 1 in an A column and a 1 in the SAME B column have row differences of exactly 1 or 2? If so, then test1 = True, otherwise False.

In the example above, column 0 has a 1 in row 0 of A and a 1 in row 2 of B, which are 2 rows apart, so test1 = True. (There are other instances in column 2 as well, but that doesn't matter: we only require one instance.)

test2: Do the 1 values in A and B all have different array addresses (that is, no [row, column] position holds a 1 in both arrays)? If so, then test2 = True, otherwise False.

In the example above, both arrays have [4,3] = 1, so test2 = False.

I'm struggling to find an efficient way to do this and would appreciate some assistance.


Solution

  • Here is a simple way to test whether A has a 1 exactly one row below a 1 in B, in the same column (this checks one direction only):

    (A[1:, :] * B[:-1, :]).any(axis=None)
    

    Checking both directions, for row offsets of 1 and 2, gives:

    test1 = (A[1:, :] * B[:-1, :] + A[:-1, :] * B[1:, :]).any(axis=None) or (A[2:, :] * B[:-2, :] + A[:-2, :] * B[2:, :]).any(axis=None)
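
The same shifted-slice idea can be sketched as a small loop over row offsets. This is only an illustration, not part of the original answer: the function name `has_close_pair_in_column` and its `max_offset` parameter are hypothetical, and the inputs are converted to boolean so that `&` can stand in for the multiplication used above.

```python
import numpy as np

def has_close_pair_in_column(A, B, max_offset=2):
    """Return True if a 1 in A and a 1 in B share a column and are
    between 1 and max_offset rows apart, in either direction.
    (Hypothetical helper; name and parameter are illustrative.)"""
    A = np.asarray(A, dtype=bool)
    B = np.asarray(B, dtype=bool)
    for k in range(1, max_offset + 1):
        # A[k:] & B[:-k]: A's 1 sits k rows below B's 1.
        # A[:-k] & B[k:]: A's 1 sits k rows above B's 1.
        if (A[k:] & B[:-k]).any() or (A[:-k] & B[k:]).any():
            return True
    return False

# Arrays from the question.
A = np.array([[1, 0, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 1, 1]])
B = np.array([[0, 0, 0, 0],
              [0, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 1, 0, 1]])

test1 = has_close_pair_in_column(A, B)
print(test1)  # True
```

Note that an offset of 0 (a 1 at the same position in both arrays) is not counted here, which matches the "exactly 1 or 2" wording of the question.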
    

    The second test can be done by converting the locations of the 1s to indices, concatenating the indices from both arrays, and using np.unique to count how often each index occurs. Any index that occurs more than once must appear in both arrays, since a single array never reports the same index twice. We can further speed up the calculation by using flatnonzero instead of nonzero, which yields one flat index per element rather than a tuple of row and column indices:

    test2 = np.all(np.unique(np.concatenate((np.flatnonzero(A), np.flatnonzero(B))), return_counts=True)[1] == 1)
    

    A more efficient test would use np.intersect1d in a similar manner:

    test2 = not np.intersect1d(np.flatnonzero(A), np.flatnonzero(B)).size
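
As a quick check, here is a sketch that runs the np.intersect1d version on the arrays from the question and recovers the shared position. The variable names (`idx_a`, `idx_b`, `shared`, `test2_alt`) are illustrative only. The last lines cross-check the result with an elementwise logical AND, which is an alternative not used in the answer above.

```python
import numpy as np

# Arrays from the question.
A = np.array([[1, 0, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 1, 1]])
B = np.array([[0, 0, 0, 0],
              [0, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 1, 0, 1]])

# Flat (row-major) indices of the 1s in each array.
idx_a = np.flatnonzero(A)
idx_b = np.flatnonzero(B)

# Indices that hold a 1 in both arrays.
shared = np.intersect1d(idx_a, idx_b)
test2 = shared.size == 0
print(test2)  # False

# Convert the shared flat indices back to (row, column) positions.
rows, cols = np.unravel_index(shared, A.shape)
print(rows.tolist(), cols.tolist())  # [4] [3]

# Cross-check (alternative, assumes A and B have the same shape):
# no position may be nonzero in both arrays.
test2_alt = not np.logical_and(A, B).any()
print(test2_alt == test2)  # True
```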