You can use numpy.intersect1d(a1, a2) to intersect two arrays, and the docs show how to intersect more than two:
from functools import reduce
reduce(np.intersect1d, ([1, 3, 4, 3], [3, 1, 2, 1], [6, 3, 4, 2]))
What I want to do is find the intersection between a 1D array and every row of a 2D array.
Or better yet, just the COUNT of the overlapping elements in every row.
I know I can do that with intersect1d() and a loop, but that will be too slow.
How can we count the overlapping elements in every row the NumPy way?
Ex:
In [59]: a2 = np.random.choice(np.arange(0,100),(10,5), replace=False)
In [60]: a2
Out[60]:
array([[50,  5, 25, 40, 19],   # 1  <- expected count of elements shared with a1
       [43, 37, 21, 55, 11],   # 0
       [16, 49,  6, 86, 96],   # 0
       [80, 66, 87, 51, 64],   # 0
       [42,  7, 20, 24, 74],   # 1
       [92, 63, 75, 54, 90],   # 2
       [ 9, 91, 88, 85, 22],   # 0
       [ 4, 65, 97, 93, 53],   # 0
       [18,  0, 57, 71, 76],   # 0
       [94,  1, 77, 89, 45]])  # 0
In [61]: a1 = np.random.choice(np.arange(0,100),5, replace=False)
In [63]: a1
Out[63]: array([63, 54, 20, 60, 25])
To get just the count of common elements per row, build a boolean mask of matches with np.isin and sum it along each row (arr2D and arr1D stand for a2 and a1 from the question):
np.isin(arr2D, arr1D).sum(axis=1)
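As a quick check, here is a minimal sketch that applies the one-liner to the sample data from the question and compares it with the loop-based intersect1d approach. The names mask, counts and loop_counts are only illustrative. Note that np.isin was added in NumPy 1.13; older versions would need np.in1d plus a reshape.

```python
import numpy as np

# Sample data copied from the question.
arr2D = np.array([
    [50,  5, 25, 40, 19],
    [43, 37, 21, 55, 11],
    [16, 49,  6, 86, 96],
    [80, 66, 87, 51, 64],
    [42,  7, 20, 24, 74],
    [92, 63, 75, 54, 90],
    [ 9, 91, 88, 85, 22],
    [ 4, 65, 97, 93, 53],
    [18,  0, 57, 71, 76],
    [94,  1, 77, 89, 45],
])
arr1D = np.array([63, 54, 20, 60, 25])

# Boolean mask with the same shape as arr2D: True where an element occurs in arr1D.
mask = np.isin(arr2D, arr1D)

# Summing the mask along each row gives the per-row match count.
counts = mask.sum(axis=1)
print(counts)  # [1 0 0 0 1 2 0 0 0 0]

# Cross-check against the loop-based approach mentioned in the question.
loop_counts = np.array([len(np.intersect1d(row, arr1D)) for row in arr2D])
assert (counts == loop_counts).all()
```

The two results agree here because no row contains repeated values; with duplicates in a row, the mask sum counts every occurrence while intersect1d counts each value once.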
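If you want to count each unique element only once when it occurs more than once in a row, and the input elements are positive integers, a few more steps are needed. The values must be strictly positive because non-matching entries are set to 0 and the 0 bin is then discarded: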
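# https://stackoverflow.com/a/46256361/ @Divakar
def bincount2D_vectorized(a):
    # Shift each row into its own block of N bins so one bincount call handles all rows.
    N = a.max() + 1
    a_offs = a + np.arange(a.shape[0])[:, None] * N
    return np.bincount(a_offs.ravel(), minlength=a.shape[0] * N).reshape(-1, N)

# Non-matching entries become 0; drop bin 0, then count the non-empty bins per row.
count = (bincount2D_vectorized(np.isin(arr2D, arr1D) * arr2D)[:, 1:] != 0).sum(1)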
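To see the difference between the two counts, here is a small sketch on made-up inputs with repeated values in each row. The arrays and the names plain, unique_counts and loop_counts are hypothetical and not taken from the question; the helper is repeated so the snippet runs on its own.

```python
import numpy as np

def bincount2D_vectorized(a):
    # Same helper as above: per-row bincount done in a single call.
    N = a.max() + 1
    a_offs = a + np.arange(a.shape[0])[:, None] * N
    return np.bincount(a_offs.ravel(), minlength=a.shape[0] * N).reshape(-1, N)

# Hypothetical inputs: positive integers, with duplicates inside each row.
arr2D = np.array([[3, 3, 5, 7],
                  [2, 2, 2, 9],
                  [1, 4, 4, 6]])
arr1D = np.array([3, 4, 2, 7])

mask = np.isin(arr2D, arr1D)

# Plain mask sum: every matching occurrence is counted.
plain = mask.sum(axis=1)
print(plain)  # [3 3 2]

# Zero out non-matches, histogram each row, ignore bin 0, count non-empty bins.
unique_counts = (bincount2D_vectorized(mask * arr2D)[:, 1:] != 0).sum(1)
print(unique_counts)  # [2 1 1]

# intersect1d returns unique common values, so a loop gives the same numbers.
loop_counts = np.array([len(np.intersect1d(row, arr1D)) for row in arr2D])
assert (unique_counts == loop_counts).all()
```

One trade-off to keep in mind: the intermediate histogram has shape (rows, max value + 1), so this approach suits small integer ranges better than very large values.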