python numpy match intersect

Python find values in numpy array2 based on values in array1


Let's say I have three arrays:

array1 = ['1', '2', '3', '2']
array2 = ['6', '2', '7', '3']
array3 = ['a', 'b', 'c', 'd']

I would like to iterate through each element in array1 and, if and only if that element also appears in array2, record the index at which it appears in array2. In other words, I would like the output to be:

array2_in_1 = ['1' '3' '1']

'1' is not in array2, so it is skipped.

'2' is at index 1 of array2 (zero-based), so '1' is recorded.

'3' is at index 3 of array2, so '3' is recorded.

The second '2' again matches index 1 of array2, so '1' is repeated.

Using these indices, I would then like to pull the corresponding elements from array3:

array3_in_2 = ['b' 'd' 'b']

Functions like intersect return only the unique values, so any duplicates are removed (which is not what is wanted here).
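To illustrate the point about duplicates, here is a small sketch. It assumes "intersect" refers to NumPy's `np.intersect1d`; the question does not name the exact function. `np.isin` is included only for contrast: it keeps repeated elements, but it yields a mask over array1 rather than positions in array2.

```python
import numpy as np

array1 = np.array(['1', '2', '3', '2'])
array2 = np.array(['6', '2', '7', '3'])

# intersect1d returns the sorted, unique common values, so the repeated '2' is lost.
common = np.intersect1d(array1, array2)

# isin keeps duplicates, but only says which elements of array1 occur in array2.
mask = np.isin(array1, array2)
kept = array1[mask]
```

Neither call gives the per-element positions in array2 that are needed to index array3.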

I have something along the lines of:

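import numpy as np

# array1, array2 and array3 are assumed to be NumPy arrays, e.g. np.array([...])
vals = []
for i in array1:
    idx = np.where(array2 == i)  # positions in array2 whose value equals i
    if len(vals) == 0:
        vals = array3[idx]  # take the matching entries from array3
    else:
        vals = np.concatenate((vals, array3[idx]))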

This gives the correct answer, but is there a more Pythonic or faster way to do this?


Solution

  • You can do this without looping:

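    import numpy as np

    def ind():
        array1 = np.array(['1', '2', '3', '2'])
        array2 = np.array(['6', '2', '7', '3'])
        array3 = np.array(['a', 'b', 'c', 'd'])
        # Broadcast-compare every array2 element against every array1 element,
        # transpose so rows follow array1, then take the array2 positions of the matches.
        return array3[(array1 == array2[:,np.newaxis]).T.nonzero()[1]]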
    

    Results:

    >>> ind()
    array(['b', 'd', 'b'], 
          dtype='<U1')
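The one-liner above can be unpacked into named steps. The following is a sketch of the same broadcasting idea, not a different method. The names `matches`, `rows` and `cols` are introduced here for illustration, and array1 is expanded instead of array2 so that no transpose is needed.

```python
import numpy as np

array1 = np.array(['1', '2', '3', '2'])
array2 = np.array(['6', '2', '7', '3'])
array3 = np.array(['a', 'b', 'c', 'd'])

# Compare every element of array1 against every element of array2.
# matches[i, j] is True when array1[i] == array2[j];
# its shape is (len(array1), len(array2)).
matches = array1[:, np.newaxis] == array2

# nonzero() scans the matrix row by row, so hits come out in array1 order.
# rows holds positions in array1, cols holds the matching positions in array2.
rows, cols = matches.nonzero()

# Index array3 with the array2 positions found above.
array3_in_2 = array3[cols]
```

Here `cols` holds the integer positions 1, 3, 1 that the question calls array2_in_1. The intermediate boolean matrix has len(array1) × len(array2) entries, so memory use grows with the product of the two lengths.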