Tags: python, pandas, numpy, python-datetime

Numpy.isin not detecting presence of a datetime in a vector of datetime64[ns] elements, while equality works


I have an array of datetime objects, d1. I would like to check whether each of its elements is present in a collection of datetime values, d2 (a NumPy array, a pandas Series, or similar).

One element of d1 is equal to an element of d2, and the in operator confirms that it is in d2, but numpy.isin does not detect it. What am I missing?

If any manipulation is needed, I'd prefer it to be done on d2 (ideally, d1 should remain unchanged).

import numpy as np
import datetime
import pandas as pd

d1 = np.array([datetime.datetime(2021, 1, 1, 0, 0),
               datetime.datetime(2021, 1, 1, 12, 0)])
d2 = pd.to_datetime(['2021-01-01 12:00:00', '2021-01-02 12:00:00'])
print(d1)
# [datetime.datetime(2021, 1, 1, 0, 0) datetime.datetime(2021, 1, 1, 12, 0)]
print(d2)
# DatetimeIndex(['2021-01-01 12:00:00', '2021-01-02 12:00:00'], 
#               dtype='datetime64[ns]', freq=None)


print(d1[1] == d2[0])
# True
print(d1[1] in d2)
# True
print(np.isin(d1, d2))
# [False False] # !!! What is happening here?
# Expected result:
# [False True]

Solution

  • A pandas solution that works for me is to compare both as DatetimeIndex objects using Index.isin:

    print(pd.DatetimeIndex(d1).isin(d2))
    # [False  True]
    

    As @juanpa.arrivillaga mentioned, np.isin needs both inputs to have the same type here, so convert d1 to datetime64 and then compare:

    print(d1.astype('datetime64[us]'))
    # ['2021-01-01T00:00:00.000000' '2021-01-01T12:00:00.000000']
    # another way of converting, this time to nanosecond precision
    print(d1.astype('<M8[ns]'))
    

    print(np.isin(d1.astype('datetime64[us]'), d2))
    # [False  True]

    print(np.isin(d1.astype('<M8[ns]'), d2))
    # [False  True]
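
The question asks for any conversion to be applied to d2 rather than d1, so here is a minimal sketch of that direction. It assumes d2 is a DatetimeIndex, as in the question, and uses DatetimeIndex.to_pydatetime() to obtain an object array of datetime.datetime values. The name d2_py is only illustrative. The two dtype prints show the type mismatch that appears to be what trips up np.isin in the original call.

```python
import datetime

import numpy as np
import pandas as pd

d1 = np.array([datetime.datetime(2021, 1, 1, 0, 0),
               datetime.datetime(2021, 1, 1, 12, 0)])
d2 = pd.to_datetime(['2021-01-01 12:00:00', '2021-01-02 12:00:00'])

# d1 holds Python datetime objects, so NumPy stores it with dtype=object,
# while d2 converts to a datetime64 array (datetime64[ns] in the question).
print(d1.dtype)
print(np.asarray(d2).dtype)

# Convert d2 (not d1) to an object array of datetime.datetime values,
# so that both inputs to np.isin hold the same kind of element.
d2_py = d2.to_pydatetime()  # d2_py is a hypothetical helper name
result = np.isin(d1, d2_py)
print(result)
# [False  True]
```

With object arrays on both sides, np.isin falls back to element-by-element equality, which is likely slower than the datetime64 route on large inputs but leaves d1 untouched.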