
Pandas/Numpy NaN None comparison


In Python, pandas, and NumPy, why do comparisons involving NaN and None give different results?

from pandas import Series
from numpy import nan as NaN  # the NaN alias itself was removed in NumPy 2.0

NaN is not equal to NaN

>>> NaN == NaN
False

but a list or tuple containing NaN compares equal:

>>> [NaN] == [NaN], (NaN,) == (NaN,)
(True, True)

whereas Series containing NaN are, again, not equal:

>>> Series([NaN]) == Series([NaN])
0    False
dtype: bool

The same pattern shows up with None:

>>> None == None, [None] == [None]
(True, True)

whereas for Series:

>>> Series([None]) == Series([None])
0    False
dtype: bool

A related answer explains why NaN == NaN is False in general, but it does not explain the behaviour inside Python and pandas collections.


Solution

  • As explained in related answers and in the Python docs, when sequences are compared for equality,

    element identity is compared first, and element comparison is performed only for distinct elements.

    np.nan and np.NaN refer to the same object (np.nan is np.NaN evaluates to True; the np.NaN alias was removed in NumPy 2.0), so [np.nan] == [np.nan] is True: the identity check succeeds and == is never evaluated for the element. By contrast, float('nan') creates a new object on every call, so [float('nan')] == [float('nan')] is False.
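The identity-first rule can be checked without NumPy at all. The sketch below is only an illustration: list_eq is a hypothetical helper that roughly models how list equality treats each pair of elements, not the interpreter's actual implementation.

```python
# One NaN object reused on both sides: the identity check succeeds,
# so == is never consulted for the element.
nan = float("nan")
same_object = [nan] == [nan]

# Two distinct NaN objects: identity fails, then nan == nan is False.
distinct_objects = [float("nan")] == [float("nan")]


def list_eq(a, b):
    """Rough model of list equality: identity first, then ==."""
    return len(a) == len(b) and all(x is y or x == y for x, y in zip(a, b))


print(same_object)       # True
print(distinct_objects)  # False
print(list_eq([nan], [nan]))                    # True
print(list_eq([float("nan")], [float("nan")]))  # False
```

The same reasoning applies to tuples, which is why (NaN,) == (NaN,) is True whenever both sides hold the same NaN object.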

    pandas does not take this identity shortcut here, so both eq and == report NaN as unequal:

    >>> import numpy as np
    >>> import pandas as pd
    >>> pd.Series([np.nan]).eq(pd.Series([np.nan]))[0], (pd.Series([np.nan]) == pd.Series([np.nan]))[0]
    (False, False)
    

    However, the dedicated equals method treats NaNs in the same location as equal:

    >>> pd.Series([np.nan]).equals(pd.Series([np.nan]))
    True
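equals returns a single boolean for the whole Series. If a per-element result is wanted instead, one possible approach (a minimal sketch, not a pandas built-in) is to combine == with isna; the name nan_aware is purely illustrative.

```python
import numpy as np
import pandas as pd

a = pd.Series([1.0, np.nan, 3.0])
b = pd.Series([1.0, np.nan, 4.0])

# Plain == reports NaN as unequal even when both sides are NaN.
plain = a == b

# Illustrative NaN-aware mask: equal values, or missing on both sides.
nan_aware = (a == b) | (a.isna() & b.isna())

print(plain.tolist())      # [True, False, False]
print(nan_aware.tolist())  # [True, True, False]
print(a.equals(b))         # False: the values at index 2 differ
print(a.equals(a.copy()))  # True: NaNs in the same location match
```

This assumes both Series share the same index; with differing indexes, == raises an error for Series operands.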
    

    None is treated differently. NumPy considers two None values equal:

    >>> pd.Series([None, None]).values == (pd.Series([None, None])).values
    array([ True,  True])
    

    while pandas does not:

    >>> pd.Series([None, None]) == (pd.Series([None, None]))
    0    False
    1    False
    dtype: bool
    

    There is also an inconsistency between the == operator and the eq method, which has been discussed separately:

    >>> pd.Series([None, None]).eq(pd.Series([None, None]))
    0    True
    1    True
    dtype: bool
    

    Tested with pandas 0.23.4 and NumPy 1.15.0.