
Python: equality for NaN in a list?


I just want to figure out the logic behind these results:

>>> nan = float('nan')
>>> nan == nan
False
# I understand that this is because the __eq__ method is defined this way
>>> nan in [nan]
True
# Is this because the __contains__ method for list compares identity first, and only then the content?

But in both cases I think PyObject_RichCompareBool is called behind the scenes, right? Why is there a difference? Shouldn't they have the same behaviour?


Solution

  • But in both cases I think PyObject_RichCompareBool is called behind the scenes, right? Why is there a difference? Shouldn't they have the same behaviour?

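    == never calls PyObject_RichCompareBool on the float objects directly. Floats have their own rich-comparison function, float_richcompare (the implementation behind __eq__), which may or may not call PyObject_RichCompareBool depending on the arguments passed to it. When both operands are floats, it compares the underlying C doubles, so a NaN never compares equal to anything, including itself: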

    /* Comparison is pretty much a nightmare.  When comparing float to float,
     * we do it as straightforwardly (and long-windedly) as conceivable, so
     * that, e.g., Python x == y delivers the same result as the platform
     * C x == y when x and/or y is a NaN.
     * When mixing float with an integer type, there's no good *uniform* approach.
     * Converting the double to an integer obviously doesn't work, since we
     * may lose info from fractional bits.  Converting the integer to a double
     * also has two failure modes:  (1) a long int may trigger overflow (too
     * large to fit in the dynamic range of a C double); (2) even a C long may have
     * more bits than fit in a C double (e.g., on a 64-bit box long may have
     * 63 bits of precision, but a C double probably has only 53), and then
     * we can falsely claim equality when low-order integer bits are lost by
     * coercion to double.  So this part is painful too.
     */
    
    static PyObject*
    float_richcompare(PyObject *v, PyObject *w, int op)
    {
        double i, j;
        int r = 0;
    
        assert(PyFloat_Check(v));
        i = PyFloat_AS_DOUBLE(v);
    
        /* Switch on the type of w.  Set i and j to doubles to be compared,
         * and op to the richcomp to use.
         */
        if (PyFloat_Check(w))
            j = PyFloat_AS_DOUBLE(w);
    
        else if (!Py_IS_FINITE(i)) {
            if (PyInt_Check(w) || PyLong_Check(w))
                /* If i is an infinity, its magnitude exceeds any
                 * finite integer, so it doesn't matter which int we
                 * compare i with.  If i is a NaN, similarly.
                 */
                j = 0.0;
            else
                goto Unimplemented;
        }
    ...
    

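    On the other hand, list_contains calls PyObject_RichCompareBool directly on the items. For an equality test, that function returns true as soon as both operands are the same object, without ever calling __eq__, hence you get True in the second case.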
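
    The difference between the two paths can be sketched in pure Python. This is only a rough model of what CPython's list containment does, not its actual implementation; the helper name contains and the variable other_nan are made up for illustration.

```python
def contains(items, target):
    """Rough model of CPython's list containment: identity first, then ==."""
    for item in items:
        # Identity shortcut: the same object counts as a match without
        # calling __eq__. Only if that fails is == consulted.
        if item is target or item == target:
            return True
    return False


nan = float('nan')
other_nan = float('nan')  # a second, distinct NaN object

same_object_found = contains([nan], nan)          # identity shortcut matches
other_object_found = contains([nan], other_nan)   # falls back to ==, which is False for NaN

print(same_object_found)   # True
print(other_object_found)  # False
```

    Under this model, nan in [nan] is True only because the very same object is stored in the list; a different NaN object is not found.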


    Note that this is specific to CPython. PyPy's list.__contains__ (at least in version 2.4.0, shown below) seems to compare the items only by calling their __eq__ method:

    $~/pypy-2.4.0-linux64/bin# ./pypy
    Python 2.7.8 (f5dcc2477b97, Sep 18 2014, 11:33:30)
    [PyPy 2.4.0 with GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>> nan = float('nan')
    >>>> nan == nan
    False
    >>>> nan is nan
    True
    >>>> nan in [nan]
    False
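
    Since the result of nan in [nan] differs between the two interpreters shown here, code that needs the same answer everywhere may be better off not relying on in for NaN at all. A minimal sketch, assuming the goal is simply "does this sequence hold any NaN float"; the helper name contains_nan is hypothetical:

```python
import math


def contains_nan(items):
    """Return True if any element is a float NaN.

    Uses math.isnan instead of `in`, so the answer does not depend on
    whether the container checks identity before equality.
    """
    return any(isinstance(x, float) and math.isnan(x) for x in items)


nan = float('nan')
print(contains_nan([1.0, nan]))           # True
print(contains_nan([1.0, float('nan')]))  # True, even for a different NaN object
print(contains_nan([1.0, 2, 'a']))        # False
```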