Tags: python, language-design, language-history

Python: Why does the int class not have rich comparison operators like `__lt__()`?


Mostly curious.

I've noticed (at least in Python 2.6 and 2.7) that a float has all the familiar rich comparison methods: __lt__(), __gt__(), __eq__(), etc.

>>> (5.0).__gt__(4.5)
True

but an int does not

>>> (5).__gt__(4)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
AttributeError: 'int' object has no attribute '__gt__'

Which is odd to me, because the operator itself works fine

>>> 5 > 4
True

Even strings support the comparison functions

>>> "hat".__gt__("ace")
True

but all the int has is __cmp__()

Seems strange to me, and so I was wondering why this came to be.

Just tested and it works as expected in Python 3, so I am assuming there are some legacy reasons. Still would like to hear a proper explanation though ;)


Solution

  • If we look at PEP 207 (Rich Comparisons), there is this interesting sentence right at the end:

    The inlining already present which deals with integer comparisons would still apply, resulting in no performance cost for the most common cases.

    So it seems that in 2.x there is an optimisation for integer comparison. If we take a look at the interpreter source code (the COMPARE_OP case in ceval.c) we can find this:

    case COMPARE_OP:
        w = POP();
        v = TOP();
        if (PyInt_CheckExact(w) && PyInt_CheckExact(v)) {
            /* INLINE: cmp(int, int) */
            register long a, b;
            register int res;
            a = PyInt_AS_LONG(v);
            b = PyInt_AS_LONG(w);
            switch (oparg) {
            case PyCmp_LT: res = a <  b; break;
            case PyCmp_LE: res = a <= b; break;
            case PyCmp_EQ: res = a == b; break;
            case PyCmp_NE: res = a != b; break;
            case PyCmp_GT: res = a >  b; break;
            case PyCmp_GE: res = a >= b; break;
            case PyCmp_IS: res = v == w; break;
            case PyCmp_IS_NOT: res = v != w; break;
            default: goto slow_compare;
            }
            x = res ? Py_True : Py_False;
            Py_INCREF(x);
        }
        else {
          slow_compare:
            x = cmp_outcome(oparg, v, w);
        }
    

    In other words, 2.x already had a performance optimisation: when both operands are exactly ints, the C code compares the underlying values directly instead of calling any comparison method on the objects. It seems this would not have been preserved if the rich comparison operators had been implemented on int.

    In Python 3, __cmp__ is no longer supported, so the rich comparison operators must be there. As far as I can tell, this does not cause a performance hit. For example, compare:

    Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05) 
    [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import timeit
    >>> timeit.timeit("2 < 1")
    0.06980299949645996
    

    to:

    Python 3.2.3 (v3.2.3:3d0686d90f55, Apr 10 2012, 11:25:50) 
    [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import timeit
    >>> timeit.timeit("2 < 1")
    0.06682920455932617
    

    So it seems that similar optimisations are in place in 3.x. My guess is that the judgement call was that putting all of this into the 2.x branch would have been too great a change while backwards compatibility was a consideration.

    In 2.x if you want something like the rich comparison methods you can get at them via the operator module:

    >>> import operator
    >>> operator.gt(2,1)
    True
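
As a small sketch of that workaround: the functions in the operator module return the same results as the comparison operators themselves, so they can stand in for the missing methods whether or not the type exposes __gt__ as an attribute. The names COMPARISONS and compare below are made up for illustration; only the operator functions and hasattr come from Python itself.

```python
import operator

# Hypothetical lookup table (not part of any library): maps each
# comparison symbol to the standard-library function implementing it.
COMPARISONS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
}


def compare(symbol, a, b):
    """Apply the comparison named by `symbol` to a and b."""
    return COMPARISONS[symbol](a, b)


# Whether int exposes the method directly depends on the version:
# the question reports an AttributeError on 2.6/2.7, while 3.x has it.
int_has_rich_methods = hasattr(int, "__gt__")

print(compare(">", 2, 1))          # True
print(compare("<", 2, 1))          # False
print(compare(">", "hat", "ace"))  # True, same as "hat" > "ace"
print(int_has_rich_methods)
```

Going through operator rather than calling (5).__gt__(4) directly keeps the code independent of which special methods a given type happens to define.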