python, python-3.x, python-datamodel

Why is the __init__ method of Counter referred to as a descriptor?


I was reading the __init__ method of the Counter class, and saw this:

if not args:
    raise TypeError("descriptor '__init__' of 'Counter' object "
                    "needs an argument")

I wasn't sure what it meant by descriptor, so I checked the Python data model documentation and found this:

In general, a descriptor is an object attribute with “binding behavior”, one whose attribute access has been overridden by methods in the descriptor protocol: __get__(), __set__(), and __delete__(). If any of those methods are defined for an object, it is said to be a descriptor.

None of those methods seems to be present in the class definition, so why is __init__ referred to as a descriptor?


Solution

  • In Python, all functions are descriptors, including __init__. This is how a method knows what self is when it is used in a class: looking the function up on an instance calls the function's __get__, which returns a method bound to that instance. For example, if I define a function (foo) and look at its attributes, I see that foo has a __get__ method, which makes it adhere to the descriptor protocol:

    >>> def foo():
    ...   pass
    ... 
    >>> dir(foo)
    ['__annotations__', '__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__get__', '__getattribute__', '__globals__', '__gt__', '__hash__', '__init__', '__kwdefaults__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', '__qualname__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
    >>> '__get__' in dir(foo)
    True
    

    So the terminology used there is at least accurate. It could be debated whether that's the best term to use...
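
    To make the binding step concrete, here is a small sketch that calls a function's __get__ by hand and compares the result with ordinary attribute access. The class Greeter and the function greet are made-up names used only for illustration:

```python
import types


class Greeter:
    """Hypothetical class, used only for illustration."""

    def __init__(self, name):
        self.name = name


def greet(self):
    # A plain module-level function; nothing ties it to Greeter yet.
    return "hello, " + self.name


g = Greeter("world")

# Calling the descriptor method by hand binds the function to g.
bound = greet.__get__(g, Greeter)
print(type(bound) is types.MethodType)
print(bound.__self__ is g)
print(bound())

# The same binding happens implicitly once the function lives on the class.
Greeter.greet = greet
print(g.greet())
```

    The manual __get__ call and the normal g.greet() lookup should give the same result, because attribute access on an instance goes through the function's __get__.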

    I might have called it a "bound method" instead of a descriptor, but in Python 3.x the distinction between regular functions, bound methods and unbound methods is a little muddier (unbound methods are just regular functions in Python 3.x).
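
    One way to see that distinction in Python 3, as a sketch using only the standard library: looking __init__ up on the class gives back a plain function, while looking it up on an instance gives a bound method.

```python
import types
from collections import Counter

c = Counter()

# Looked up on the class: a plain function (Python 3 has no "unbound method" wrapper).
print(type(Counter.__init__))
print(isinstance(Counter.__init__, types.FunctionType))

# Looked up on an instance: the function's __get__ has produced a bound method.
print(type(c.__init__))
print(c.__init__.__self__ is c)
```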


    Of course, I could use a different type of descriptor to initialize my Counter subclass ...

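    from collections import Counter

    class MyDescriptor(object):
        def __get__(self, inst, cls):
            # This is a really useless descriptor! It only delegates to the
            # function Counter.__init__, binding it to inst.
            return Counter.__init__.__get__(inst, cls)
    
    class MyCounter(Counter):
        __init__ = MyDescriptor()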
    

    If __init__ were set up this way and the error were raised, the wording of the message would be more accurate, but this is a pretty contrived case that I wouldn't expect to come up very often.
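
    As a sanity check that such a subclass still initializes normally, here is a self-contained sketch that repeats the two classes above and adds a hypothetical calls list (not part of the original example) to record when the descriptor's __get__ runs:

```python
from collections import Counter

calls = []  # hypothetical bookkeeping, only here to make the lookup visible


class MyDescriptor:
    def __get__(self, inst, cls):
        # Record that the descriptor was consulted and for which class.
        calls.append((inst is not None, cls.__name__))
        # Delegate to the real function, binding it to inst.
        return Counter.__init__.__get__(inst, cls)


class MyCounter(Counter):
    __init__ = MyDescriptor()


mc = MyCounter("aab")
print(mc["a"], mc["b"])
print(calls)
```

    Creating the instance goes through MyDescriptor.__get__ first, and the bound method it returns then does the usual Counter initialization.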

    To really know what Raymond was thinking when he wrote that code, I suppose you'd have to ask him (or go spelunking in the hg commit history and hope he mentioned it in a commit message).