Search code examples
pythonpython-3.xmethodsbuilt-inpython-descriptors

Is there a built-in way to use CPython built-ins to make an arbitrary callable behave as an unbound class method?


In Python 2, it was possible to convert arbitrary callables to methods of a class. Importantly, if the callable was a CPython built-in implemented in C, you could use this to make methods of user-defined classes that were C layer themselves, invoking no byte code when called.

This is occasionally useful if you're relying on the GIL to provide "lock-free" synchronization; since the GIL can only be swapped out between op codes, if all the steps in a particular part of your code can be pushed to C, you can make it behave atomically.

In Python 2, you could do something like this:

import types
from operator import attrgetter
class Foo(object):
    ... This class maintains a member named length storing the length...

    def __len__(self):
        return self.length  # We don't want this, because we're trying to push all work to C

# Instead, we explicitly make an unbound method that uses attrgetter to achieve
# the same result as above __len__, but without no byte code invoked to satisfy it
Foo.__len__ = types.MethodType(attrgetter('length'), None, Foo)

In Python 3, there is no longer an unbound method type, and types.MethodType only takes two arguments and creates only bound methods (which is not useful for Python special methods like __len__, __hash__, etc., since special methods are often looked up directly on the type, not the instance).

Is there some way of accomplishing this in Py3 that I'm missing?

Things I've looked at:

  1. functools.partialmethod (appears to not have a C implementation, so it fails the requirements, and between the Python implementation and being much more general purpose than I need, it's slow, taking about 5 us in my tests, vs. ~200-300 ns for direct Python definitions or attrgetter in Py2, a roughly 20x increase in overhead)
  2. Trying to make attrgetter or the like follow the non-data descriptor protocol (not possible AFAICT, can't monkey-patch in a __get__ or the like)
  3. Trying to find a way to subclass attrgetter to give it a __get__, but of course, the __get__ needs to be delegated to C layer somehow, and now we're back where we started
  4. (Specific to attrgetter use case) Using __slots__ to make the member a descriptor in the first place, then trying to somehow convert from the resulting descriptor for the data into something that skips the final step of binding and acquiring the real value to something that makes it callable so the real value retrieval is deferred

I can't swear I didn't miss something for any of those options though. Anyone have any solutions? Total hackery is allowed; I recognize I'm doing pathological things here. Ideally it would be flexible (to let you make something that behaves like an unbound method out of a class, a Python built-in function like hex, len, etc., or any other callable object not defined at the Python layer). Importantly, it needs to attach to the class, not each instance (both to reduce per-instance overhead, and to work correctly for dunder special methods, which bypass instance lookup in most cases).


Solution

  • Found a (probably CPython only) solution to this recently. It's a little ugly, being a ctypes hack to directly invoke CPython APIs, but it works, and gets the desired performance:

    import ctypes
    from operator import attrgetter
    
    make_instance_method = ctypes.pythonapi.PyInstanceMethod_New
    make_instance_method.argtypes = (ctypes.py_object,)
    make_instance_method.restype = ctypes.py_object
    
    class Foo:
        # ... This class maintains a member named length storing the length...
    
        # Defines a __len__ method that, at the C level, fetches self.length
        __len__ = make_instance_method(attrgetter('length'))
    

    It's an improvement over the Python 2 version in one way, since, as it doesn't need the class to be defined to make an unbound method for it, you can define it in the class body by simple assignment (where the Python 2 version must explicitly reference Foo twice in Foo.__len__ = types.MethodType(attrgetter('length'), None, Foo), and only after class Foo has finished being defined).

    On the other hand, it doesn't actually provide a performance benefit on CPython 3.7 AFAICT, at least not for the simple case here where it's replacing def __len__(self): return self.length; in fact, for __len__ accessed via len(instance) on an instance of Foo, ipython %%timeit microbenchmarks show len(instance) is ~10% slower when __len__ is defined via __len__ = make_instance_method(attrgetter('length')), . This is likely an artifact of attrgetter itself having slightly higher overhead due to CPython not having moved it to the "FastCall" protocol (called "Vectorcall" in 3.8 when it was made semi-public for provisional third-party use), while user-defined functions already benefit from it in 3.7, as well as having to dynamically choose whether to perform dotted or undotted attribute lookup and single or multiple attribute lookup each time (which Vectorcall might be able to avoid by choosing a __call__ implementation appropriate to the gets being performed at construction time) adds more overhead that the plain method avoids. It should win for more complicated cases (say, if the attribute to be retrieved is a nested attribute like self.contained.length), since attrgetter's overhead is largely fixed, while nested attribute lookup in Python means more byte code, but right now, it's not useful very often.

    If they ever get around to optimizing operator.attrgetter for Vectorcall, I'll rebenchmark and update this answer.