
Documentation for PyCFunction_New / PyCFunction_NewEx


I'm struggling to understand some PyCXX code (C++ Python wrapper) that revolves around PyCFunction_New.

Could someone explain how this function works?

(I can't figure it out from the CPython source code.)


Here I will detail the problem I'm having. I've ruled a line above, because this probably won't be of such general use.

Reason for asking is that I'm dealing with weird code. I've got a keyword-method handler function:

    static PyObject* keyword_handler( PyObject* _self_and_name_tuple, 
                                      PyObject* _args, 
                                      PyObject* _keywords ) { }

It is getting stored as:

    PyMethodDef meth_def;
    meth_def.ml_meth = reinterpret_cast<PyCFunction>( _handler );
    meth_def.ml_flags = METH_VARARGS | METH_KEYWORDS;

Then it is getting bundled into a PyCFunction_New:

        MethodDefExt<T>* method_def_ext = ...;

        Tuple args{2}; // Tuple wraps a CPython Tuple
        args[0] = Object{ this };
        args[1] = Object{ PyCapsule_New( (void*)method_def_ext, nullptr, nullptr ), true };

        PyObject* func = PyCFunction_New( & method_def_ext->meth_def, args.ptr() );

        return Object(func, true);
    }

Am I right in assuming that CPython will take care of typecasting it back to a 3-param function, where the first param is args (which matches the handler's _self_and_name_tuple first param)?

And CPython would only know from the fact that it is having to parse: 'myFunc(7, a=1)' that it is in fact dealing with a keywords a.k.a. 3-param function?

This doesn't look right.

Maybe CPython is typecasting args[1] back to a PyMethodDef, and then inspecting its .ml_flags.

If that's happening then I need to know, because the code I'm working with simply has:

template<class T>
class MethodDefExt //: public PyMethodDef <-- I commented this out
{
    // ... Constructors ...

    PyMethodDef               meth_def;

    method_noargs_function_t  ext_noargs_function  = nullptr;
    method_varargs_function_t ext_varargs_function = nullptr;
    method_keyword_function_t ext_keyword_function = nullptr;

    Object                    py_method;
};

In its original form, I think it must have had two copies of PyMethodDef, and the first one never got touched because it was the base class.

If this is really happening, i.e. if this class is indeed getting typecast back to PyMethodDef by the internals of PyCFunction_New, then this is dodgy.

Surely someone could add a member variable at the front of MethodDefExt, and then the typecasting would break. This is flimsy...
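
One way to probe this worry without reading CPython: the call site shown earlier passes `& method_def_ext->meth_def`, which is the address of the member, not a cast of the whole class. Below is a minimal standalone sketch of that distinction. `FakeMethodDef`, `FakeMethodDefExt` and `read_flags` are hypothetical stand-ins invented for illustration; this does not use the real Python headers and is not PyCXX's code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for PyMethodDef (not the real CPython struct).
struct FakeMethodDef {
    const char* ml_name;
    void*       ml_meth;
    int         ml_flags;
    const char* ml_doc;
};

// Hypothetical stand-in for MethodDefExt, with an extra member
// deliberately placed in front of meth_def.
struct FakeMethodDefExt {
    int           extra_member_in_front;
    FakeMethodDef meth_def;
};

// Stand-in for a C API that only ever sees a pointer to the inner struct.
int read_flags(const FakeMethodDef* ml) {
    assert(ml != nullptr);
    return ml->ml_flags;
}

// Passes the address of the member, as the call site in the question does.
int flags_via_member_address(const FakeMethodDefExt& ext) {
    return read_flags(&ext.meth_def);
}

// Byte offset of meth_def inside the wrapper; nonzero here because of
// the extra member in front.
std::size_t meth_def_offset() {
    return offsetof(FakeMethodDefExt, meth_def);
}
```

In this sketch the flags are read correctly even though `meth_def` no longer sits at offset zero, because the callee receives the member's own address. If the same holds for the real code, a member added at the front of MethodDefExt would only be a problem for code that casts the whole object, not for `& method_def_ext->meth_def`.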


The class I am dealing with allows the future C++ coder to implement a custom Python Type, and within this type, to implement methods that can be called from Python.

So they derive MyExt : CustomExt and write the method:

// one of these three
MyExt::foo(){...} 
MyExt::foo(PyObject* args){...}
MyExt::foo(PyObject* args, PyObject* kw){...}

Now they have to store this method in a lookup, by calling the appropriate one of these three functions:

    typedef Object (T::*method_noargs_function_t)();
    static void add_noargs_method( const char* name, 
                                   method_noargs_function_t function ) {
        lookup()[std::string{name}] = new MethodDefExt<T> 
                                   {name,function,noargs_handler,doc};
    }

    typedef Object (T::*method_varargs_function_t)( const Tuple& args );
    static void add_varargs_method( const char* name, 
                                    method_varargs_function_t function ) {
        lookup()[std::string{name}] = new MethodDefExt<T> 
                                    {name,function,varargs_handler,doc};
    }

    typedef Object (T::*method_keyword_function_t)( const Tuple& args, const Dict& kws );
    static void add_keyword_method( const char* name, 
                                    method_keyword_function_t function ) {
        lookup()[std::string{name}] = new MethodDefExt<T> 
                                    {name,function,keyword_handler,doc};
    }

Notice there is an associated handler function for each. These handler functions are static methods of CustomExt, because a pointer to a static method can be called from CPython, i.e. it is just a standard C-style function pointer.
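
The static-handler idea can be sketched in isolation. This is a simplified model under stated assumptions: `Widget`, `MethodEntry`, `noargs_trampoline` and `call_from_c` are hypothetical names, and `void*` stands in for the `PyObject*` arguments, so this is not the actual PyCXX handler.

```cpp
#include <cassert>

// Hypothetical extension class; stands in for a user type derived from CustomExt.
struct Widget {
    int calls = 0;
    int foo() { ++calls; return 42; }
};

// Pointer-to-member type, analogous to method_noargs_function_t.
using noargs_member_t = int (Widget::*)();

// What the lookup might store for one method (hypothetical, simplified).
struct MethodEntry {
    noargs_member_t function;
};

// C-style handler type: a plain function pointer that C code can call.
using c_handler_t = int (*)(void* self, void* entry);

// Static trampoline: recovers the instance and the member pointer from
// opaque pointers, then performs the real member call.
int noargs_trampoline(void* self, void* entry) {
    assert(self != nullptr && entry != nullptr);
    auto* obj = static_cast<Widget*>(self);
    auto* e   = static_cast<MethodEntry*>(entry);
    return (obj->*(e->function))();
}

// Simulates the C side, which knows only the plain function pointer.
int call_from_c(c_handler_t handler, void* self, void* entry) {
    return handler(self, entry);
}
```

A pointer-to-member such as `&Widget::foo` cannot be handed to C code directly, which is why the plain function sits in between and carries the instance and the method entry as data.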

So when Python wants the pointer for this foo function, we intercept here:

    // turn a name into function object
    virtual Object getattr_methods( const char* _name )
    {
        std::string name{ _name };

        // see if name exists and get entry with method
        auto i = lookup().find( name );

        DBG_LINE( "packaging relevant C++ method and extension object instance into PyCFunction" );

        // assume name was found in the method map
        MethodDefExt<T>* method_def_ext = i->second;

        // this must be the _self_and_name_tuple that gets received
        //   as the first parameter by the handler
        Tuple args{2};

        args[0] = Object{ this };
        args[1] = Object{ PyCapsule_New( (void*)method_def_ext, nullptr, nullptr ), true };

Construct a Python function that will call the handler for this method, passing in this object (args[0]) and the details of the method itself (args[1]). The handler will take care of running the method while trapping errors.

Note that we don't execute the handler at this point. Instead, we return this Python function to the Python runtime. Maybe the Python coder didn't want the function executed, but just wanted to grab a reference to it: fp = MyExt.func

        PyObject* func = PyCFunction_New( & method_def_ext->meth_def, args.ptr() );

X (see below): & method_def_ext->meth_def pulls out the handler function, which is one of three handlers. However, thanks to MethodDefExt's constructors, they have all been typecast into PyCFunction pointers, which means the parameter list is wrong for the keyword handler.

        return Object(func, true);
    }

(I had to break out the comments as SO's formatter was not handling them as code comments)

What I'm struggling with is this: let's say foo is a function that takes keywords, so its signature will be:

MyExt::foo(PyObject* args, PyObject* kw)

The matching handler is one of these:

    static PyObject* noargs_handler( PyObject* _self_and_name_tuple, 
                                     PyObject*  ) { }

    static PyObject* varargs_handler( PyObject* _self_and_name_tuple, 
                                      PyObject* _args ) { }

    static PyObject* keyword_handler( PyObject* _self_and_name_tuple, 
                                      PyObject* _args, 
                                      PyObject* _keywords ) { }

i.e. the third one. I've read that Python supplies an extra first _self_and_name_tuple parameter.

When we register foo into the lookup, we supply this handler:

    typedef Object (T::*method_keyword_function_t)( const Tuple& args, const Dict& kws );
    static void add_keyword_method( const char* name, method_keyword_function_t function ) {
        methods()[std::string{name}] = new MethodDefExt<T> {name,function,keyword_handler,doc};
    }

And looking at the particular constructor of MethodDefExt,

    // VARARGS + KEYWORD
    MethodDefExt (
        const char* _name,
        method_keyword_function_t _function,
        method_keyword_call_handler_t _handler
    )
    {
        meth_def.ml_name = const_cast<char *>( _name );
        meth_def.ml_doc  = nullptr;
        meth_def.ml_meth = reinterpret_cast<PyCFunction>( _handler );
        meth_def.ml_flags = METH_VARARGS | METH_KEYWORDS;

        ext_noargs_function = nullptr;
        ext_varargs_function = nullptr;
        ext_keyword_function = _function;
    }

... it can be seen that the constructor typecasts this handler into a PyCFunction.

But a PyCFunction only takes two arguments!!!

typedef PyObject *(*PyCFunction)(PyObject *, PyObject *);

We are typecasting handlers into this. And these handlers have 2 or 3 parameters.

This looks really wrong.

Going back: when CPython looks up foo, as described above, we take this meth_def (which holds ml_meth) and feed it into PyCFunction_New:

        Tuple args{2};

        args[0] = Object{ this };
        args[1] = Object{ PyCapsule_New( (void*)method_def_ext, nullptr, nullptr ), true };

        PyObject* func = PyCFunction_New( & method_def_ext->meth_def, args.ptr() ); // https://github.com/python/cpython/blob/master/Objects/methodobject.c#L19-L48

So I can make a guess:

* the first parameter of PyCFunction_New must be a PyCFunction function pointer
* the second parameter must be a PyObject* _self_and_name_tuple

And we are feeding this back to CPython. My guess is that when CPython wants to use 'foo(7, a=1,b=2)' it will package 7 into args, a=1,b=2 into kwds, and call:

[the PyCFunction function pointer](_self_and_name_tuple, args, kwds)

Solution

  • I will hazard an answer:

    PyObject* PyCFunction_New(PyMethodDef* ml, PyObject* data)
    

    PyCFunction_New probably creates a callable PyObject, primed with a function (wrapped in ml) and additional data (passed as data; CPython's own signature calls this parameter self).

    The second parameter can be any Python object (or NULL). CPython keeps a reference to it, so it does need to be a valid PyObject*. When Python executes the function packaged inside ml, this will be the first argument. Subsequent arguments depend on ml->ml_flags, as detailed below.

    The first parameter is a PyMethodDef object, which we can use to encapsulate a function.

    struct PyMethodDef {
        const char  *ml_name;   /* The name of the built-in function/method */
        PyCFunction ml_meth;    /* The C function that implements it */
        int         ml_flags;   /* Combination of METH_xxx flags, which mostly
                                   describe the args expected by the C func */
        const char  *ml_doc;    /* The __doc__ attribute, or NULL */
    };
    typedef struct PyMethodDef PyMethodDef;
    

    So, it contains a (specific) function pointer:

    typedef PyObject *(*PyCFunction)(PyObject*, PyObject*);
    

    ... and a flag,

    /* Flag passed to newmethodobject */
    /* #define METH_OLDARGS  0x0000   -- unsupported now */
    #define METH_VARARGS  0x0001
    #define METH_KEYWORDS 0x0002
    /* METH_NOARGS and METH_O must not be combined with the flags above. */
    #define METH_NOARGS   0x0004
    #define METH_O        0x0008
    

    https://docs.python.org/3.4/c-api/structures.html

    We can pass 3 kinds of function to Python in this way:

    PyObject* foo( PyObject* data, PyObject* unused )               // ml_flags = METH_NOARGS (second argument is always NULL)
    PyObject* foo( PyObject* data, PyObject* args )                 // ml_flags = METH_VARARGS
    PyObject* foo( PyObject* data, PyObject* args, PyObject* kwds ) // ml_flags = METH_VARARGS | METH_KEYWORDS
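
The dispatch described above can be modelled in a few lines. This is only a sketch of the idea, not CPython's code: `Obj`, `Fn2`, `Fn3`, `MethodDef`, `CFunction`, `make_cfunction`, `call` and the `k...` flag constants are hypothetical stand-ins. In the real headers the three-parameter type is `PyCFunctionWithKeywords`, and the caller inspects ml_flags to decide whether to cast ml_meth back to it before calling.

```cpp
#include <cassert>

// Hypothetical flag values; they mirror the idea, not the real headers.
constexpr int kVarargs  = 0x1;
constexpr int kKeywords = 0x2;
constexpr int kNoargs   = 0x4;

// Stand-in for PyObject.
struct Obj { int value; };

// Two-parameter "storage" type (like PyCFunction) and the
// three-parameter type the pointer may really have.
using Fn2 = int (*)(Obj*, Obj*);
using Fn3 = int (*)(Obj*, Obj*, Obj*);

struct MethodDef { Fn2 meth; int flags; };

// Callable object: a method definition plus the bound first argument.
struct CFunction { const MethodDef* ml; Obj* self; };

CFunction make_cfunction(const MethodDef* ml, Obj* self) {
    assert(ml != nullptr);
    return CFunction{ml, self};
}

// Chooses the call shape from the flags, casting the pointer back to
// its true three-parameter type when the keyword flags are set.
int call(const CFunction& f, Obj* args, Obj* kwds) {
    if (f.ml->flags == (kVarargs | kKeywords)) {
        return reinterpret_cast<Fn3>(f.ml->meth)(f.self, args, kwds);
    }
    if (f.ml->flags == kNoargs) {
        return f.ml->meth(f.self, nullptr);
    }
    return f.ml->meth(f.self, args);
}

// Example handlers with the two shapes used in the question.
int keyword_handler(Obj* self, Obj* args, Obj* kwds) {
    return self->value + args->value + kwds->value;
}

int varargs_handler(Obj* self, Obj* args) {
    return self->value + args->value;
}
```

Storing a three-parameter function through the two-parameter pointer type is safe in this model only because the pointer is cast back to its original type before the call, and the flags are what tell the caller which type that is.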
    

    EDIT: https://docs.python.org/3/tutorial/classes.html#method-objects

    If you still don’t understand how methods work, a look at the implementation can perhaps clarify matters. When an instance attribute is referenced that isn’t a data attribute, its class is searched. If the name denotes a valid class attribute that is a function object, a method object is created by packing (pointers to) the instance object and the function object just found together in an abstract object: this is the method object. When the method object is called with an argument list, a new argument list is constructed from the instance object and the argument list, and the function object is called with this new argument list.