
How can I access a custom Python type from C without importing its Python module every time a C function is called?


I am writing functions for a Python C extension module and need to import a module I wrote in pure Python to get access to a custom Python type. Currently I call PyImport_ImportModule() in the body of my C function, then PyObject_GetAttrString() on the module to get the type. This runs every time the C function is called, which seems inefficient and may not be best practice. I'm looking for a way to keep the custom type available as a PyObject* or PyTypeObject* in my source code, both for efficiency and because I may need the type in more than one C function.

Right now the function looks something like this:

static PyObject* foo(PyObject* self, PyObject* args) 
{
    PyObject* myPythonModule = PyImport_ImportModule("my.python.module");
    if (!myPythonModule)
        return NULL;

    PyObject* myPythonType = PyObject_GetAttrString(myPythonModule, "MyPythonType");
    if (!myPythonType) {
        Py_DECREF(myPythonModule);
        return NULL;
    }

    /* more code to create and return a MyPythonType instance */
}

To avoid retrieving myPythonType on every function call, I tried adding a global variable at the top of my C file to hold the object

static PyObject* myPythonType;

and initialized it in the module init function similar to the old function body

PyMODINIT_FUNC
PyInit_mymodule(void)
{
    /* more initializing here */

    PyObject* myPythonModule = PyImport_ImportModule("my.python.module");
    if (!myPythonModule) {
        /* clean-up code here */
        return NULL;
    }

    // set the static global variable here
    myPythonType = PyObject_GetAttrString(myPythonModule, "MyPythonType");
    Py_DECREF(myPythonModule);
    if (!myPythonType) {
        /* clean-up code here */
        return NULL;
    }

    /* finish initializing module */
}

This worked, but I am unsure how to Py_DECREF the global variable once the module is no longer in use. Is there a way to do that, or a better way to solve the whole problem that I am overlooking?


Solution

  • First, calling import each time probably isn't as bad as you think. Python caches imported modules in sys.modules, so every call after the first returns the cached module instead of executing it again, and the cost is much lower. So this might be an acceptable solution.

    Second, the global variable approach should work, but you're right that it doesn't get cleaned up. This is rarely a problem because modules are rarely unloaded (and most extension modules don't really support it), but it isn't great. It also won't work with isolated sub-interpreters (not much of a concern now, but they may become more popular in future).

    The most robust way to do it needs multi-phase initialization of your module. To quickly summarise what you should do:

    • You should define a module state struct containing this type of information.
    • Your module definition (PyModuleDef) should set m_size to the size of the module state struct.
    • You need to initialize this struct within the Py_mod_exec slot.
    • You need to create an m_free function (and ideally the other GC functions, m_traverse and m_clear) to correctly decref your state during de-initialization.
    • Within a module-level function, self will be your module object, so you can get the state with PyModule_GetState(self).