Python C extension - maintaining state

I need to write a Python extension in C that I will use to:

1. perform CPU-intensive initialization on a file;
2. make multiple function calls that rely on the initialized data to return results to me; and
3. free memory when I'm done

One solution is to implement a "state holder" class in Python. The C initialization call returns the initialized data, which I store in my Python state object; then every time I need to do step (2), I pass that data back to the C function. But this seems very inefficient, with all the data shuttling and interfacing between the Python side and the C side.

If possible, I would like to maintain the state using a state object on the C side. The initialization call from the Python side would return not all the initialized data, but just an ID so it can reference the C state object when it needs to during subsequent calls.

How would I go about maintaining state on the C side?

Solution

First, I'll answer the question you actually asked.

Create a struct State in C, just as you would if Python weren't involved.

If you're not going to be copying these around (you only pass them by struct State *), then you can just do (intptr_t)theStatePtr to get an id for Python. Of course you do need to be careful that the lifetime of the Python object never extends past the lifetime of the C object, but that's doable.
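Here is a minimal sketch of that pointer-as-id approach in plain C, without any Python glue. The contents of struct State and the function names (state_init, state_query, state_free) are made up for illustration, and the loop that fills the table is only a stand-in for your real initialization.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical state: a table of precomputed values standing in for the
   result of the expensive initialization. */
struct State {
    size_t count;
    double *values;
};

/* Build the state once and return its address as an integer id.
   Returns 0 if allocation fails. */
intptr_t state_init(size_t count)
{
    struct State *s = malloc(sizeof *s);
    if (s == NULL)
        return 0;
    /* Allocate at least one element so that count == 0 is not mistaken
       for an allocation failure. */
    s->values = malloc((count > 0 ? count : 1) * sizeof *s->values);
    if (s->values == NULL) {
        free(s);
        return 0;
    }
    s->count = count;
    for (size_t i = 0; i < count; i++)
        s->values[i] = (double)i * 0.5;   /* placeholder for real work */
    return (intptr_t)s;
}

/* Read one value through the id. The id is cast straight back to a
   pointer, so passing an id that was already freed is undefined behavior. */
int state_query(intptr_t id, size_t index, double *out)
{
    struct State *s = (struct State *)id;
    if (s == NULL || index >= s->count)
        return -1;
    *out = s->values[index];
    return 0;
}

/* Release everything owned by the state. Safe to call with id 0. */
void state_free(intptr_t id)
{
    struct State *s = (struct State *)id;
    if (s != NULL) {
        free(s->values);
        free(s);
    }
}
```

The Python side would hold only the integer and hand it back on every call. Nothing here can detect a stale or forged id, which is why the lifetime caveat above matters.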

If you do need to copy/move the struct around for some reason, or you need more help managing state (e.g., treating the Python ids as weak references), pick the appropriate collection (hash table, tree, array, etc.) for your use case, then pass the key to Python as an id.
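As a rough sketch of the collection-based variant, assuming a small fixed-size array is enough: each slot carries a generation counter, so an id that has been released stops resolving even after its slot is reused, which gives you the weak-reference behavior. The names (registry_add, registry_get, registry_release), the capacity, and the id encoding are all invented for this example, and it is not thread-safe as written.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_STATES 64u   /* arbitrary capacity for this sketch */

struct State {
    int value;               /* placeholder for the real initialized data */
};

struct Slot {
    struct State *state;     /* NULL while the slot is free */
    uint32_t generation;     /* incremented every time the slot is released */
};

static struct Slot slots[MAX_STATES];

/* Take ownership of a heap-allocated state and return a nonzero id,
   or 0 if the table is full. The id encodes generation and slot index. */
uint64_t registry_add(struct State *state)
{
    for (uint32_t i = 0; i < MAX_STATES; i++) {
        if (slots[i].state == NULL) {
            slots[i].state = state;
            return (uint64_t)slots[i].generation * MAX_STATES + i + 1;
        }
    }
    return 0;
}

/* Resolve an id. Returns NULL for 0, for released ids, and for ids whose
   slot has since been reused by a newer state. */
struct State *registry_get(uint64_t id)
{
    if (id == 0)
        return NULL;
    uint32_t index = (uint32_t)((id - 1) % MAX_STATES);
    uint64_t generation = (id - 1) / MAX_STATES;
    if (slots[index].state == NULL || slots[index].generation != generation)
        return NULL;
    return slots[index].state;
}

/* Free the state behind an id. Returns 0 on success, -1 if the id is
   invalid or stale. */
int registry_release(uint64_t id)
{
    struct State *state = registry_get(id);
    if (state == NULL)
        return -1;
    uint32_t index = (uint32_t)((id - 1) % MAX_STATES);
    free(state);
    slots[index].state = NULL;
    slots[index].generation++;
    return 0;
}
```

Compared with the raw pointer cast, a bad id from Python now produces a NULL you can turn into an exception instead of a crash, at the cost of one lookup per call.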

However, I think you may be optimizing the wrong part here. Passing the object back and forth is nothing—it's just a pointer copy. Refcounting may be an issue, but it rarely is, and the benefits you get from lifecycle management are usually worth it. The part that may kill performance is your C code continually converting a bunch of Python integers to C ints, etc. If this is your problem, just create a C struct with the C state, and wrap it in a Python object that doesn't expose any of the internals into Python.

Finally, do you actually need any optimization here at all? If you're doing CPU-intensive work, I'll bet the real work so completely overshadows the cost of the Python object access that the latter won't even show up in profiling. If you haven't profiled yet, that's absolutely positively the first thing you should do, because the right answer here may well be "don't bother doing anything".

Taking that a step further: if you're only writing this code in C for optimization, are you sure you even need that? Dealing with memory management in C is annoying and error-prone, and dealing with it in a C extension module for Python is even more so. Doing it for the first time, when you don't already know how it works, is almost a guaranteed recipe for spending all your time chasing down segfaults and leaks rather than writing your actual code. So, I would try the following in order, profiling each and only moving down the list if it's too slow:

1. Just write the algorithm in Python, and use your existing CPython interpreter.
2. Make sure you've got an optimal algorithm.
3. Try PyPy instead of CPython.
4. Get Cython and try compiling your Python code with as few changes as possible.
5. Modify your code to take advantage of Cython features like static types, direct calls to C functions, etc., as appropriate.
6. Write the lower-level code in C, and the mid-level code (the stuff that tracks your state objects and presents a wrapper to Python) either in Cython or in Python with ctypes.
7. Write the whole lower and mid level in C, using your favorite interface mechanism, which is still probably not the native C API unless you've got a lot of experience and are doing something pretty simple.