Search code examples
pythonc++swig

Adding swig pythoncode to set thisown flag on Python object


I have a swigged C++ class container, MyContainer, holding objects of type MyObject, also a C++ class.

The following is the C++ header code (freemenot.h)

#ifndef freemenotH
#define freemenotH
#include <vector>
#include <string>

using std::string;
class MyObject
{
    public:
                        MyObject(const string& lbl);
                        ~MyObject();
        string          getLabel();

    private:
        string          label;
};

class MyContainer
{
    public:
                    MyContainer();
                    ~MyContainer();
        void        addObject(MyObject* o);
        MyObject*   getObject(unsigned int t);
        int         getNrOfObjects();

    private:
        std::vector<MyObject*>   mObjects;
};

#endif

and this is the source (freemenot.cpp)

#include "freemenot.h"
#include <iostream>
using namespace std;

/* MyObject source */
MyObject::MyObject(const string& lbl)
:
label(lbl)
{ cout<<"In object ctor"<<endl; }

MyObject::~MyObject() { cout<<"In object dtor"<<endl; }
string MyObject::getLabel() { return label; }


/* MyContainer source */
MyContainer::MyContainer() { cout<<"In container ctor"<<endl; }

MyContainer::~MyContainer()
{
    cout<<"In container dtor"<<endl;
    for(unsigned int i = 0; i < mObjects.size(); i++)
    {
        delete mObjects[i];
    }
}

int MyContainer::getNrOfObjects() { return mObjects.size(); }
void MyContainer::addObject(MyObject* o) { mObjects.push_back(o); }
MyObject* MyContainer::getObject(unsigned int i) { return mObjects[i]; }

Observe that the objects are stored as RAW POINTERS in the vector. The class is such designed, and the container is thus responsible to free the objects in its destructor, as being done in the destructors for loop.

In C++ code, like below, an object o1 is added to the container c, which is returned to client code

MyContainer* getAContainerWithSomeObjects()
{
  MyContainer* c = new MyContainer();
  MyObject* o1 = new MyObject();
  c.add(o1);
  return c;
}

The returned container owns its objects, and are responsible to de-allocate these objects when done. In C++, access to the containers objects is fine after the function exits above.

Exposing the above classes to python, using Swig, will need an interface file. This interface file looks like this

%module freemenot
%{    #include "freemenot.h"    %}
%include "std_string.i"
//Expose to Python
%include "freemenot.h"

And to generate a Python module, using CMake, the following CMake script was used.

cmake_minimum_required(VERSION 2.8)
project(freemenot)

find_package(SWIG REQUIRED)
include(UseSWIG)
find_package(PythonInterp)
find_package(PythonLibs)
get_filename_component(PYTHON_LIB_FOLDER ${PYTHON_LIBRARIES} DIRECTORY CACHE)
message("Python lib folder: " ${PYTHON_LIB_FOLDER})
message("Python include folder: " ${PYTHON_INCLUDE_DIRS})
message("Python libraries: " ${PYTHON_LIBRARIES})

set(PyModule "freemenot")
include_directories(
    ${PYTHON_INCLUDE_PATH}
    ${CMAKE_CURRENT_SOURCE_DIR}
)

link_directories( ${PYTHON_LIB_FOLDER})

set(CMAKE_MODULE_LINKER_FLAGS ${CMAKE_CURRENT_SOURCE_DIR}/${PyModule}.def)

set_source_files_properties(${PyModule}.i PROPERTIES CPLUSPLUS ON)
set_source_files_properties(${PyModule}.i PROPERTIES SWIG_FLAGS "-threads")

SWIG_ADD_LIBRARY(${PyModule}
    MODULE LANGUAGE python
    SOURCES ${PyModule}.i freemenot.cpp)

SWIG_LINK_LIBRARIES (${PyModule} ${PYTHON_LIB_FOLDER}/Python37_CG.lib    )

# INSTALL PYTHON BINDINGS
# Get the python site packages directory by invoking python
execute_process(COMMAND python -c "import site; print(site.getsitepackages()[0])" OUTPUT_VARIABLE PYTHON_SITE_PACKAGES OUTPUT_STRIP_TRAILING_WHITESPACE)
message("PYTHON_SITE_PACKAGES = ${PYTHON_SITE_PACKAGES}")

install(
    TARGETS _${PyModule}
    DESTINATION ${PYTHON_SITE_PACKAGES})

install(
    FILES         ${CMAKE_CURRENT_BINARY_DIR}/${PyModule}.py
    DESTINATION   ${PYTHON_SITE_PACKAGES}
)

Generating the make files using CMake, and compiling using borlands bcc32 compiler, a Python module (freemenot) is generated and installed into a python3 valid sitepackages folder.

Then, in Python, the following script can be used to illuminate the problem

import freemenot as fmn

def getContainer():
   c = fmn.MyContainer()
   o1 = fmn.MyObject("This is a label")
   o1.thisown = 0
   c.addObject(o1)
   return c

c = getContainer()
print (c.getNrOfObjects())

#if the thisown flag for objects in the getContainer function
#is equal to 1, the following call return an undefined object
#If the flag is equal to 0, the following call will return a valid object
a = c.getObject(0)
print (a.getLabel())

This Python code may look fine, but don't work as expected. Problem is that, when the function getContainer() returns, the memory for object o1 is freed, if the thisown flag is not set to zero. Accessing the object after this line, using the returned container will end up in disaster. Observe, there is not nothing wrong with this per se, as this is how pythons garbage collection works.

For the above use case being able to set the python objects thisown flag inside the addObject function, would render the C++ objects usable in Python. Having the user to set this flag is no good solution. One could also extend the python class with an "addObject" function, and modifying the thisown flag inside this function, and thereby hiding this memory trick from the user.

Question is, how to get Swig to do this, without extending the class? I'm looking for using a typemap, or perhaps %pythoncode, but I seem not able to find a good working example.

The above code is to be used by, and passed to, a C++ program that is invoking the Python interpreter. The C++ program is responsible to manage the memory allocated in the python function, even after PyFinalize().

The above code can be downloaded from github https://github.com/TotteKarlsson/miniprojects


Solution

  • There are a number of different ways you could solve this problem, so I'll try and explain them each in turn, building on a few things along the way. Hopefully this is useful as a view into the options and innards of SWIG even if you only really need the first example.

    Add Python code to modify thisown directly

    The solution most like what you proposed relies on using SWIG's %pythonprepend directive to add some extra Python code. You can target it based on the C++ declaration of the overload you care about, e.g.:

    %module freemenot
    %{    #include "freemenot.h"    %}
    %include "std_string.i"
    
    %pythonprepend MyContainer::addObject(MyObject*) %{
    # mess with thisown
    print('thisown was: %d' % args[0].thisown)
    args[0].thisown = 0
    %}
    
    //Expose to Python
    %include "freemenot.h"
    

    Where the only notable quirk comes from the fact that the arguments are passed in using *args instead of named arguments, so we have to access it via position number.

    There are several other places/methods to inject extra Python code (provided you're not using -builtin) in the SWIG Python documentation and monkey patching is always an option too.

    Use Python's C API to tweak thisown

    The next possible option here is to use a typemap calls the Python C API to perform the equivalent functionality. In this instance I've matched on the argument type and argument name, but that does mean the typemap here would get applied to all functions which receive a MyObject * named o. (The easiest solution here is to make the names describe the intended semantics in the headers if that would over-match currently which has the side benefit of making IDEs and documentation clearer).

    %module freemenot
    %{    #include "freemenot.h"    %}
    %include "std_string.i"
    
    %typemap(in) MyObject *o {
        PyObject_SetAttrString($input, "thisown", PyInt_FromLong(0)); // As above, but C API
        $typemap(in,MyObject*); // use the default typemap
    }
    
    //Expose to Python
    %include "freemenot.h"
    

    The most noteworthy point about this example other than the typemap matching is the use of $typemap here to 'paste' another typemap, specifically the default one for MyObject* into our own typemap. It's worth having a look inside the generated wrapper file at a before/after example of what this ends up looking like.

    Use SWIG runtime to get at SwigPyObject struct's own member directly

    Since we're already writing C++ instead of going via the setattr in the Python code we can adapt this typemap to use more of SWIG's internals and skip a round-trip from C to Python and back to C again.

    Internally within SWIG there's a struct that contains the details of each instance, including the ownership, type etc.

    We could just cast from PyObject* to SwigPyObject* ourselves directly, but that would require writing error handling/type checking (is this PyObject even a SWIG one?) ourselves and become dependent on the details of the various differing ways SWIG can produce Python interfaces. Instead there's a single function we can call which just handles all that for us, so we can write our typemap like this now:

    %module freemenot
    %{    #include "freemenot.h"    %}
    %include "std_string.i"
    
    %typemap(in) MyObject *o {
        // TODO: handle NULL pointer still
        SWIG_Python_GetSwigThis($input)->own = 0; // Safely cast $input from PyObject* to SwigPyObject*
        $typemap(in,MyObject*); // use the default typemap
    }
    
    //Expose to Python
    %include "freemenot.h"
    

    This is just an evolution of the previous answer really, but implemented purely in the SWIG C runtime.

    Copy construct a new instance before adding

    There are other ways to approach this kind of ownership problem. Firstly in this specific instance your MyContainer assumes it can always call delete on every instance it stores (and hence owns in these semantics).

    The motivating example for this would be if we were also wrapping a function like this:

    MyObject *getInstanceOfThing() {
        static MyObject a;
        return &a;
    }
    

    Which introduces a problem with our prior solutions - we set thisown to 0, but here it would already have been 0 and so we still can't legally call delete on the pointer when the container is released.

    There's a simple way to deal with this that doesn't require knowing about SWIG proxy internals - assuming MyObject is copy constructable then you can simply make a new instance and be sure that no matter where it came from it's going to be legal for the container to delete it. We can do that by adapting our typemap a little:

    %module freemenot
    %{    #include "freemenot.h"    %}
    %include "std_string.i"
    
    %typemap(in) MyObject *o {
        $typemap(in,MyObject*); // use the default typemap as before
        $1 = new $*1_type(*$1); // but afterwards call copy-ctor
    }
    
    //Expose to Python
    %include "freemenot.h"
    

    The point to note here is the use of several more SWIG features that let us know the type of the typemap inputs - $*1_type is the type of the typemap argument dereferenced once. We could have just written MyObject here, as that's what it resolves to but this lets you handle things like templates if your container is really a template, or re-use the typemap in other similar containers with %apply.

    The thing to watch for here now is leaks if you had a C++ function that you were deliberately allowing to return an instance without thisown being set on the assumption that the container would take ownership that wouldn't now hold.

    Give the container a chance to manage ownership

    Finally one of the other techniques I like using a lot isn't directly possible here as currently posed, but is worth mentioning for posterity. If you get the chance to store some additional data along side each instance in the container you can call Py_INCREF and retain a reference to the underlying PyObject* no matter where it came from. Provided you then get a callback at destruction time you can also call Py_DECREF and force the Python runtime to keep the object alive as long as the container.

    You can also do that even when it's not possible to keep a 1-1 MyObject*/PyObject* pairing alive, by keeping a shadow container alive somewhere also. That can be hard to do unless you're willing to add another object into the container, subclass it or can be very certain that the initial Python instance of the container will always live long enough.