Tags: python, c++, inheritance, multiprocessing, boost-python

missing arguments in __init__ when using multiprocessing with a Python object that uses a C++ class as base class


I have a class Node in C++ that I exposed to Python using Boost.Python (yes, I know that Node is small, but I translated it because some pretty big classes are derived from it). According to the Boost.Python documentation, I should be able to pickle instances together with their __dict__ if I implement getstate_manages_dict. Here's the minimal code that reproduces the error:

header file:

#include <boost/python.hpp>

class Node{
public:
    Node(boost::python::object position);
    ~Node();
    boost::python::object _position;
};

// pickle support for Node
struct node_pickle_suite : boost::python::pickle_suite{
    static boost::python::tuple getinitargs(Node const& node){
        return boost::python::make_tuple(node._position);
    }
    static boost::python::tuple getstate(boost::python::object obj)
    {
        Node& node = boost::python::extract<Node&>(obj);
        return boost::python::make_tuple(obj.attr("__dict__"));
    }

    static void setstate(boost::python::object obj, boost::python::tuple state)
    {
        Node& node = boost::python::extract<Node&>(obj);
        // fully qualify extract and dict: the header has no using-directive
        boost::python::dict d = boost::python::extract<boost::python::dict>(obj.attr("__dict__"));
        d.update(state[0]);
        
    }
    static bool getstate_manages_dict() { return true; }

};

cpp file:

#include "node.h"
using namespace std;
using namespace boost::python;  // needed for the unqualified class_ and init below

Node::Node(boost::python::object position){
    this->_position = position;
}

Node::~Node(){

}


BOOST_PYTHON_MODULE(Node){
    class_<Node>("Node", init<boost::python::object>())

    .def_readwrite("_position", &Node::_position)

    .def_pickle(node_pickle_suite());
}

In python, I have a class that uses Node as a base class, and a process that needs to put some TestNodes in a queue. While q.put works, q.get does not:

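from collections import deque

from Node import Node  # extension module built from BOOST_PYTHON_MODULE(Node)

class Position:
    def __init__(self, i, j, k):
        self._i = i
        self._j = j
        self._k = k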

class TestNode(Node):
    def __init__(self, position, l, m, n):
        super().__init__(position)
        self.l = l
        self.m = m
        self.n = n
        self.c = {}
        self.d = {}
        self.dq = deque()

from multiprocessing import Process, Queue
def do_stuff(q):
    for i in range(100):
        pos = Position(1,2,3)
        tn = TestNode(pos, 10, "hello", "world")
        q.put(tn)

processes = []
q = Queue()

for i in range(3):
    processes.append(Process(target=do_stuff, args=(q,)))
for process in processes:
    process.start()
# the program crashes when this is called
a = q.get()
for process in processes:
    process.join()

the error message (changed path to avoid revealing personal information):

  File "path/testNode.py", line 34, in <module>
    a = q.get()
  File "path/lib/python3.7/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
TypeError: __init__() missing 3 required positional arguments: 'l', 'm', and 'n'

This did not cause any problem with pure Python code, or when I just put a Node object in the queue; it only happens with classes that inherit from Node. Does anyone know how to fix it? Also, I couldn't find anything specific to this kind of problem, so any additional documentation would be useful too.

Thanks


Solution

  • Figured it out:

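    Inheriting from a Boost.Python class replaces the default pickle behaviour of pure Python classes: the object is rebuilt by calling its class with the tuple returned by __getinitargs__. node_pickle_suite::getinitargs only returns (position,), so unpickling called TestNode.__init__ with a single argument, which explains the three missing arguments. I needed to add the following methods in TestNode, with __getinitargs__ returning every argument that TestNode.__init__ expects:

    __getinitargs__(self), __getstate__(self), __setstate__(self, state)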
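
For anyone who wants to see the three methods written out, here is a minimal sketch. It does not use the real extension module: Node below is a pure-Python stand-in (an assumption for illustration) whose __reduce__ imitates the behaviour described in the Boost.Python pickle documentation, namely rebuilding the object from __getinitargs__ and then restoring state. BrokenNode is a hypothetical name for the subclass without overrides, and a plain tuple replaces the Position class to keep things short.

```python
import pickle
from collections import deque


class Node:
    """Pure-Python stand-in for the Boost.Python Node (assumption, not the real binding)."""

    def __init__(self, position):
        self._position = position

    def __getinitargs__(self):
        # Mirrors node_pickle_suite::getinitargs: only the position.
        return (self._position,)

    def __getstate__(self):
        return (self.__dict__,)

    def __setstate__(self, state):
        self.__dict__.update(state[0])

    def __reduce__(self):
        # On load, pickle calls type(self)(*initargs), then __setstate__(state).
        return (type(self), self.__getinitargs__(), self.__getstate__())


class BrokenNode(Node):
    """Subclass with extra constructor arguments and no pickle overrides."""

    def __init__(self, position, l, m, n):
        super().__init__(position)
        self.l, self.m, self.n = l, m, n


class TestNode(Node):
    def __init__(self, position, l, m, n):
        super().__init__(position)
        self.l = l
        self.m = m
        self.n = n
        self.c = {}
        self.d = {}
        self.dq = deque()

    def __getinitargs__(self):
        # Must match TestNode.__init__, not Node.__init__.
        return (self._position, self.l, self.m, self.n)

    def __getstate__(self):
        return (self.__dict__,)

    def __setstate__(self, state):
        self.__dict__.update(state[0])


# Without the override, loading fails because only `position` is passed.
try:
    pickle.loads(pickle.dumps(BrokenNode((1, 2, 3), 10, "hello", "world")))
    broken_error = None
except TypeError as exc:
    broken_error = exc

# With the override, the round trip succeeds.
restored = pickle.loads(pickle.dumps(TestNode((1, 2, 3), 10, "hello", "world")))
```

In this stand-in, the __getinitargs__ override is what removes the TypeError; __getstate__ and __setstate__ are spelled out to match the list above. multiprocessing.Queue pickles objects on put and unpickles them on get, so a class that survives this round trip should also survive the queue, assuming the real binding behaves like the stand-in.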