
How do I create a keep_alive relationship between an object and the members of an aggregate return value in pybind11?


I'm using pybind11 to generate Python mappings for a C++ library, and I'm using pybind11::keep_alive to manage the lifetime of C++ objects that are referenced by other objects. This works fine when the referrer is a direct return value, but I'm running into trouble when the referrer is part of an aggregate return value.

It's probably easiest to illustrate with a simplified but complete example:

#include <iostream>

#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

namespace py = pybind11;

class Container {
public:
  int some_value;

  Container() {
    std::cerr << "Container constructed\n";
    some_value = 42;
  }

  ~Container() {
    some_value = -1;
    std::cerr << "Container destructed\n";
  }

  // Container is not copyable.
  Container(const Container &) = delete;
  Container &operator=(const Container &) = delete;

  // Iterator references the contents of Container.
  struct Iterator {
    Container *owner;
    int value() const { return owner->some_value; }
  };

  Iterator iterator() { return Iterator{this}; }

  std::pair<Iterator, bool> iterator_pair() { return {Iterator{this}, true}; }
};

PYBIND11_MODULE(example, module) {
  py::class_<Container> owner(module, "Container");

  py::class_<Container::Iterator>(owner, "Iterator")
      .def_property_readonly("value", &Container::Iterator::value);

  owner
      .def(py::init())
      .def("iterator",      &Container::iterator,      py::keep_alive<0, 1>())
      .def("iterator_pair", &Container::iterator_pair, py::keep_alive<0, 1>());
}

In the above example, I have a container object that returns an iterator, which references the container's content, so the iterator must keep the container alive. This works perfectly for the iterator method, where py::keep_alive<0, 1> causes the Iterator object to keep a reference to the Container object that created it.

However, this doesn't work for the iterator_pair method. For example, running this script:

import example

def Test1():
    it = example.Container().iterator()
    print(it.value)  # prints 42

def Test2():
    it, b = example.Container().iterator_pair()
    assert b == True
    print(it.value)

Test1()  # OK
Test2()  # Fails!

Fails with the following output:

Container constructed
42
Container destructed
Container constructed
Container destructed
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    Test2()  # Fails!
    ^^^^^^^
  File "test.py", line 8, in Test2
    it, b = example.Container().iterator_pair()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: cannot create weak reference to 'tuple' object

And I cannot simply remove py::keep_alive<0, 1>() from iterator_pair, because that would cause the Container to be destructed prematurely:

Container constructed
42
Container destructed
Container constructed
Container destructed
269054512

(Note that 269054512 is just random garbage in memory, because the Container has already been deallocated by the time the iterator reads its value.)

What is the recommended way to handle situations like this?

I've also encountered similar problems with returning std::vectors of objects that must keep their parent alive, so I think the more general question is how to return aggregate objects that contain subobjects which must keep their parent alive.

One workaround I can think of is to wrap Container in a std::shared_ptr<> and have the Iterator hold one too. That way, the C++ Container object stays alive regardless of whether it is still referenced in Python. However, I'm not eager to do this, because it requires a lot of refactoring on the C++ side, and it feels like I'm duplicating the reference counting support that Python already has.


Solution

  • You can define your own call policy that applies keep_alive to the first element of the returned tuple instead of to the tuple itself:

    // Tag type passed as an extra argument to .def(), like py::keep_alive.
    struct keep_alive_container {};

    namespace pybind11::detail {

        template <>
        struct process_attribute<keep_alive_container>
            : public process_attribute_default<keep_alive_container> {
            static void precall(function_call&) {}
            // After the call, tie element 0 of the returned tuple (the
            // Iterator) to argument 0 (self, the Container).
            static void postcall(function_call& call, handle ret) {
                keep_alive_impl(ret.attr("__getitem__")(0), call.args[0]);
            }
        };
    }

    // later on
    .def("iterator_pair", &Container::iterator_pair, keep_alive_container())


    This requires calling keep_alive_impl, which lives in the pybind11::detail namespace and is not part of the public API, so it may change between pybind11 versions and might not be recommended.
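
For context on why the stock py::keep_alive<0, 1> raises the TypeError in the question: my understanding of pybind11's implementation is that when the nurse (here the return value) is not an instance of a pybind11-registered class, keep_alive falls back to attaching a weak reference with a cleanup callback to it. Python tuples do not support weak references, and that part is easy to check in plain Python. The sketch below is only an illustration; Plain and can_weakref are hypothetical names and not part of pybind11.

```python
import weakref


class Plain:
    """Ordinary user-defined class; instances support weak references."""


def can_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


print(can_weakref(Plain()))    # instance of a user-defined class
print(can_weakref((1, True)))  # tuple, which is what std::pair converts to
print(can_weakref([1, 2]))     # list, which is what std::vector converts to
```

This prints True, False, False, which matches the error message in the question and would also explain the similar trouble with std::vector return values. It is the reason the policy above attaches the keep-alive to the element rather than to the aggregate.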