Returning and passing around raw POD pointers (arrays) with Python, C++, and pybind11

I have a C++ function which returns a raw float pointer, and another C++ function which accepts a raw float pointer as an argument. Something like:

float* ptr = something;
float* get_ptr(void) { return ptr; }
void use_ptr(float* ptr) { do_work(ptr); }

I want to be able to pass around pointers using Python. Something like this:

import my_native_functions as native
ptr = native.get_ptr()
native.use_ptr(ptr)

I am using pybind11 to create my native Python module, but I don't know how to create the bindings for the get_ptr() function. If I just do the following:

PYBIND11_MODULE(my_native_functions, m)
{
    m.def("get_ptr", &get_ptr);
    m.def("use_ptr", &use_ptr);
}

the get_ptr() function returns a Python float object. I guess this makes sense because there are no pointer types in Python. However, because this is now a plain float, when I call the use_ptr() function and iterate over the pointer in C/C++, only the first element of the array is correct. The rest are garbage. To fix this, I have to cast my pointer to/from std::size_t in C++. With that cast in place, everything works just fine.
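For reference, the integer-cast workaround described above might look roughly like the following plain C++ sketch (the pybind11 binding code is left out). As far as I can tell, pybind11 treats a float* as a pointer to a single float and converts the pointed-to value, which would explain why only the first element survives. The names ptr_to_handle, handle_to_ptr, get_ptr_handle, element_at and data are hypothetical, and std::uintptr_t is used here instead of std::size_t on the assumption that the platform provides it, since it is the integer type intended for pointer round trips.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers, not part of pybind11.
// Encode a raw pointer as an integer that Python can carry around as an int.
inline std::uintptr_t ptr_to_handle(float* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Decode the integer back into the original pointer on the C++ side.
inline float* handle_to_ptr(std::uintptr_t h) {
    return reinterpret_cast<float*>(h);
}

// Stand-in for the array owned by the native library (host memory here).
static float data[3] = {3.14f, 2.18f, -1.0f};

// Variant of get_ptr() that hands out an integer handle instead of a float*.
std::uintptr_t get_ptr_handle() { return ptr_to_handle(data); }

// Variant of use_ptr() that receives the handle and reads one element.
// Nothing here checks that the handle is still valid or that i is in range.
float element_at(std::uintptr_t h, std::size_t i) {
    assert(h != 0);  // a zero handle would mean a null pointer
    return handle_to_ptr(h)[i];
}
```

Python never interprets the integer; it only carries it from one native call to the next, which is exactly why this approach is not type-safe.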

However, I would like to ask: Is there a "right way" of achieving the above without the casting to/from std::size_t with pybind11?

In case you are curious why I do that: I do understand that what I am doing is not type-safe. Also, I never touch the pointer/integer on the Python side. I just retrieve it from one native module and pass it to another. Also, I cannot cast the pointer to some kind of numpy view because the pointer is not always on the CPU. Sometimes I want to pass around CUDA pointers. Creating a py::array_t from a CUDA pointer is not possible unless I copy the data (and I do not want to do that).

Thank you.

Solution

Wrap the raw pointer in a custom "smart" pointer class (really only pretending to be smart), as described in the pybind11 documentation on custom smart pointers. While you're at it, you can add extra information to this class, such as the size of an array element and the number of elements. This would make it a generalised array descriptor on the C++ side (but not on the Python side, because you are not exposing the raw pointer to Python).
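A minimal sketch of what such a descriptor could look like in plain C++, without the pybind11 registration. The names array_descriptor, size_bytes, at and describe are hypothetical, chosen only for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical descriptor: a non-owning pointer plus the metadata
// mentioned above (element size and element count).
template <class T>
struct array_descriptor {
    T* ptr = nullptr;                      // may refer to host or device memory
    std::size_t count = 0;                 // number of elements
    std::size_t element_size = sizeof(T);  // size of one element in bytes

    // Total size of the described buffer in bytes.
    std::size_t size_bytes() const { return count * element_size; }

    // Host-side element access. This must not be used when ptr refers to
    // device (e.g. CUDA) memory, because the CPU cannot dereference it.
    T& at(std::size_t idx) const {
        assert(idx < count);
        return ptr[idx];
    }
};

// Convenience helper that builds a descriptor from a fixed-size host array.
template <class T, std::size_t N>
array_descriptor<T> describe(T (&arr)[N]) {
    array_descriptor<T> d;
    d.ptr = arr;
    d.count = N;
    return d;
}
```

Because the descriptor does not own the memory, whoever allocated the buffer remains responsible for freeing it and for keeping it alive while descriptors are in circulation.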

For a simpler option, just wrap your pointer in any old class in order to hide it from Python. No need to expose it to Python as a custom smart pointer. Here's an example that does just that:

#include <pybind11/pybind11.h>
#include <cstddef>   // std::size_t
#include <iostream>

namespace py = pybind11;

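template <class T> class ptr_wrapper
{
    public:
        ptr_wrapper() : ptr(nullptr) {}
        // Implicit on purpose, so a raw T* converts to the wrapper on return.
        ptr_wrapper(T* ptr) : ptr(ptr) {}
        ptr_wrapper(const ptr_wrapper& other) : ptr(other.ptr) {}
        T& operator* () const { return *ptr; }
        T* operator->() const { return  ptr; }
        T* get() const { return ptr; }
        // Only valid for a single object allocated with new; never call it
        // on the static array used below.
        void destroy() { delete ptr; }
        T& operator[](std::size_t idx) const { return ptr[idx]; }
    private:
        T* ptr;   // non-owning; copies of the wrapper share the same address
};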

float array[3] = { 3.14f, 2.18f, -1.0f };

ptr_wrapper<float> get_ptr(void) { return array; }
void use_ptr(ptr_wrapper<float> ptr) {
    for (int i = 0; i < 3; ++i)
        std::cout << ptr[i] << " ";
    std::cout << "\n";
}

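PYBIND11_MODULE(Ptr, m)
{
    // Registered without constructors or methods, so Python only sees an
    // opaque "pfloat" object that it can pass back to C++.
    py::class_<ptr_wrapper<float>>(m, "pfloat");
    m.def("get_ptr", &get_ptr);
    m.def("use_ptr", &use_ptr);
}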