I am trying to use OpenMP on a list of Python objects from C++ via pybind11. I convert the list into a std::vector of Python objects (as explained in this post) and then access them in a parallelized for loop. However, when I access an attribute of any Python object in the vector inside the loop, I get this error:
Fatal Python error: deletion of interned string failed
Thread 0x00007fd282bc7700 (most recent call first):
Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
My questions are: What does the "deletion of interned string failed" error mean, and how can I avoid it with OpenMP?
I have read here that the problem is related to copying the string, so I tried referring to the string through a pointer, but that didn't help. The problem also doesn't seem to come from a conversion issue in pybind11, because if I remove the #pragma omp directive, the code works perfectly.
C++ Code
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <pybind11/stl.h>
#include <omp.h>
#include <chrono>
#include <thread>

namespace py = pybind11;

py::object create_seq(
    py::object self
){
    std::vector<py::object> dict = self.cast<std::vector<py::object>>();
    #pragma omp parallel for
    for(unsigned int i=0; i<dict.size(); i++) {
        dict[i].attr("attribute") = 2;
    }
    return self;
}

PYBIND11_MODULE(error, m){
    m.doc() = "pybind11 module for iterating over generations";
    m.def("create_seq", &create_seq,
          "the function which creates a sequence");
}
Python Code
import error

class test():
    def __init__(self):
        self.attribute = None

if __name__ == '__main__':
    dict = {}
    for i in range(50):
        dict[i] = test()
    pop = error.create_seq(list(dict.values()))
Compiled with:
g++ -O3 -Wall -shared -std=c++14 -fopenmp -fPIC `python3 -m pybind11 --includes` openmp.cpp -o error.so
I was able to find a workaround, but I think I am effectively doing single-threaded work with multiple threads. I used #pragma omp ordered as follows:
std::vector<py::object> dict = self.cast<std::vector<py::object>>();
#pragma omp parallel for ordered schedule(dynamic)
for(unsigned int i=0; i<dict.size(); i++) {
    py::object genome = dict[i];
    std::cout << i << std::endl;
    #pragma omp ordered
    genome.attr("fitness")=2;
}
This works.
EDIT
I measured the execution time with and without parallelization, and it is the same.
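The identical timings are what I would expect if the ordered block serializes all of the real work. As far as I understand, CPython objects may only be touched by a thread that holds the GIL, so the attribute assignment cannot safely run from several OpenMP threads at once. A direction that might help, assuming the expensive part of the loop can be written without touching Python objects, is to split the loop into three phases: read inputs serially, compute in parallel on plain C++ data, and write results back serially. The sketch below only illustrates that structure: `Genome`, `compute_fitness` and `evaluate` are hypothetical stand-ins and not part of my module, and `Genome` replaces `py::object` so that the example compiles without pybind11.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Python object with a "fitness" attribute.
// In the real module this would be a py::object, which should only be
// touched by a thread that holds the GIL.
struct Genome {
    double input = 0.0;
    double fitness = 0.0;
};

// Hypothetical pure C++ workload. It touches no Python state, so it can
// run from several OpenMP threads at the same time.
double compute_fitness(double x) {
    return 2.0 * x + 1.0;
}

void evaluate(std::vector<Genome>& population) {
    const std::size_t n = population.size();

    // Phase 1 (serial): copy the inputs out into plain C++ data.
    // With pybind11 this is where attributes would be read and cast.
    std::vector<double> inputs(n);
    for (std::size_t i = 0; i < n; ++i) {
        inputs[i] = population[i].input;
    }

    // Phase 2 (parallel): work only on plain C++ data.
    std::vector<double> results(n);
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(n); ++i) {
        const std::size_t k = static_cast<std::size_t>(i);
        results[k] = compute_fitness(inputs[k]);
    }

    // Phase 3 (serial): write the results back. With pybind11 this is
    // where an assignment like dict[i].attr("fitness") = ... would go.
    assert(results.size() == population.size());
    for (std::size_t i = 0; i < n; ++i) {
        population[i].fitness = results[i];
    }
}
```

If the per-object work is nothing more than the attribute assignment itself, as in my minimal example, there is nothing left for phase 2 to do, so I would not expect any speedup from OpenMP in that case.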