I need to speed up some algorithms that work on NumPy arrays. They will use std::vector and some of the more advanced STL data structures.
I've narrowed my choices down to Cython (which now wraps most STL containers) and Boost.Python (which now has built-in support for NumPy).
I know from my experience as a programmer that sometimes it takes months of working with a framework to uncover its hidden issues (because they are rarely used as talking points by its disciples), so your help could potentially save me a lot of time.
What are the relative advantages and disadvantages of extending NumPy in Cython vs Boost.Python?
This is a very incomplete answer that only really covers a couple of small parts of it (I'll edit it if I think of anything more):
Boost.Python doesn't appear to implement operator[] specifically for NumPy arrays. This means that operator[] comes from the base object class (which ndarray inherits from), so each call goes through the Python mechanisms to __getitem__ and indexing will be slow (close to Python speed). If you want fast indexing you'll have to do the pointer arithmetic yourself:
// rough gist - untested:
// i, j, k are your indices
// in reality you'd check the dtype first - the data may not be double...
char* data = array.get_data();  // raw buffer as bytes
// strides are in bytes, so compute the offset on the char* and only then cast
double data_element = *reinterpret_cast<double*>(
    data + array.strides(0)*i + array.strides(1)*j + array.strides(2)*k);
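If you want to check that kind of stride arithmetic without Boost in the picture, here's a minimal standalone sketch. The helper name element_at is made up for illustration (it is not part of Boost.Python), and it assumes a 3-D array of doubles whose strides are given in bytes, which is how NumPy reports them:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a Boost.Python function): read element (i, j, k)
// of a 3-D array of doubles, given the base pointer as bytes and the
// per-axis strides in BYTES.
inline double element_at(const char* data, const std::ptrdiff_t strides[3],
                         std::ptrdiff_t i, std::ptrdiff_t j, std::ptrdiff_t k) {
    assert(data != nullptr);
    // Do the offset arithmetic in bytes first, then reinterpret as double.
    const char* p = data + strides[0] * i + strides[1] * j + strides[2] * k;
    return *reinterpret_cast<const double*>(p);
}
```

Because the offset is computed from the strides rather than from the shape, the same helper should also work on non-contiguous views (a transposed array just has its strides permuted).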
In contrast Cython has efficient indexing of numpy arrays built in automatically.
Cython isn't great at things like std::vector (although it isn't absolutely terrible - you can usually trick it into doing what you want). One notable limitation is that cdef declarations are function-scoped (they can't go inside loops or if blocks), so C++ objects will be default constructed at the start of the function and then assigned to or manipulated later (which can be somewhat inefficient). For anything beyond simple uses you do not want to be manipulating C++ types in Cython (instead it's better to write the code in C++ and then call it from Cython).
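As a rough illustration of the "write it in C++, call it from Cython" approach, the C++ side can keep std::vector as an internal detail and expose only a pointer and a length. The function median_of below is a made-up example, not something from either library, and the Cython declaration that would call it is not shown:

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical function: median of n doubles (assumed NaN-free).
// The interface is a raw pointer plus a length, so the caller never has to
// construct or manipulate a std::vector itself.
inline double median_of(const double* data, std::size_t n) {
    if (n == 0) return std::numeric_limits<double>::quiet_NaN();
    std::vector<double> work(data, data + n);  // copy: caller's buffer stays untouched
    auto mid = work.begin() + static_cast<std::ptrdiff_t>(n / 2);
    std::nth_element(work.begin(), mid, work.end());
    if (n % 2 == 1) return *mid;
    // Even n: the lower middle value is the largest element left of mid.
    const double lower = *std::max_element(work.begin(), mid);
    return 0.5 * (lower + *mid);
}
```

With this split, all the STL work happens in ordinary C++ where construction and scoping behave as you'd expect, and the wrapper layer only passes the array's data pointer and size across.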
A second limitation is that it struggles with non-type template parameters. One common example is std::array, which is templated on a number (its size). Depending on your planned code this may or may not be an issue.
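One way around this (a sketch of the general idea, not something I've tested against a particular Cython version) is to fix the size on the C++ side and expose only non-template names. Vec3, make_vec3 and dot3 are made-up names for illustration:

```cpp
#include <array>
#include <cstddef>

// Fix the size once in C++ so that callers never see the numeric
// template argument.
using Vec3 = std::array<double, 3>;

// Hypothetical helpers with plain, non-template signatures.
inline Vec3 make_vec3(double x, double y, double z) {
    return Vec3{{x, y, z}};
}

inline double dot3(const Vec3& a, const Vec3& b) {
    double sum = 0.0;
    for (std::size_t n = 0; n < a.size(); ++n) sum += a[n] * b[n];
    return sum;
}
```

The wrapper layer then only has to deal with one concrete type and a few ordinary functions, at the cost of needing one alias per size you actually use.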