I posted here not too long ago about a model I am trying to build using pycuda which solves about 9000 coupled ODEs. My model is too slow, however, and an SO member suggested that memory transfers between the host and the GPU are probably the culprit.
Right now CUDA is being used only to calculate the rate of change of each of the 9000 species I am dealing with. Since I pass an array from the host to the GPU to perform this calculation, and then return an array from the GPU to integrate on the host, I can see how this would slow things down.
Would boost be the solution to my problem? From what I read, boost allows interoperability between C++ and Python. It also includes the C++ library odeint which, from what I read, can be combined with Thrust to do fast reduction and integration entirely on the GPU. Is my understanding correct?
Thank you, Karsten
Yes, boost.odeint and boost.python should solve your problem. You can use odeint with Thrust. There are also some OpenCL libraries (VexCL, ViennaCL) which might be easier to use than Thrust. Have a look at this paper for a comparison and for use cases of odeint on GPUs.
Boost.python can handle the communication between the C++ application and Python. Another approach would be a very slim command-line application that solves the ODE using boost.odeint and is controlled entirely by your Python application.
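To make the second approach more concrete, here is a minimal sketch of what the Python side could look like. It is only an illustration under several assumptions: the text protocol (one line with `t_end dt`, one line with the initial state, final state printed on one line), the `solve` function, and its `command` parameter are all hypothetical. So that the sketch runs on its own, the solver is replaced by a tiny stand-in script that integrates dx/dt = -x with explicit Euler; in practice `command` would point at your compiled odeint program instead.

```python
import subprocess
import sys

# Stand-in for the compiled solver. ASSUMPTION: the real one would be a C++
# program built on boost.odeint that speaks the same (hypothetical) protocol.
# It reads "t_end dt" from the first line of stdin and the initial state from
# the second, integrates dx/dt = -x with explicit Euler, and prints the final
# state on a single line.
STAND_IN_SOLVER = r"""
import sys
t_end, dt = map(float, sys.stdin.readline().split())
x = [float(v) for v in sys.stdin.readline().split()]
steps = int(round(t_end / dt))
for _ in range(steps):
    x = [xi - dt * xi for xi in x]
print(" ".join(repr(xi) for xi in x))
"""


def solve(initial_state, t_end, dt, command=None):
    """Send the problem to an external solver process and return the final state."""
    if command is None:
        # Fall back to the stand-in so the sketch is runnable without a C++ build.
        command = [sys.executable, "-c", STAND_IN_SOLVER]
    state_line = " ".join(repr(float(v)) for v in initial_state)
    payload = "{} {}\n{}\n".format(t_end, dt, state_line)
    # One process call per integration: the whole time loop runs inside the
    # solver, so the state crosses the Python boundary only at start and end.
    result = subprocess.run(
        command, input=payload, capture_output=True, text=True, check=True
    )
    return [float(v) for v in result.stdout.split()]


if __name__ == "__main__":
    print(solve([1.0, 2.0, 3.0], t_end=1.0, dt=0.001))
```

The point of the design is that Python hands over the initial state once and reads the result once, instead of exchanging the 9000-element array on every step. For large states or many output times, a binary file or pipe format would likely be a better fit than text.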