c++ python embedding extending

Speed - embedding Python in C++ or extending Python with C++


I have some big MySQL databases holding data for calculations, plus some parts where I need to fetch data from external websites.

Until now I have used Python for the whole thing, but what shall I say: it's not a speedster.

Now I'm thinking about mixing Python with C++ using Boost.Python and the Python C API.

The question I have now is: what is the better way to gain some speed? Should I extend Python with some C++ code, or should I embed Python code in a C++ program?

I will certainly gain some speed by using C++ code for the calculation parts. I think that calling the Python interpreter from inside a C application will not be any better, because the Python interpreter would be running the whole time anyway. I would also have to wrap Python libraries like mysqldb or urllib3 to have a nice way to work with them inside C++.

So which would you suggest is the better way to go: extending or embedding? (I love the Python language, but I'm also familiar with C++ and respect it for its speed.)

Update: I switched some parts from Python to C++ and used multithreading (the real kind) in my C modules, and my program now needs 30 minutes instead of 7 hours :))))


Solution

  • In principle, I agree with the first two answers. Anything coming from disk or across a network connection is likely to be a bigger bottleneck than the application.

    All the research of the last 50 years indicates that people often have inaccurate intuition about system performance issues. So IMHO, you really need to gather some evidence by measuring what is actually happening, then choose a solution based on that evidence.

    To confirm what is causing the slow performance, measure the system and user time of your application (e.g. time python prog.py), and measure the load on the machine.

    If the application is maxing out the CPU, and most of that time is spent in the application (user time), then there may be a case for using a more efficient technology for the application.

    But if the CPU is not maxed out, or the application spends most of its time in the system (system time) rather than in the application (user time), then it is unlikely that changing the application programming technology will help significantly. (This is an example of Amdahl's Law: http://en.wikipedia.org/wiki/Amdahl%27s_law)

    You may also need to measure the performance of your database server, and maybe the network connection, to identify the source of the bottleneck, but start with the easiest part.
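
To make the measuring advice above concrete, here is a minimal sketch of doing it from inside Python instead of with the shell's time command. It compares wall-clock time with user and system CPU time for one run, then plugs the user-time share into Amdahl's Law to estimate the best overall speedup you could hope for if only the calculation part got faster. The functions cpu_bound, io_bound and workload are hypothetical stand-ins for your real code, and treating user time as "the part C++ could speed up" is a simplifying assumption.

```python
import os
import time


def measure(func, *args):
    """Run func(*args); return (wall, user, system) seconds for this process."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    func(*args)
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return (
        wall_after - wall_before,
        cpu_after.user - cpu_before.user,
        cpu_after.system - cpu_before.system,
    )


def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the run time becomes `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def cpu_bound(n=200_000):
    # Hypothetical stand-in for the calculation part of the program.
    return sum(i * i for i in range(n))


def io_bound(seconds=0.2):
    # Hypothetical stand-in for waiting on the database or a website.
    time.sleep(seconds)


def workload():
    cpu_bound()
    io_bound()


if __name__ == "__main__":
    wall, user, system = measure(workload)
    # Share of the elapsed time spent running our own code on the CPU.
    cpu_fraction = min(user / wall, 1.0) if wall > 0 else 0.0
    print(f"wall={wall:.3f}s user={user:.3f}s system={system:.3f}s")
    print(f"user-time share of wall time: {cpu_fraction:.0%}")
    for factor in (2, 10, 100):
        overall = amdahl_speedup(cpu_fraction, factor)
        print(f"calculations {factor}x faster -> about {overall:.2f}x overall")
```

If the user-time share comes out small, even a 100x faster calculation barely changes the total run time, which is the point of the answer above. For a real program you would wrap your actual entry point instead of workload, and probably repeat the run a few times, since the CPU-time counters have fairly coarse resolution.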