
Python multiple processes instead of threads?


I am working on a web backend that frequently grabs real-time market data from the web and puts the data in a MySQL database.

Currently I have my main thread push tasks into a Queue object. I then have about 20 threads that read from that queue, and if a task is available, they execute it.
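The setup described above can be sketched roughly like this (a minimal stand-in; the real tasks would fetch market data and write to MySQL, and the worker count and sentinel-based shutdown are assumptions, not the asker's actual code):

```python
import queue
import threading

NUM_WORKERS = 20

def worker(tasks: "queue.Queue") -> None:
    """Pull tasks off the shared queue and run them until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:          # sentinel: shut this worker down
            tasks.task_done()
            break
        task()                    # e.g. fetch one symbol and insert it into MySQL
        tasks.task_done()

tasks: "queue.Queue" = queue.Queue()
threads = [threading.Thread(target=worker, args=(tasks,)) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()

# Main thread pushes tasks; here a trivial computation stands in for a web fetch.
results = []
for i in range(5):
    tasks.put(lambda i=i: results.append(i * i))

tasks.join()                      # wait until every queued task is done
for _ in threads:
    tasks.put(None)               # one sentinel per worker so they all exit
for t in threads:
    t.join()
```

Because the work here is I/O-bound (network fetches, database writes), threads like these mostly block on I/O and release the GIL while doing so; the GIL only becomes the bottleneck once there is significant CPU work per task.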

Unfortunately, I am running into performance issues, and after doing a lot of research, I can't make up my mind.

As I see it, I have 3 options: Should I take a distributed task approach with something like Celery? Should I switch to Jython or IronPython to avoid the GIL issues? Or should I simply spawn processes instead of threads using the multiprocessing module? If I go for the latter, how many processes is a good amount? What is a good multiprocess producer/consumer design?
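For reference, the third option (processes instead of threads) might look like the sketch below, built on `multiprocessing`'s `JoinableQueue`. The worker body, task payloads, and the `cpu_count()` starting point are illustrative assumptions, not a recommendation specific to this workload:

```python
import multiprocessing as mp

def worker(tasks: "mp.JoinableQueue", results: "mp.Queue") -> None:
    """Consume tasks until a None sentinel arrives; push results back to the parent."""
    while True:
        task = tasks.get()
        if task is None:               # sentinel: shut this process down
            tasks.task_done()
            break
        results.put(task * task)       # stand-in for fetching + storing one symbol
        tasks.task_done()

def main() -> list:
    tasks: "mp.JoinableQueue" = mp.JoinableQueue()
    results: "mp.Queue" = mp.Queue()
    n_procs = mp.cpu_count()           # a common starting point for CPU-bound work
    procs = [mp.Process(target=worker, args=(tasks, results)) for _ in range(n_procs)]
    for p in procs:
        p.start()

    n_tasks = 10
    for i in range(n_tasks):           # producer: the main process pushes tasks
        tasks.put(i)
    tasks.join()                       # wait for every task to be marked done

    for _ in procs:
        tasks.put(None)                # one sentinel per worker
    for p in procs:
        p.join()

    return [results.get() for _ in range(n_tasks)]

if __name__ == "__main__":
    squares = main()
    assert sorted(squares) == [i * i for i in range(10)]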

Thanks!


Solution

  • Maybe you should use an event-driven approach instead, with an event-driven framework like Twisted (Python) or node.js (JavaScript). These frameworks make use of UNIX domain sockets: your consumer listens on a socket, and your event-generator object pushes all the info to the consumer, so the consumer doesn't have to poll the queue to see whether anything is waiting.
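To make the idea concrete without pulling in Twisted, here is a minimal sketch of the same push-instead-of-poll pattern using the standard library's `asyncio` (a modern alternative, not what the answer originally named; the ticker symbols and queue layout are illustrative):

```python
import asyncio

async def consumer(events: "asyncio.Queue") -> list:
    """Sleeps inside `await` until the producer pushes something -- no polling loop."""
    seen = []
    while True:
        item = await events.get()   # wakes only when an event actually arrives
        if item is None:            # sentinel: no more events
            break
        seen.append(item)           # e.g. write the market update to MySQL
    return seen

async def producer(events: "asyncio.Queue") -> None:
    for tick in ("AAPL", "GOOG", "MSFT"):
        events.put_nowait(tick)     # stand-in for a real-time market-data update
        await asyncio.sleep(0)      # yield control so the consumer can run
    events.put_nowait(None)

async def main() -> list:
    events: "asyncio.Queue" = asyncio.Queue()
    results, _ = await asyncio.gather(consumer(events), producer(events))
    return results

if __name__ == "__main__":
    print(asyncio.run(main()))      # the consumer receives events in push order
```

The key point is that the consumer is suspended at `await events.get()` and is woken by the event loop only when data arrives, rather than repeatedly checking the queue.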