Search code examples
phpparallel-processingzeromqpcntl

Parallel processing in PHP using zeroMQ


Some background: I am building a server application in php that will need to execute a number of independent tasks on a user request. Theres is a severe requirement on speed for my application so I would like to execute all of those tasks in parallel.

I've looked at several solutions (e.g gearman, rabbitMQ, zeroMQ) and I've decided to go with zeroMQ (fast, good docs, flexible, and doesn't require a broker). This solves the communication/sync problem between the threads for me.

Question: I would like to initiate the tasks only when the server receives a request (not to have a long running process). So I receive a request -> start parallel computation -> return the result of the computation to the client. One solution for that seems to be pcntl_fork however the docs mention that there ares some issues with using it in a production server env but doesn't really specify what they are?

My other option is to use proc_open, but I like it less because it would require me to serialize the inputs in some way which seems less flexible and fast then forking. Does it have any advantages over pcntl_fork?

Is there another solution (still using php :p)?


Solution

  • Tread carefully, I see several red-flags in your question that lead me to believe you are concerned about things that maybe you don't need to be, and you probably aren't concerned with things you should be.

    You say you have a severe requirement for speed - have you validated that normal single threaded PHP is not fast enough? Run any benchmarks, figured out your bottlenecks? If your speed requirement is that great, you might even consider using a different language, for all of PHP's charms it's never going to be the most efficient hammer in the toolbox. Java is a good option for all-out speed, and node.js is a good option if your bottlenecks are IO dependent. My main concern is that, absent more information, this question smells of premature optimization. This may be unfair and you may have omitted those details because it wasn't the heart of your question, but as an outsider I at least wanted to make sure that you think about these things if you haven't already.

    You want to avoid long-running processes - why? There's nothing inherently wrong with long-running processes - but it does feel wrong when what you're used to is the pseudo-efficient "on-demand" nature of Apache+mod_php. Be sure you're not trying to avoid something just because you're not used to it.

    What you seem to be describing is performing parallel processing from within your PHP web-app - just like any other web page you write, Apache initiates your PHP script, that script forks another process and, rather than performing its actions serially, performs them in parallel, completes and returns to the user at the completion of the page-render. If that's correct, then here is the answer to your original question:

    You cannot use pcntl_fork from within a web process, only from the command line. The details of this are on the page you linked to, down in the comments:

    It's not a matter of "should not", it's "can not". Even though I have compiled in PCNTL with --enable-pcntl, it turns out that it only compiles in to the CLI version of PHP, not the Apache module. [...] function_exists('pcntl_fork') was returning false even though it compiled correctly. It turns out it returns true just fine from the CLI, and only returns false for HTTP requests. The same is true of ALL of the pcntl_*() functions.

    ... which means that either you'll have to initiate your forking process as a separate long-running process, or you'll have to start it on demand with proc_open, there is no way to get it to work the way I assume you want it to.