Search code examples
.net-4.0parallel-processingmsmqtask-parallel-library

How to process MSMQ messages in parallel


I'm writing a windows service to consume MSMQ messages. The service will have periods of high activity (80k messages coming in very quickly) and long periods of inactivity (could be several days without a new message).

Processing the messages is very network-bound, so I get a big benefit out of parallelism. But during periods of inactivity, I don't want to tie up a bunch of threads waiting for messages that aren't coming anytime soon.

The MSMQ interface seems to be very focused on a synchronous workflow - get one message, process it, get another, etc. How should I structure my code so that I can take advantage of parallelism during periods of high activity but not tie up a bunch of threads during periods of no activity? Bonus points for using the TPL. Pseudocode would be appreciated.


Solution

  • I have done allot of MSMQ (including mobile implementations) over the years and you are correct in the characterization of "synchronous workflow." It's not that you can't take the various message envelops and process them across the different cores via TPL ... the limiting factor is reading / writing to the queue ... inherently a serial operation. You can't send 8 messages at once (a computer with 8 cores) for example.

    I had a similar need (without using the System.Messaging namespace) and solved it with some help from a book I read "Parallel Programming with Microsoft.NET" by Campbell and Johnson.

    Check out their “parallel tasks” chapter and specifically the part of using a global queue that cooperates with per-thread local queues for work processing (i.e., the TPL) that use a “work stealing” algorithm to perform load balancing. I modeled my solution, in part, after their example. The final version of my system had a huge difference in its performance (from 23 messages per second to over 200).

    Depending on how long it takes your system to go from 0 to the 80,000, you’ll want to take the same design and spread it across multiple servers (each with multiple processors and multiple cores). In theory my setup would require a little less than 7 minute to polish off all 80K, so by adding a 2nd computer it would cut that down to about ~3 minutes and 20 seconds, etc., etc., etc. The trick is the work stealing logic.

    Food for thought …

    A quick edit: BTW the computer is a Dell T7500 workstation with dual quad core Xeons @ 3GHz, 24 GB of RAM, Windows 7 Ultimate 64-bit edition.