Tags: java, multithreading, servlets, jms, sonicmq

How to read and collect response messages from JMS and make them available to the correct servlet thread?


We've got a synchronous HTTP request-response front end for an asynchronous JMS messaging system.

For each HTTP request, the query servlet creates a corresponding JMS message in a query queue. The query is processed by a back end, which produces a couple of response messages for it. What is a good way to organize the receipt of these response messages from JMS, and to make sure they reach the correct servlet thread so that it can formulate the HTTP response?

The queries and responses are non-transactional and need not be persisted. Most of them are read queries. If no response is read within 45 seconds, the servlet generates a timeout response. Throughput is important, however: we need to process an ever-increasing number of queries. The system is about ten years old and will have to stay up and running for two more years or so.

We're using SonicMQ. We have created one queue for all responses. The servlet container has one connection to the broker, which it uses for both reading and writing. We spawn one listener thread per logged-in user (about 1,500 concurrently). Each listener thread has a receiver with a message selector that only selects response messages for its particular user. Once a servlet thread has sent its query message, it waits for the user's listener thread to notify it that a response has been read.
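In outline, the hand-off between the user's listener thread and the waiting servlet thread looks like this (a minimal pure-JDK sketch; class and method names are illustrative, and the actual JMS receiver with its message selector is elided):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class UserResponseHandoff {
    // One queue per logged-in user; the listener thread offers, the servlet thread polls.
    private final BlockingQueue<String> responses = new ArrayBlockingQueue<>(16);

    // Called by the user's listener thread when its message selector matches a response.
    public void onResponse(String body) {
        responses.offer(body);
    }

    // Called by the servlet thread after it has sent its query message.
    // Blocks for up to the 45-second timeout described above.
    public String awaitResponse() throws InterruptedException {
        String body = responses.poll(45, TimeUnit.SECONDS);
        return (body != null) ? body : "TIMEOUT";
    }

    public static void main(String[] args) throws Exception {
        UserResponseHandoff handoff = new UserResponseHandoff();
        // Simulates the listener thread delivering a response for this user.
        new Thread(() -> handoff.onResponse("result-for-alice")).start();
        System.out.println(handoff.awaitResponse());
    }
}
```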

We used to have one single QueueSession shared by all senders and all receivers. This actually worked(!), even though a JMS Session is officially not thread-safe. Creating one QueueSession per thread (for the servlet threads and the listener threads) improved performance somewhat, but things still aren't stable and we'd like to organize this better.

I've tried creating a temporary queue per user session instead of a single queue with message selectors, but that slowed things down considerably.

What would be a better/the proper way to organize this?


Solution

  • I started this as a comment, but it kind of grew.

    From your description, it sounds like every request will need at least two threads and possibly more. If you're already at 1500 concurrent users and your queries are enough work that you needed to farm them out to other nodes, I'd say you're already well into dangerous territory as far as how many active threads will run effectively in a JVM without hefty CPU/memory allocation and some serious tweaking of settings.

    My comment about removing JMS was because from the servlet side of your app, you're just doing a lot of extra work to make JMS into a synchronous request/response mechanism when a simple thread pool would serve just as well for being able to run multiple concurrent queries in response to an HTTP request. It sounds like JMS is a decent way for the back end to receive work requests, though, so a major rewrite probably isn't merited.

    I think a better way to organize this would be a set of consumers per Tomcat instance instead of a consumer per request thread. Each webhead can have its own response queue, or use message selectors on a single queue. Then when a request comes in, send a request JMS message and leave a way for the consumer to call back to the requesting thread, like a SynchronousQueue that the caller is waiting to take() from. If you have to wait for multiple messages to serve a single request, you could combine a ConcurrentLinkedQueue to collect the responses with a CountDownLatch to signal the requesting thread when they have all been received. This way, a relatively small pool of threads is responsible for receiving messages as they come in. I feel like there should be something out there that will help you solve this problem, but I can't think of anything off-hand.

    After that, if you still find performance to be a problem, you can scale out by adding Tomcat instances, or look into non-blocking I/O for HTTP request handling, applying the same strategy at the front door as at the back: use a small pool of threads to handle a large request volume, wherever waiting would tie up a lot of threads in a thread-per-request model.
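The ConcurrentLinkedQueue-plus-CountDownLatch hand-off described in the answer can be sketched in plain java.util.concurrent (class and method names are illustrative; the consumer-pool thread that would read off the JMS response queue is simulated here by direct calls to dispatch()):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ResponseDispatcher {
    // Pending requests: correlation ID -> collector for that request's responses.
    private final Map<String, Collector> pending = new ConcurrentHashMap<>();

    static class Collector {
        final ConcurrentLinkedQueue<String> responses = new ConcurrentLinkedQueue<>();
        final CountDownLatch done;
        Collector(int expected) { done = new CountDownLatch(expected); }
    }

    // Register BEFORE sending the JMS request, so no response can slip past unnoticed.
    public void register(String correlationId, int expectedResponses) {
        pending.put(correlationId, new Collector(expectedResponses));
    }

    // Called by a consumer-pool thread for each message read off the response queue.
    public void dispatch(String correlationId, String body) {
        Collector c = pending.get(correlationId);
        if (c != null) {           // ignore responses for requests that already timed out
            c.responses.add(body);
            c.done.countDown();
        }
    }

    // Called by the request (servlet) thread; waits up to the timeout for all responses.
    // On timeout, returns whatever responses have arrived so far.
    public List<String> await(String correlationId, long timeout, TimeUnit unit)
            throws InterruptedException {
        Collector c = pending.get(correlationId);
        try {
            c.done.await(timeout, unit);
            return new ArrayList<>(c.responses);
        } finally {
            pending.remove(correlationId); // always clean up, even on timeout
        }
    }
}
```

A request thread would register with the number of responses it expects, send the JMS message carrying that correlation ID, then call await with the 45-second timeout; the small consumer pool routes every incoming response by correlation ID instead of one listener thread per user.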