
Message queuing solution for millions of topics


I'm thinking about a system that will notify multiple consumers about events happening to a population of objects. Every subscriber should be able to subscribe to events for zero or more of the objects, and multiple subscribers should be able to receive events for a single object.

I think some message queuing system would be appropriate here, but I'm not sure how to handle the fact that I'll have millions of objects. Using a separate topic for each object does not sound good [or is it just fine?].

Can you please suggest an approach I should take, and maybe even an open source message queuing system that would be a reasonable fit?

A few more details:

  • there will be thousands of subscribers [i.e. not that many],
  • subscribers will subscribe to tens or hundreds of objects each,
  • there will be ~5-20 million objects,
  • events themselves don't have to carry any payload; just the information that an object was changed is enough,
  • the vast majority of objects will never be subscribed to,
  • events occur at a maximum rate of a few hundred per second,
  • ideally the server should run under Linux and be able to integrate with the rest of the ecosystem via HTTP long-poll [using Node.js? continuations under Jetty?].

Thanks in advance for your feedback, and sorry for the somewhat vague question!


Solution

  • Break up the topics by event type, e.g. an "Object updated" topic, an "Object deleted" topic, and so on. Clients then only need to subscribe to the finite number of event-type topics they are interested in.

    Add headers to your messages when you publish them, and put the intelligence into the clients to use these headers as message selectors. For example, the client knows the list of objects it is interested in; if you identify each object by an "id", the id can be a header, and the client uses that "id" header to decide whether it is interested in the message.

    Depending on your requirements, you may also want guaranteed delivery, so that a client still receives the message even if it goes offline and comes back later.

    Off the top of my head, the options I would recommend are ActiveMQ, RabbitMQ and Redis Pub/Sub (I haven't really worked with Redis Pub/Sub, so please do your own due diligence).

    Finally here are some performance benchmarks for RabbitMQ and Redis

    I just saw that you only have a few hundred messages pushed out per second. This is not a big deal for ActiveMQ: I have been using ActiveMQ on a system that processes 240 messages per second, and it works just fine. I do use a thread pool of workers to process the messages asynchronously, though. Look at a framework like Akka if you are in Java land; if not, stick with Node.js and the ecosystem around it.
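
The event-type topics and "id" header filtering described above can be sketched in a few lines. This is only an in-memory illustration, not code for any particular broker: the `EventTopic` class, its methods, and the topic name `object.updated` are hypothetical names chosen for the example. With a real broker the same idea would map onto that broker's own filtering mechanism rather than onto this class.

```python
class EventTopic:
    """In-memory stand-in for one event-type topic (hypothetical, not a broker API)."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []  # list of (interesting_ids, callback) pairs

    def subscribe(self, interesting_ids, callback):
        # Each client registers the set of object ids it cares about.
        self._subscribers.append((frozenset(interesting_ids), callback))

    def publish(self, object_id):
        # The message carries no payload, only an "id" header.
        headers = {"id": object_id}
        delivered = 0
        for interesting_ids, callback in self._subscribers:
            # Selector step: deliver only if the header matches the client's list.
            if headers["id"] in interesting_ids:
                callback(self.name, headers)
                delivered += 1
        return delivered


# One topic per event type, not one topic per object.
updated = EventTopic("object.updated")

inbox_a = []
inbox_b = []
updated.subscribe({101, 202}, lambda topic, h: inbox_a.append((topic, h["id"])))
updated.subscribe({202}, lambda topic, h: inbox_b.append((topic, h["id"])))

updated.publish(202)   # both clients are interested
updated.publish(101)   # only the first client is interested
updated.publish(999)   # nobody subscribed to this object; the event is dropped
```

The number of topics stays equal to the number of event types, however many millions of objects exist, and objects nobody subscribes to cost nothing beyond the publish itself.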