Tags: jms, producer-consumer, apache-storm

Storm as a replacement for Multi-threaded Consumer/Producer approach to process high volumes?


We have an existing setup where upstream systems send us messages on a message queue and we process them. The content is XML, which we simply unmarshal. The unmarshalling step is followed by a write to the DB (putting the relevant values into the relevant columns). The system is set to interface with many more upstream systems, and our volumes are going to increase to a peak of 40 million messages per day.

Our current approach is to have listeners on the queues, backed by multiple producer and consumer threads that do the unmarshalling and the subsequent DB write.

My question: can this process fit the Storm use case? That is, can the MQ be my spout, followed by two bolts: one that unmarshals, whose output then feeds the next bolt, which does the write to the DB?

If yes, what benefit can I derive? Is it goodbye to the cumbersome multi-threaded producer/worker pattern of code? If it is as simple as the above, then where and why would one still resort to the conventional multi-threaded producer/consumer approach? My point being: is there a data volume or frequency at which Storm starts to shine compared to the conventional approach?

PS: I'm very new to this, trying to get the hang of it, and want to ascertain whether this line of thinking is right.

Regards, CVM


Solution

  • This scenario can definitely fit into a Storm topology. The spouts can pull from the MQ, and the bolts can handle the unmarshalling and subsequent processing.

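    The major benefit over the conventional multi-threaded pattern is the ability to add more worker nodes as the load increases. This is not so easy with traditional producer/consumer patterns, where the threads live in a single process and scaling out across machines has to be built by hand.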

    As for a specific data volume at which Storm starts to pay off, that is too broad a question to answer with a single number, since it depends on a large number of factors, such as hardware.
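
For contrast, here is a minimal sketch of the conventional in-process version of the same two-stage pipeline, using only `java.util.concurrent`. It is not the asker's actual code: the class name `PipelineSketch`, the `<msg>` wrapper, the string-based `unmarshal` stand-in, and the in-memory list that plays the database are all assumptions made for illustration. The inbound queue plays the role the spout would have in Storm, and the two thread pools play the roles of the unmarshal bolt and the DB-write bolt.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class PipelineSketch {
    // Sentinel that tells a worker to stop; compared by identity, never by content.
    private static final String POISON = new String("POISON");

    // Hypothetical stand-in for XML unmarshalling: strips a <msg>...</msg> wrapper.
    public static String unmarshal(String xml) {
        return xml.replace("<msg>", "").replace("</msg>", "").trim();
    }

    public static List<String> run(List<String> messages, int unmarshalThreads, int writerThreads)
            throws InterruptedException {
        BlockingQueue<String> inbound = new LinkedBlockingQueue<>(); // plays the MQ
        BlockingQueue<String> parsed = new LinkedBlockingQueue<>();  // hand-off between stages
        List<String> written = Collections.synchronizedList(new ArrayList<>()); // plays the DB

        ExecutorService unmarshallers = Executors.newFixedThreadPool(unmarshalThreads);
        ExecutorService writers = Executors.newFixedThreadPool(writerThreads);

        // Stage 1: take raw messages, unmarshal them, pass the result on.
        for (int i = 0; i < unmarshalThreads; i++) {
            unmarshallers.submit(() -> {
                try {
                    for (String m = inbound.take(); m != POISON; m = inbound.take()) {
                        parsed.put(unmarshal(m));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // Stage 2: take unmarshalled values and "write" them.
        for (int i = 0; i < writerThreads; i++) {
            writers.submit(() -> {
                try {
                    for (String row = parsed.take(); row != POISON; row = parsed.take()) {
                        written.add(row);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Feed the messages, then one sentinel per stage-1 worker.
        for (String m : messages) {
            inbound.put(m);
        }
        for (int i = 0; i < unmarshalThreads; i++) {
            inbound.put(POISON);
        }
        unmarshallers.shutdown();
        unmarshallers.awaitTermination(1, TimeUnit.MINUTES);

        // Stage 1 has drained, so stage 2 can be told to stop as well.
        for (int i = 0; i < writerThreads; i++) {
            parsed.put(POISON);
        }
        writers.shutdown();
        writers.awaitTermination(1, TimeUnit.MINUTES);
        return written;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> rows = run(List.of("<msg>a</msg>", "<msg>b</msg>", "<msg>c</msg>"), 2, 2);
        System.out.println(rows.size() + " rows written");
    }
}
```

Note that the thread counts, the queues, and the shutdown ordering are all managed by hand and confined to one JVM. In a Storm topology the equivalent knobs would be the parallelism of each spout and bolt and the number of workers, with the hand-off between stages handled by the framework.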