apache-storm, stream-processing, apache-apex, bigdata

How is Apache Apex different from Apache Storm?


Apache Apex looks similar to Apache Storm.

  • On both platforms, users build an application/topology as a directed acyclic graph (DAG). Apex uses operators/streams and Storm uses spouts/streams/bolts.
  • Both process data in real time, as opposed to batch processing.
  • Both seem to have high throughput and low latency.

So, at a glance, both look similar and I don't quite see the difference. Can someone please explain what the key differences are? In other words, when should I use one instead of the other?


Solution

  • There are fundamental differences in architecture that make the two platforms very different in terms of latency, scaling, and state management.

    At the most basic level:

    1. Apache Storm uses record acknowledgement to guarantee message delivery.
    2. Apache Apex uses checkpointing to guarantee message delivery.
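
To make the contrast concrete, here is a toy simulation of the two recovery styles. It is only a sketch and does not use the Storm or Apex APIs: the function names, the `fail_once_on`, `checkpoint_every`, and `crash_at` parameters, and the single-process model are all assumptions made for illustration. The first function replays an individual record until it is acknowledged; the second snapshots state periodically and, after a crash, restores the last snapshot and redoes everything processed since.

```python
def process_with_acks(records, fail_once_on):
    """Per-record acknowledgement (hypothetical model): replay a record until acked.

    Assumes record values are unique so they can be tracked in a set.
    """
    pending = list(records)      # records the source has not yet seen acked
    failed_already = set()
    total, attempts = 0, 0
    while pending:
        record = pending.pop(0)
        attempts += 1
        if record in fail_once_on and record not in failed_already:
            failed_already.add(record)
            pending.append(record)   # no ack arrived: replay only this record
            continue
        total += record              # processed successfully, ack sent to source
    return total, attempts


def process_with_checkpoints(records, checkpoint_every, crash_at):
    """Checkpointing (hypothetical model): snapshot state, restore it after a crash."""
    state = {"position": 0, "total": 0}
    checkpoint = dict(state)         # last saved snapshot of the state
    crashed = False
    attempts = 0
    while state["position"] < len(records):
        if state["position"] == crash_at and not crashed:
            crashed = True
            state = dict(checkpoint) # roll back; work since the snapshot is redone
            continue
        state["total"] += records[state["position"]]
        state["position"] += 1
        attempts += 1
        if state["position"] % checkpoint_every == 0:
            checkpoint = dict(state) # periodic snapshot, not one per record
    return state["total"], attempts


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    print(process_with_acks(data, fail_once_on={3}))
    print(process_with_checkpoints(data, checkpoint_every=3, crash_at=5))
```

In this toy run both approaches arrive at the same total of 21. The acknowledgement version needs 7 processing attempts because only the failed record is replayed, while the checkpoint version needs 8 because the two records handled after the last snapshot are redone. The trade-off the sketch hints at is bookkeeping on every record versus occasional snapshots with a larger replay after a failure.
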

    The following blog post covers more of the differences and also includes the other major stream processing platforms:

    https://databaseline.wordpress.com/2016/03/12/an-overview-of-apache-streaming-technologies/