apache-flink, akka-stream

Akka streams vs Apache Flink


While exploring Akka Streams, I also came across Apache Flink, which is a stream processing engine. Akka Streams implements Reactive Streams and supports back pressure.

If I have to decide between the two, which one should I go for? How do they differ, and what are the similarities? What criteria should I use?


Solution

  • I am not an expert in Akka Streams, but as far as I know, the main difference is that Flink distributes processing across multiple nodes out of the box, while Akka Streams does not, since it was designed to process data on a single node.

    The similarity is that both offer stream processing capabilities, so in that sense their core functionality is probably comparable.

    However, Flink has additional modules such as SQL, CEP, and Machine Learning that you won't get in Akka Streams. Flink also provides fault tolerance and state recovery, and I am not sure whether Akka Streams offers these out of the box.

    On the other hand, setting up Akka Streams requires less work: you don't need to set up a JobManager and TaskManagers; you can simply write a Java/Scala application, dockerize it, and run it somewhere.

    So the main question to ask yourself is whether the data you are processing is large enough that it must be processed on multiple nodes. If it is, you really have no choice other than Flink (when the choice is strictly Akka Streams vs. Flink). If, however, the data can be processed on a single node, you should assess the fault tolerance and message delivery guarantees you need. In general, Akka Streams may be easier to start with, but Flink may have the advantage when it comes to productionizing the app.
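
To make the "simply create a Java/Scala application" point concrete, here is a minimal sketch of a single-node Akka Streams pipeline. It assumes Akka 2.6 or later (where an implicit ActorSystem supplies the materializer) with akka-stream on the classpath. The object name, method name, and the numbers being processed are hypothetical and chosen only for illustration.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Hypothetical example: the whole pipeline lives in one ordinary JVM process.
object SingleNodePipeline {

  // Doubles the numbers 1..10 and keeps those divisible by 3.
  // Back pressure between the stages is handled by Akka Streams itself.
  def run()(implicit system: ActorSystem): Future[Seq[Int]] =
    Source(1 to 10)
      .map(_ * 2)
      .filter(_ % 3 == 0)
      .runWith(Sink.seq)

  def main(args: Array[String]): Unit = {
    // Assumption: Akka 2.6+, so no explicit Materializer is needed.
    implicit val system: ActorSystem = ActorSystem("single-node-pipeline")
    try println(Await.result(run(), 10.seconds)) // contains 6, 12, 18
    finally system.terminate()
  }
}
```

There is no cluster, JobManager, or TaskManager involved here: the program starts, runs the stream inside its own process, and exits. That also means everything the sketch does stays on one node, which is the trade-off described above.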