I've stumbled upon this article that purports to contrast Samza with Storm, but it seems to address only implementation details.
Where do these two distributed computation engines differ in their use cases? What kind of job is each tool good for?
The biggest difference between Apache Storm and Apache Samza comes down to how each one streams data through its processing steps.
Apache Storm performs real-time computation using a topology, which is fed into a cluster where the master node distributes the code among worker nodes that execute it. In a topology, data is passed between spouts, which emit data streams as immutable sets of key-value pairs, and bolts, which transform those streams.
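To make the topology idea concrete, here is a minimal sketch in plain Java. It does not use the Storm API: the `Spout` and `Bolt` interfaces, `ToyTopology`, and `wordCount` are hypothetical names made up for illustration, and everything runs on one thread rather than on a cluster. It only shows the shape of the data flow: a source emits immutable key-value tuples, and each processing step (called a bolt in Storm) consumes a tuple and may emit new ones.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

/** Toy in-process model of a spout -> bolt -> bolt pipeline. NOT the Storm API. */
public class ToyTopology {

    /** A source of tuples; a tuple here is an immutable map of field name to value. */
    public interface Spout {
        /** Returns the next tuple, or null when the source is exhausted. */
        Map<String, Object> nextTuple();
    }

    /** A processing step: consumes one tuple and emits zero or more tuples downstream. */
    public interface Bolt {
        void execute(Map<String, Object> tuple, Consumer<Map<String, Object>> emit);
    }

    /** Emits one {"sentence": ...} tuple per input sentence. */
    static Spout sentenceSpout(List<String> sentences) {
        Iterator<String> it = sentences.iterator();
        return () -> it.hasNext() ? Map.<String, Object>of("sentence", it.next()) : null;
    }

    /** Splits each sentence into one {"word": ...} tuple per word. */
    static final Bolt SPLIT = (tuple, emit) -> {
        String sentence = (String) tuple.get("sentence");
        for (String w : sentence.toLowerCase().split("\\s+")) {
            if (!w.isEmpty()) {
                emit.accept(Map.<String, Object>of("word", w));
            }
        }
    };

    /** Wires spout -> SPLIT -> counting bolt and returns the word counts, sorted by word. */
    public static Map<String, Integer> wordCount(List<String> sentences) {
        Map<String, Integer> counts = new TreeMap<>();
        // Terminal bolt: updates a running count and emits nothing further.
        Bolt count = (tuple, emit) -> counts.merge((String) tuple.get("word"), 1, Integer::sum);

        Spout spout = sentenceSpout(sentences);
        Map<String, Object> tuple;
        while ((tuple = spout.nextTuple()) != null) {
            SPLIT.execute(tuple, word -> count.execute(word, ignored -> { }));
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {dog=1, fox=1, lazy=1, quick=1, the=2}
        System.out.println(wordCount(List.of("the quick fox", "the lazy dog")));
    }
}
```

In a real cluster the spout and each bolt would run as parallel tasks on different workers; the sketch collapses that into nested callbacks so the tuple flow is easy to follow.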
[Figure: Apache Storm's architecture]
Apache Samza processes messages one at a time as they arrive. Each stream is divided into partitions; a partition is an ordered sequence of messages, and each message has a unique ID within its partition. Samza supports batching and is typically used with Hadoop's YARN and Apache Kafka.
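The partitioning model can be sketched in plain Java as well. This is not the Samza or Kafka API: `ToyPartitionedStream`, `Envelope`, and `Task` are hypothetical names, the stream lives in memory, and two choices are assumptions of the sketch: a message's ID is simply its position (offset) in the partition, and messages are routed to partitions by key hash. The point is only to show messages with the same key landing in one ordered partition and being handed to a task one at a time.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory model of a partitioned stream. NOT the Samza or Kafka API. */
public class ToyPartitionedStream {

    /** One message plus the coordinates that identify it within the stream. */
    public static final class Envelope {
        public final int partition;
        public final long offset;   // unique within the partition (assumed: list position)
        public final String key;
        public final String value;

        Envelope(int partition, long offset, String key, String value) {
            this.partition = partition;
            this.offset = offset;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return "p" + partition + "@" + offset + " " + key + "=" + value;
        }
    }

    /** Hypothetical per-message callback, invoked once for each message. */
    public interface Task {
        void process(Envelope envelope);
    }

    private final List<List<Envelope>> partitions = new ArrayList<>();

    public ToyPartitionedStream(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Appends to the partition chosen by key hash, so one key always maps to one partition. */
    public Envelope send(String key, String value) {
        int p = Math.floorMod(key.hashCode(), partitions.size());
        List<Envelope> log = partitions.get(p);
        Envelope e = new Envelope(p, log.size(), key, value);
        log.add(e);
        return e;
    }

    /** Delivers one partition's messages to the task one at a time, in offset order. */
    public void consume(int partition, Task task) {
        for (Envelope e : partitions.get(partition)) {
            task.process(e);
        }
    }

    public static void main(String[] args) {
        ToyPartitionedStream stream = new ToyPartitionedStream(2);
        stream.send("user-1", "login");
        stream.send("user-2", "login");
        stream.send("user-1", "click");
        for (int p = 0; p < 2; p++) {
            stream.consume(p, e -> System.out.println(e));
        }
    }
}
```

Ordering is guaranteed only within a partition, which is why the sketch keys both `user-1` events to the same partition: a task reading that partition sees `login` before `click`.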
[Figure: Apache Samza's architecture]
Read more below about how each system handles the specifics.
USE CASE
Apache Samza was created by LinkedIn.
A software engineer wrote a post stating: