apache-kafka kafka-producer-api akka-stream alpakka

Benefits of Alpakka Kafka Producer over plain Kafka Producer


I have an application that receives an API request and relays it to a Kafka producer. Each request calls the producer to send a message to Kafka. The producer lives for the lifetime of the application and is shared across all requests.

producer.send(new ProducerRecord[String, String](topic, requestBody))

This works OK. Now I want to use an Alpakka Producer for the job instead. The code looks like this:

val kafkaProducer = producerSettings.createKafkaProducer()
val settingsWithProducer = producerSettings.withProducer(kafkaProducer)

val done = Source.single(requestBody)
  .map(value => new ProducerRecord[String, String](topic, value))
  .runWith(Producer.plainSink(settingsWithProducer)) 

What are the advantages of the Alpakka Producer over the plain Kafka Producer? I don't know whether the new approach can help me handle a large number of API requests in order at the same time.


Solution

  • For the case of producing a single message to a Kafka topic, the Alpakka Producer sink that you're using doesn't really offer a benefit (the only tangential one might be if you want to use Akka Discovery to discover your Kafka brokers). Alpakka's SendProducer might be useful in your Scala code for that case: it exposes a Scala Future instead of a Java Future.
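
    As a rough illustration of the SendProducer option, here is a minimal sketch. It assumes Alpakka Kafka 2.0 or later, a broker at localhost:9092, and String keys and values; the object name and the relay helper are hypothetical, not taken from the question.

    ```scala
    import akka.actor.ActorSystem
    import akka.kafka.ProducerSettings
    import akka.kafka.scaladsl.SendProducer
    import org.apache.kafka.clients.producer.{ProducerRecord, RecordMetadata}
    import org.apache.kafka.common.serialization.StringSerializer

    import scala.concurrent.Future

    object SendProducerSketch {
      implicit val system: ActorSystem = ActorSystem("relay")

      // Assumption: a local broker; replace with your own bootstrap servers.
      val producerSettings: ProducerSettings[String, String] =
        ProducerSettings(system, new StringSerializer, new StringSerializer)
          .withBootstrapServers("localhost:9092")

      // Created once and shared across requests, like the plain KafkaProducer.
      val sendProducer: SendProducer[String, String] = SendProducer(producerSettings)

      // Hypothetical per-request helper: the result is a Scala Future, so it
      // composes with map/flatMap/recover in the request handler.
      def relay(topic: String, requestBody: String): Future[RecordMetadata] =
        sendProducer.send(new ProducerRecord[String, String](topic, requestBody))

      // On shutdown, sendProducer.close() returns a Future[Done].
    }
    ```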

    Where the Alpakka Producer sinks and flows shine is in a stream context where there's a sequence of elements that you want produced in order with backpressure, especially if the messages to be produced are the output of a complex stream topology.
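
    To make the stream case concrete, here is a sketch of a source with many elements flowing into Producer.plainSink. It assumes Akka 2.6 or later (so the materializer is derived from the implicit ActorSystem), a broker at localhost:9092, and a hypothetical topic named "events"; the numeric source stands in for whatever topology produces your messages.

    ```scala
    import akka.Done
    import akka.actor.ActorSystem
    import akka.kafka.ProducerSettings
    import akka.kafka.scaladsl.Producer
    import akka.stream.scaladsl.Source
    import org.apache.kafka.clients.producer.ProducerRecord
    import org.apache.kafka.common.serialization.StringSerializer

    import scala.concurrent.Future

    object StreamProducerSketch extends App {
      implicit val system: ActorSystem = ActorSystem("stream-sketch")

      // Assumption: a local broker; replace with your own bootstrap servers.
      val producerSettings: ProducerSettings[String, String] =
        ProducerSettings(system, new StringSerializer, new StringSerializer)
          .withBootstrapServers("localhost:9092")

      val topic = "events" // hypothetical topic name

      // Elements are handed to the sink in stream order, and the sink only
      // pulls more elements from upstream as it has capacity (backpressure).
      val done: Future[Done] =
        Source(1 to 1000)
          .map(n => s"event-$n") // stand-in for a more complex topology
          .map(value => new ProducerRecord[String, String](topic, value))
          .runWith(Producer.plainSink(producerSettings))

      // Shut the actor system down once the stream has completed.
      done.onComplete(_ => system.terminate())(system.dispatcher)
    }
    ```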

    I'm taking "large number of API requests" to mean HTTP/gRPC requests coming into your service and each request resulting in producing at most one message to Kafka. You can contort such a thing into a stream (e.g. feeding a stream via a Source.actorRef), but that's probably getting over-elaborate.
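
    For reference, one way such a contortion could look is sketched below. Note that it swaps in Source.queue for Source.actorRef, because offer returns a Future that tells the caller whether the element was accepted. It assumes Akka 2.6 or later and a broker at localhost:9092; the buffer size, topic name, and relay helper are hypothetical.

    ```scala
    import akka.Done
    import akka.actor.ActorSystem
    import akka.kafka.ProducerSettings
    import akka.kafka.scaladsl.Producer
    import akka.stream.{OverflowStrategy, QueueOfferResult}
    import akka.stream.scaladsl.{Keep, Source, SourceQueueWithComplete}
    import org.apache.kafka.clients.producer.ProducerRecord
    import org.apache.kafka.common.serialization.StringSerializer

    import scala.concurrent.Future

    object QueueFedProducerSketch {
      implicit val system: ActorSystem = ActorSystem("queue-sketch")

      // Assumption: a local broker; replace with your own bootstrap servers.
      val producerSettings: ProducerSettings[String, String] =
        ProducerSettings(system, new StringSerializer, new StringSerializer)
          .withBootstrapServers("localhost:9092")

      val topic = "events" // hypothetical topic name

      // One long-running stream for the whole application. With dropNew, an
      // offer made while the buffer is full completes with Dropped.
      val (queue, done): (SourceQueueWithComplete[String], Future[Done]) =
        Source
          .queue[String](256, OverflowStrategy.dropNew)
          .map(value => new ProducerRecord[String, String](topic, value))
          .toMat(Producer.plainSink(producerSettings))(Keep.both)
          .run()

      // Hypothetical per-request helper. The caller should check the result,
      // for example rejecting the request on QueueOfferResult.Dropped.
      def relay(requestBody: String): Future[QueueOfferResult] =
        queue.offer(requestBody)
    }
    ```

    An Enqueued result here only means the element entered the stream, not that Kafka has acknowledged it, which is part of why this route is more machinery than a single send usually needs.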

    As for "in order at the same time": that's kind of a contradiction, as "in order" somewhat rules out simultaneity. Are you thinking of a situation where you can partition the requests and want ordering within each partition of requests, but are OK with any ordering across partitions (note that I'm not necessarily implying anything about partitioning of the Kafka topic you're producing to)? In that case, Akka Streams (and likely actors) and the Producer sinks/flows will probably come in handy.