In many Big Data situations it is preferable to work with a small buffer of records at a time, rather than processing each record individually.
The natural example is calling some external API that supports batching for efficiency.
How can we do this in Kafka Streams? I cannot find anything in the API that looks like what I want.
So far I have:
builder.stream[String, String]("my-input-topic")
  .mapValues(externalApiCall)
  .to("my-output-topic")
What I want is:
builder.stream[String, String]("my-input-topic")
  .batched(chunkSize = 2000)
  .map(externalBatchedApiCall)
  .to("my-output-topic")
In Scala and Akka Streams the function is called grouped or batch. In Spark Structured Streaming we can do mapPartitions(_.grouped(2000).map(externalBatchedApiCall)).
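For illustration, here is what grouped looks like in Akka Streams. This is just a sketch: the integer source and the externalBatchedApiCall stub are placeholders, and running it assumes Akka 2.6+ (where an implicit ActorSystem provides the materializer).

import akka.actor.ActorSystem
import akka.stream.scaladsl.Source

implicit val system: ActorSystem = ActorSystem("grouped-example")

// Placeholder for the real batched API call.
def externalBatchedApiCall(batch: Seq[Int]): Seq[Int] = batch

// grouped buffers up to 2000 elements into a Seq (the final group may be
// smaller), so the external API is invoked once per chunk, not per element.
Source(1 to 10000)
  .grouped(2000)
  .map(externalBatchedApiCall)
  .runForeach(results => println(results.size))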
It doesn't seem to exist yet. Watch this space: https://issues.apache.org/jira/browse/KAFKA-7432
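In the meantime you can drop down to the Processor API and buffer records yourself in a Transformer. Below is a minimal sketch, not a tested implementation: externalBatchedApiCall, the chunk size, and the 30-second flush interval are assumptions, and the imports assume a recent kafka-streams-scala.

import java.time.Duration
import scala.collection.mutable.ArrayBuffer
import org.apache.kafka.streams.KeyValue
import org.apache.kafka.streams.kstream.{Transformer, TransformerSupplier}
import org.apache.kafka.streams.processor.{ProcessorContext, PunctuationType}
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.serialization.Serdes._

// Placeholder: sends one chunk to the external service, one result per input pair.
def externalBatchedApiCall(batch: Seq[(String, String)]): Seq[(String, String)] = ???

class BatchingTransformer(chunkSize: Int)
    extends Transformer[String, String, KeyValue[String, String]] {

  private val buffer = ArrayBuffer.empty[(String, String)]
  private var context: ProcessorContext = _

  override def init(ctx: ProcessorContext): Unit = {
    context = ctx
    // Flush partial chunks periodically so a quiet topic cannot hold records forever.
    ctx.schedule(Duration.ofSeconds(30), PunctuationType.WALL_CLOCK_TIME, (_: Long) => flush())
  }

  // Buffer each record and emit nothing until a full chunk is ready.
  override def transform(key: String, value: String): KeyValue[String, String] = {
    buffer += ((key, value))
    if (buffer.size >= chunkSize) flush()
    null // results are emitted via context.forward in flush()
  }

  private def flush(): Unit =
    if (buffer.nonEmpty) {
      externalBatchedApiCall(buffer.toSeq).foreach { case (k, v) => context.forward(k, v) }
      buffer.clear()
    }

  // forward() is not allowed from close(), so the last partial chunk is
  // handled by the wall-clock punctuator rather than flushed here.
  override def close(): Unit = ()
}

val builder = new StreamsBuilder
builder.stream[String, String]("my-input-topic")
  .transform(new TransformerSupplier[String, String, KeyValue[String, String]] {
    override def get() = new BatchingTransformer(chunkSize = 2000)
  })
  .to("my-output-topic")

Each stream task gets its own transformer instance, so a chunk never mixes records from different partitions. Also note that the buffer lives only in memory: records still sitting in it when the application crashes may already have had their offsets committed, so back the buffer with a state store if you need stronger delivery guarantees.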