Tags: apache-kafka, httprequest, httpresponse, apache-kafka-streams

Sending HTTP Requests with Kafka Streams


I am aware that it's not recommended to send HTTP requests from Kafka Streams, since the blocking nature of external RPC calls can hurt performance.

However, what if the use case doesn't allow me to avoid sending HTTP requests?

I'm building an application that consumes from an input topic; each message then goes through several stages of filtering, mapping, and joining with a KTable. At the end of these stages the result is ready to be "delivered".

As it happens, the "delivery" mechanism is an HTTP request: I have to call external REST APIs to send these results to various vendors. I also need to wait for the response, and based on it I mark the delivery as either successful or failed and produce that outcome to an output topic, which another service consumes for archiving purposes.
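For reference, the pipeline looks roughly like the sketch below. The topic names, the plain-string records, and the enrich(...) / deliver(...) helpers are hypothetical placeholders, not my real code; the point is only where the blocking HTTP call currently sits in the topology.

```java
// Sketch of the topology described above; names and record types are illustrative only.
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class DeliveryTopology {

    public static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        // Reference data that each message is joined against.
        KTable<String, String> vendors = builder.table("vendor-table");

        KStream<String, String> input = builder.stream("input-topic");

        input
            .filter((key, value) -> value != null && !value.isEmpty())    // filtering stages
            .mapValues(DeliveryTopology::enrich)                          // mapping stages
            .join(vendors, (message, vendor) -> vendor + ":" + message)   // join with the KTable
            .mapValues(DeliveryTopology::deliver)                         // blocking HTTP call (the problematic step)
            .to("delivery-status", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }

    private static String enrich(String value) {
        return value.trim();                       // stand-in for the real mapping logic
    }

    private static String deliver(String payload) {
        // Stand-in for the external REST call; returns a success/failure marker.
        return "SUCCESS:" + payload;
    }
}
```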

I'm aware that HTTP calls block the calling stream thread, so I configured a request timeout that is strictly (and comfortably) below the consumer's max.poll.interval.ms, to avoid a rebalance if the external API service is down. Timed-out requests are also routed to a low-priority queue so delivery can be re-attempted later.
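The timeout guard looks roughly like the sketch below, assuming java.net.http, a hypothetical 10-second request timeout, and a hypothetical "delivery-retry" topic; the only real requirement is that the HTTP timeout stays well below max.poll.interval.ms (5 minutes by default).

```java
// Sketch of a blocking HTTP call bounded by a timeout far below max.poll.interval.ms.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class GuardedDelivery {

    private static final Duration REQUEST_TIMEOUT = Duration.ofSeconds(10); // << max.poll.interval.ms

    private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    private final Producer<String, String> retryProducer;

    public GuardedDelivery(Producer<String, String> retryProducer) {
        this.retryProducer = retryProducer;
    }

    public String deliver(String key, String payload, String vendorUrl) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(vendorUrl))
                .timeout(REQUEST_TIMEOUT)
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
        try {
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() < 300 ? "SUCCESS" : "FAILED";
        } catch (Exception e) {
            // A timed-out or failed request goes to the low-priority retry topic
            // instead of blocking the stream thread any longer.
            retryProducer.send(new ProducerRecord<>("delivery-retry", key, payload));
            return "RETRY_QUEUED";
        }
    }
}
```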

As you can see, I cannot really avoid making external RPC calls within Kafka Streams. I'm just curious whether there is a better architecture for this kind of use case.


Solution

  • If you cannot avoid it, one option is to send the data to an outbound "request" topic and write a plain consumer that performs the HTTP requests and produces the outcomes back to a "response" topic, e.g. with HTTP status codes or success/fail indicators (a worker sketch follows below).

    Then have Streams also consume this response topic for the joining.

    The main reason not to do blocking RPC within Streams is that it is very sensitive to time-based configuration, and increasing those timeouts to accommodate slow external calls should generally be avoided when possible.
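A minimal sketch of that request/response worker is below. The topic names ("delivery-requests", "delivery-responses"), the vendor URL, string serialization, and the bare-bones error handling are assumptions for illustration; the Streams application would then consume "delivery-responses" and join it back against the original records.

```java
// Standalone consumer/producer worker that performs the HTTP calls outside of Streams.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class HttpRequestWorker {

    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "http-request-worker");
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", StringSerializer.class.getName());

        HttpClient http = HttpClient.newHttpClient();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {

            consumer.subscribe(List.of("delivery-requests"));

            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    String status = send(http, record.value());
                    // Publish the outcome so the Streams app can join against it later.
                    producer.send(new ProducerRecord<>("delivery-responses", record.key(), status));
                }
            }
        }
    }

    private static String send(HttpClient http, String payload) {
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://vendor.example.com/deliver"))
                .timeout(Duration.ofSeconds(10))
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
        try {
            HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
            return String.valueOf(response.statusCode());
        } catch (Exception e) {
            return "FAILED";
        }
    }
}
```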