
Does Apache Flink support using HTTP requests as input and output?


As a toy example, say I want to build a web application that receives HTTP requests containing an ID and a dollar amount, and returns a response with the total amount that ID has paid over a 24-hour window. With Apache Flink it's certainly possible to create that app if the input and output are, say, Kafka streams.

But is it supported / possible to create a Flink app where the input is HTTP requests and the output is a response to the HTTP request?

If this is not supported by Flink, is there another data streaming framework that would support this?

(For the toy example above, I imagine there are simpler solutions than Flink, but my real use case involves many windows and other stateful computations over an HTTP "stream" of requests.)


Solution

  • Flink doesn't provide official HTTP sources or sinks, but you could implement your own based on the Source and Sink base classes. However, I don't think this is the 'correct' approach. Flink is not designed to receive HTTP requests and answer them directly. It normally uses some kind of persistent data source and sink so that it can recreate or recalculate its state if the application fails.

    I recommend using something like the Strimzi Kafka Bridge (https://strimzi.io/blog/2019/07/19/http-bridge-intro/), which lets HTTP clients write messages to and consume messages from a Kafka topic using simple requests. In this scenario, clients would post ID and amount data to an input topic, Flink would read from that topic and write its results to a second topic, and the client would then make a second request to poll the results:

    client -> KafkaBridge -> InputTopic -> Flink -> ResultsTopic

    client -> KafkaBridge -> ResultsTopic
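
The client side of the two flows above could look roughly like the following Python sketch. It only builds the HTTP requests and does not send them. The bridge address, topic names, consumer group, and consumer name are all hypothetical, and the endpoint paths and content types reflect my understanding of the Strimzi bridge's REST API, so check them against the documentation for the bridge version you deploy.

```python
import json
import urllib.request

# Assumptions (not from the original answer): bridge address and topic names.
BRIDGE_URL = "http://localhost:8080"
INPUT_TOPIC = "payments"          # hypothetical topic Flink reads from
RESULTS_TOPIC = "payment-totals"  # hypothetical topic Flink writes to

JSON_RECORDS = "application/vnd.kafka.json.v2+json"  # records with JSON values
BRIDGE_META = "application/vnd.kafka.v2+json"        # consumer management calls


def _post(path, body, content_type):
    """Build (but do not send) a JSON POST request against the bridge."""
    return urllib.request.Request(
        url=BRIDGE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": content_type},
        method="POST",
    )


def build_produce_request(payer_id, amount):
    """client -> KafkaBridge -> InputTopic: publish one payment, keyed by ID."""
    body = {"records": [{"key": payer_id,
                         "value": {"id": payer_id, "amount": amount}}]}
    return _post(f"/topics/{INPUT_TOPIC}", body, JSON_RECORDS)


def build_create_consumer_request(group, name):
    """Register a bridge consumer that will read JSON records."""
    return _post(f"/consumers/{group}", {"name": name, "format": "json"},
                 BRIDGE_META)


def build_subscribe_request(group, name):
    """Subscribe the consumer to the topic Flink writes its results to."""
    return _post(f"/consumers/{group}/instances/{name}/subscription",
                 {"topics": [RESULTS_TOPIC]}, BRIDGE_META)


def build_poll_request(group, name):
    """client -> KafkaBridge -> ResultsTopic: fetch any new result records."""
    return urllib.request.Request(
        url=f"{BRIDGE_URL}/consumers/{group}/instances/{name}/records",
        headers={"Accept": JSON_RECORDS},
        method="GET",
    )
```

To actually send a request, pass it to `urllib.request.urlopen(...)`. Note that the result arrives asynchronously: the client has to create and subscribe its consumer once, then poll until Flink has written the total for its ID, so this is not a single request/response round trip.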