apache-flink flink-streaming

Triggering a ProcessWindowFunction in Apache Flink based on a list of keys


I have an unusual requirement. I am using Apache Flink to process a data stream from a Kafka source. I want to do stateful processing and keep a global state of all processed keys across windows. For every window, I need an output for every key, whether or not that key is present in the current window's data. How can I achieve this? I think the window is triggered separately for each partition of the input data.

    SingleOutputStreamOperator<OutputPojo> output = inputDataStream
            .keyBy(inputPojo -> inputPojo.getKey())
            .window(TumblingEventTimeWindows.of(Time.minutes(windowDuration)))
            .sideOutputLateData(outputTag)
            .process(new WindowProcess());

Solution

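  • Flink's window API isn't designed to support this, and there's no easy way to make it happen. A keyed window is created only when an element for that key arrives, so a key with no events in a given window has no window to fire.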

    I would implement this with a KeyedProcessFunction, using keyed state and timers in place of windows. Fortunately, there's an example in the Flink docs that will get you most of the way there: https://nightlies.apache.org/flink/flink-docs-stable/docs/learn-flink/event_driven/
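
As a rough illustration of that approach, here is a minimal sketch, not a drop-in replacement for the job in the question. It assumes Flink 1.x (the `open(Configuration)` signature), event-time timestamps and watermarks already assigned, and non-negative epoch-millisecond timestamps. The class name `PerKeyWindowEmitter`, the helper `windowEndFor`, and the tuple shapes (input `(key, value)`, output `(key, windowEnd, sum)`) are hypothetical stand-ins for the asker's `InputPojo` and `OutputPojo`. The idea is that each timer re-registers itself for the next window, so a key that has been seen once keeps producing output even for windows in which it has no events.

```java
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Hypothetical sketch: emulates tumbling event-time windows per key and
// emits a result for every window once the key has been seen at least once.
public class PerKeyWindowEmitter
        extends KeyedProcessFunction<String, Tuple2<String, Long>, Tuple3<String, Long, Long>> {

    private final long windowSizeMs;

    // Partial sums for the current key, indexed by the window's last timestamp.
    private transient MapState<Long, Long> sumsByWindowEnd;

    public PerKeyWindowEmitter(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    // Last timestamp that belongs to the tumbling window containing `timestamp`.
    public static long windowEndFor(long timestamp, long windowSizeMs) {
        return timestamp - (timestamp % windowSizeMs) + windowSizeMs - 1;
    }

    @Override
    public void open(Configuration parameters) {
        // Flink 1.x signature (an assumption; newer versions use a different open method).
        sumsByWindowEnd = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("sumsByWindowEnd", Long.class, Long.class));
    }

    @Override
    public void processElement(
            Tuple2<String, Long> event,
            Context ctx,
            Collector<Tuple3<String, Long, Long>> out) throws Exception {
        long windowEnd = windowEndFor(ctx.timestamp(), windowSizeMs);
        if (windowEnd <= ctx.timerService().currentWatermark()) {
            return; // The window has already fired; the event is late and is dropped here.
        }
        Long current = sumsByWindowEnd.get(windowEnd);
        sumsByWindowEnd.put(windowEnd, current == null ? event.f1 : current + event.f1);
        // Timers are deduplicated per key and timestamp, so repeated calls are harmless.
        ctx.timerService().registerEventTimeTimer(windowEnd);
    }

    @Override
    public void onTimer(
            long timestamp,
            OnTimerContext ctx,
            Collector<Tuple3<String, Long, Long>> out) throws Exception {
        Long sum = sumsByWindowEnd.get(timestamp);
        // Emit 0 when the key had no events in this window.
        out.collect(Tuple3.of(ctx.getCurrentKey(), timestamp, sum == null ? 0L : sum));
        sumsByWindowEnd.remove(timestamp);
        // Re-arm for the next window, unless the input has ended
        // (final watermark), which would otherwise fire timers endlessly.
        if (ctx.timerService().currentWatermark() < Long.MAX_VALUE) {
            ctx.timerService().registerEventTimeTimer(timestamp + windowSizeMs);
        }
    }
}
```

It would be applied as `stream.keyBy(e -> e.f0).process(new PerKeyWindowEmitter(60_000L))` for one-minute windows. Two caveats about this design: a key only starts appearing in the output after its first event, and because the timer chain never stops, state for a key is kept indefinitely unless you add your own expiry condition.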