apache-flink, flink-streaming

Applying Multiple Filters in Parallel Flink DataStream


I would like to run a Flink Streaming application that follows a read-once, write-many pattern. Basically, I would like to read from a firehose, apply different filters in parallel to each record read from the source, and send the results to different sinks based on configuration.

How can I do this in Flink? I believe Kafka Streams has a concept that supports this. Below is an illustration of how I want my Flink DAG to look:


Solution

  • The simplest way to accomplish this is with the filter() transformation, as demonstrated below:

    DataStream<X> stream = ...;
    
    // Filter 1
    stream
        .filter(x -> x.property == 1)
        .sinkTo(sink1);
    
    // Filter 2
    stream
        .filter(x -> x.property == 2)
        .sinkTo(sink2);
    
    // Repeat ad nauseam
    

    Alternatively, you could consider using Side Outputs, so that a single "filter" function handles separating your records into separate tagged outputs that you can then act upon independently.
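    The Side Outputs approach can be sketched roughly as follows. This is a minimal, untested sketch assuming a POJO `Event` with a public integer `property` field and pre-built `sink1`/`sink2` sinks (all hypothetical names); each record is routed once inside one ProcessFunction instead of being filtered n times:

    ```java
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;
    import org.apache.flink.util.OutputTag;

    // Tags identifying each side output; the anonymous subclasses
    // ({}) let Flink capture the element type information.
    final OutputTag<Event> tag1 = new OutputTag<Event>("property-1") {};
    final OutputTag<Event> tag2 = new OutputTag<Event>("property-2") {};

    SingleOutputStreamOperator<Event> routed = stream
        .process(new ProcessFunction<Event, Event>() {
            @Override
            public void processElement(Event value, Context ctx, Collector<Event> out) {
                // Examine each record once and emit it to the matching side output.
                if (value.property == 1) {
                    ctx.output(tag1, value);
                } else if (value.property == 2) {
                    ctx.output(tag2, value);
                } else {
                    out.collect(value); // main output for everything else
                }
            }
        });

    // Each side output becomes its own stream, wired to its own sink.
    routed.getSideOutput(tag1).sinkTo(sink1);
    routed.getSideOutput(tag2).sinkTo(sink2);
    ```

    A nice property of this layout is that adding another destination only means adding one more tag and one more branch in the ProcessFunction, rather than another full pass over the source stream.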

    If you absolutely require n different streams, you might have to look at the type of sink you intend to write to and consider handling the filtering at that level.