I am grouping input data based on key, then doing 1 minute window with 30 seconds hop and in aggregator.
Data is being sent and consumed by an application, and the need for this application could evolve in the future, therefore, I see a need for future flexibility and quick change.
The current logic is described below:
@StreamListener("input")
public void process(KStream<String, Data> DataKStream) {
JsonSerde<DataAggregator> DataJsonSerde =
new JsonSerde<>(DataAggregator.class);
DataKStream
.groupByKey()
.windowedBy(TimeWindows.of(60000).advanceBy(30000))
.aggregate(
DataAggregator::new,
(key, Data, aggregator) -> aggregator.add(Data),
Materialized.with(Serdes.String(), DataJsonSerde)
);
}
DataAggregator.java
public class DataAggregator {
private List<String> dataList = new ArrayList<>();
public DataAggregator add(Data data) {
dataList.add(data.getId());
System.out.println(dataList);
return this;
}
public List<String> getDataList() {
return dataList;
}
}
However, given evolving requirements I would like to give users the possibility to change the logic via a menu.
For example, users could change the window at wish or change the way data is segregated.
I was potentially thinking of writing several java classes which could be turned on and off when users pick specific options.
But I am wondering if something better and more dynamic could be done.
With Flink, some things can't be changed while a job is running -- notable, the topology of the job graph, and the parallelism of the operators.
On the other hand, a control stream can be broadcast throughout the cluster to effect dynamic changes to the business logic. In simple cases this has been used to modify filter parameters; in more complex cases it has been used, for example, to trigger dynamic loading of code or machine learning models (e.g., by broadcasting PMML) used in transformations.
Sample use cases: RBEA: Scalable Real-Time Analytics at King, StreamING models, how ING adds models ... .
What's less obvious is how dynamically reconfigure aggregations. The open source Fraud Detection Demo (part 1, part 2, github) illustrates how to accomplish that.
For another example, see Cogynt: Flink without code.