My requirement is to compute an aggregate over a window of incrementally growing size on a batch table.
For example, the first window contains 1 row; the second window contains 2 rows (the row from the first window plus one new row); the third window contains 3 rows (the 2 rows from the second window plus one new row), and so on.
Given this input:

    datetime | productId | price
    ---------|-----------|------
    3-1      | p1        | 10
    3-2      | p1        | 20
    3-3      | p1        | 30
    3-4      | p1        | 40

the expected output is:

    datetime | productId | average
    ---------|-----------|-----------------
    3-1      | p1        | 10/1
    3-2      | p1        | (10+20)/2
    3-3      | p1        | (10+20+30)/3
    3-4      | p1        | (10+20+30+40)/4
I am trying to find a way to implement this requirement in SQL. It seems the OVER clause could do it, but it is not yet implemented in Flink for batch tables, so I need an alternative.
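For reference, the query I have in mind would look something like this in standard SQL (the table name Products is a placeholder; this is what I would expect the OVER clause to express, not something that runs on a Flink batch table today):

    SELECT
      datetime,
      productId,
      AVG(price) OVER (
        PARTITION BY productId
        ORDER BY datetime
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
      ) AS average
    FROM Products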
BTW:
I tried to use a TUMBLE window of 1 day and to store the previous values in the user-defined aggregation object, but this failed because the aggregation object is reused across all products rather than being a separate object per product.
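For illustration, the attempt was shaped roughly like this, where myAvg stands for the user-defined aggregate that tried to carry state between windows (table, column, and function names are placeholders):

    SELECT
      productId,
      TUMBLE_START(rowtime, INTERVAL '1' DAY) AS datetime,
      myAvg(price) AS average
    FROM Products
    GROUP BY productId, TUMBLE(rowtime, INTERVAL '1' DAY)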
The OVER clause on a batch table is not supported by Flink's SQL yet. You can track the status of this effort here.
However, have you considered implementing this behavior on a streaming table instead? Streaming tables can also read from static files such as CSV files, and many operations are supported on them as well. This depends on the other operations you want to use in your query, though.
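As a rough sketch of what that could look like (assuming the input is registered as a streaming table Products with a rowtime time attribute, and a Flink version where streaming OVER aggregations are available), an unbounded partitioned OVER window computes exactly the growing-window average from your example:

    SELECT
      datetime,
      productId,
      AVG(price) OVER (
        PARTITION BY productId
        ORDER BY rowtime
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
      ) AS average
    FROM Products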