apache-flink, flink-streaming

Apache Flink: is .countWindow() the proper way to process code after a number of events occur?


On Apache Flink 1.8.1

I was reading https://flink.apache.org/news/2015/12/04/Introducing-windows.html and I'm trying to figure out if using countWindow(size: Int) is suitable for my use-case: when N users visit the Help page of my website, I'd like to flag them all for customer service outreach. I'm confused because there's limited documentation on .countWindow(), and I'm having a hard time confirming if it's been deprecated in favor of another approach.

I was dabbling in this (ignore the red highlights): [image not available]

The source code for countWindow() shows the following: [image not available]

Documentation was hard to find. At best, I found it listed in the 1.3 docs, but my IDE doesn't indicate that it's been deprecated. Then there's this, which seems unrelated: https://ci.apache.org/projects/flink/flink-docs-release-1.9/api/java/org/apache/flink/table/runtime/operators/window/CountWindow.html

Am I going down the wrong rabbit hole, or is there a better way to use Flink for my specific edge case above?


Solution

  • It has not been deprecated. What you have linked is an operator used in the Table API, so it is not directly connected to the streaming (DataStream) API that, judging by your code, you are using.

    The code you have pasted shows the correct usage of count windowing.

    As for whether it's a good idea to use a count window for this: it depends.

    Technically this should work, but if many users visit the page only once, this creates a lot of windows that never fill up and so are never closed. You would have to take care of cleaning those up yourself.

    Generally, it should be easier to do this with a KeyedProcessFunction and a ValueState that keeps the number of visits.
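
To make the counter idea concrete, here is a minimal plain-Java sketch of the logic such a function might hold. It is a model only, not Flink code. It assumes the stream is keyed by user id and that a user is flagged once their own visit count reaches a threshold. The class name `HelpVisitCounter`, the method names, and the sample user ids are all hypothetical, and the `HashMap` merely stands in for Flink's per-key state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Plain-Java model of a per-user visit counter (no Flink dependency).
 * All names are hypothetical. In a Flink job, the map entry for the
 * current key would instead be a ValueState<Long> held by a
 * KeyedProcessFunction.
 */
class HelpVisitCounter {
    private final long threshold;

    // Stand-in for keyed state: one small counter per user,
    // rather than a buffer of events as in a count window.
    private final Map<String, Long> visitsByUser = new HashMap<>();

    HelpVisitCounter(long threshold) {
        this.threshold = threshold;
    }

    /** Records one Help-page visit; returns true when the user reaches the threshold. */
    boolean onVisit(String userId) {
        long count = visitsByUser.getOrDefault(userId, 0L) + 1;
        if (count >= threshold) {
            // Comparable to clearing the state: the user starts from zero again.
            visitsByUser.remove(userId);
            return true;
        }
        visitsByUser.put(userId, count);
        return false;
    }

    /** Number of users that currently hold a counter. */
    int trackedUsers() {
        return visitsByUser.size();
    }

    public static void main(String[] args) {
        HelpVisitCounter counter = new HelpVisitCounter(3);
        List<String> flagged = new ArrayList<>();
        for (String user : new String[] {"alice", "bob", "alice", "alice", "carol"}) {
            if (counter.onVisit(user)) {
                flagged.add(user);
            }
        }
        System.out.println("Flagged: " + flagged);
        System.out.println("Users still tracked: " + counter.trackedUsers());
    }
}
```

In an actual job, the body of `onVisit` would presumably move into `processElement` of a KeyedProcessFunction applied after `keyBy` on the user id, with the map entry replaced by a `ValueState<Long>` (read with `value()`, written with `update()`, reset with `clear()`). Users who never reach the threshold still hold a counter, just as they would hold an unfilled count window, so some expiry mechanism such as a timer or state TTL would likely still be needed.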