Search code examples
apache-flinkflink-streamingflink-sqlflink-cepmatch-recognize

Is Flink's Match_Recognize function suitable for capturing this type of pattern?


I'm trying to catch events in pattern described below:

  • Start event = SalePackageA event (Customer A purchasing PackageA)
  • 2-nd event = PackageUsage event (Customer A uses PackageA)
  • 3-rd event = PackageUsage event (Customer A uses PackageA)
  • 4-th event = PackageUsage event (Customer A uses PackageA)
  • ...
  • N-th event = PackageUsage event (Customer A uses PackageA)
  • Stop event = SalePackageA event (Customer A buys PackageA again)

Ie: customer purchased some data package with 2048mb balance, then customer using it - I receive used bytes in every PackageUsage event.

So, match_recognize should shout on every PackageUsage event with some aggregation logic:

( SalePackageA.Initial_Balance_Bytes - sum(present_event__PackageUsage.usage_bytes + sum(all_previous__PackageUsages.usage_bytes)) ) as Remaining_Balance

And when the same Customer purchases the same Package, this "flow" should be interrupted and a new "flow" will start over.

Is Flink's CEP suitable for described case? Any ideas/suggestions how to implement this using CEP?


Solution

  • MATCH_RECOGNIZE and CEP aren't a good match for your requirements (because you need to report the remaining balance after every usage event).

    My suggestion is to implement this with a keyed process function.