Flink 1.13.2 (Flink SQL) on Yarn.
A bit confused - I found two (as I understand) conflicting specifications of the Filesystem connector (Ververica.com vs ci.apache.org):
https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/overview/#supported-connectors — Filesystem is listed as "Bounded and Unbounded Scan, Lookup"
https://docs.ververica.com/user_guide/sql_development/connectors.html#packaged-connectors — only JDBC is marked as usable for Lookup.
Can I use the Filesystem connector (CSV) to create lookup (dimension) tables for enriching a Kafka events table? If yes, how can this be done with Flink SQL?
(I've tried simple left joins with FOR SYSTEM_TIME AS OF a.event_datetime; it works in a test environment with a small number of Kafka events, but in production I get a GC overhead limit exceeded
error. I guess that's because the small CSV tables are not broadcast to the worker nodes. In Spark I used to solve this type of problem with broadcast hints.)
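For context, here is a minimal sketch of the kind of join described above (all table names, column names, and connection properties are assumed for illustration):

```sql
-- Hypothetical Kafka fact table (names and properties assumed)
CREATE TABLE events (
    user_id        BIGINT,
    event_datetime TIMESTAMP(3),
    WATERMARK FOR event_datetime AS event_datetime - INTERVAL '5' SECOND
) WITH (
    'connector' = 'kafka',
    'topic' = 'events',
    'properties.bootstrap.servers' = 'kafka:9092',
    'format' = 'json'
);

-- Hypothetical CSV-backed dimension table
CREATE TABLE dim_users (
    user_id   BIGINT,
    user_name STRING
) WITH (
    'connector' = 'filesystem',
    'path' = '/data/dim/users.csv',
    'format' = 'csv'
);

-- The attempted event-time temporal join
SELECT a.user_id, a.event_datetime, d.user_name
FROM events AS a
LEFT JOIN dim_users FOR SYSTEM_TIME AS OF a.event_datetime AS d
    ON a.user_id = d.user_id;
```

Note that an event-time temporal join in Flink SQL expects the right-hand side to be a versioned table (with a primary key and a watermark), which a plain filesystem source does not provide, so whether the planner accepts this depends on the table definition.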
A lookup (dimension) table needs to implement the LookupTableSource interface; in Flink 1.13, only the HBase, JDBC, and Hive connectors implement it, so the Filesystem connector cannot serve as a lookup table.
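Since the JDBC connector does implement LookupTableSource, a common workaround is to load the CSV data into a database and use a lookup join with a processing-time attribute. A sketch, with assumed table names and connection details:

```sql
-- Kafka fact table with a processing-time attribute
-- (lookup joins require FOR SYSTEM_TIME AS OF a processing-time column)
CREATE TABLE events (
    user_id   BIGINT,
    payload   STRING,
    proc_time AS PROCTIME()
) WITH (
    'connector' = 'kafka',
    'topic' = 'events',
    'properties.bootstrap.servers' = 'kafka:9092',
    'format' = 'json'
);

-- JDBC-backed dimension table; the lookup cache keeps hot rows in
-- memory instead of hitting the database on every event
CREATE TABLE dim_users (
    user_id   BIGINT,
    user_name STRING
) WITH (
    'connector' = 'jdbc',
    'url' = 'jdbc:mysql://db-host:3306/dims',  -- assumed URL
    'table-name' = 'users',
    'lookup.cache.max-rows' = '10000',
    'lookup.cache.ttl' = '1h'
);

-- Lookup join: each event probes the dimension table (or its cache)
SELECT e.user_id, e.payload, d.user_name
FROM events AS e
LEFT JOIN dim_users FOR SYSTEM_TIME AS OF e.proc_time AS d
    ON e.user_id = d.user_id;
```

The lookup cache options bound memory use per TaskManager, which also addresses the GC pressure seen when a regular join materializes both sides in state.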