Search code examples
apache-flinkflink-sql

Apache Flink: Is Table API state scalable?


According to Flink Table API Streaming Concepts, Table API and SQL queries can fail due to growing state size.

State Size: Continuous queries are evaluated on unbounded streams and are often supposed to run for weeks or months. Hence, the total amount of data that a continuous query processes can be very large. Queries that have to update previously emitted results need to maintain all emitted rows in order to be able to update them. For instance, the first example query needs to store the URL count for each user to be able to increase the count and sent out a new result when the input table receives a new row. If only registered users are tracked, the number of counts to maintain might not be too high. However, if non-registered users get a unique user name assigned, the number of counts to maintain would grow over time and might eventually cause the query to fail.

The Table API and SQL are using the DataStream API under the hood.

Shouldn't the state of Table API / SQL queries scale just like state of DataStream API jobs?


Solution

  • You are correct in thinking that Flink's Table API is just as scalable as the DataStream API. Still, any given infrastructure has finite capacity, and a Flink job written so that it uses unbounded state will eventually crash once it has consumed all available resources. Some Flink users process petabytes of data every day and expect their jobs to run for weeks and months on end, which is only possible by paying attention to such issues.