apache-flink flink-streaming flink-sql pyflink

When to use a temporary table vs. a permanent table in Flink


New to Flink, I am building a simple aggregation pipeline, e.g. total sales amount per day, using the Table API. I see that there are two options for creating a table: temporary and permanent. A permanent table also requires setting up a catalog, e.g. Hive. So I am inclined to use a temporary table, which is easier to get started with, but I am curious about the pros and cons of each.

According to the docs, a temporary table does not survive when the Flink job stops. What happens, then, if we redeploy the Flink job for a bug fix?

Thanks!


Solution

  • A table does not store your data; it stores metadata, i.e., the table’s name and the location of the data. For example, for a table backed by Kafka, the metadata includes the broker’s address and the topic name.

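    It’s fine to use temporary tables. If the job’s own code creates the temporary table on startup, redeploying for a bug fix just registers the same metadata again; the data in the external system (e.g., the Kafka topic) is untouched. But if you want to share this metadata with other applications, it’s convenient to store it in a catalog and use permanent tables.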
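To make the difference concrete, here is a minimal PyFlink sketch, not a definitive setup. The table name `sales`, its columns, the broker address `localhost:9092`, the consumer group, and the topic are all hypothetical placeholders. The DDL for the two kinds of table differs only in the `TEMPORARY` keyword; what changes is where the metadata is kept.

```python
# Hypothetical connection details; replace with your own broker and topic.
KAFKA_OPTIONS = {
    "connector": "kafka",
    "topic": "sales",
    "properties.bootstrap.servers": "localhost:9092",
    "properties.group.id": "sales-agg",
    "scan.startup.mode": "earliest-offset",
    "format": "json",
}


def sales_table_ddl(temporary: bool, name: str = "sales") -> str:
    """Build the CREATE statement; only the TEMPORARY keyword differs."""
    keyword = "CREATE TEMPORARY TABLE" if temporary else "CREATE TABLE"
    options = ",\n".join(f"  '{k}' = '{v}'" for k, v in KAFKA_OPTIONS.items())
    return (
        f"{keyword} {name} (\n"
        "  order_id STRING,\n"
        "  amount DOUBLE,\n"
        "  order_time TIMESTAMP(3)\n"
        ") WITH (\n"
        f"{options}\n"
        ")"
    )


def register_sales_table(temporary: bool = True):
    """Register the table in a fresh TableEnvironment (requires PyFlink)."""
    # Imported here so the DDL helper above can be used without PyFlink.
    from pyflink.table import EnvironmentSettings, TableEnvironment

    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
    # Temporary: the metadata lives only in this session and is registered
    # again every time the job starts and runs this statement.
    # Non-temporary: the metadata is written to the current catalog. The
    # default catalog is in-memory, so it only outlives the job if a
    # persistent catalog (e.g. Hive) has been registered and selected first.
    t_env.execute_sql(sales_table_ddl(temporary))
    return t_env
```

Calling `register_sales_table()` at the start of the job is what makes redeployment safe with temporary tables: the definition travels with the code. Actually reading from the table additionally requires the Kafka connector JAR on the classpath.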